Requirements & Installation

Hardware & software requirements for using FedPilot.


System Requirements

While FedPilot can dynamically utilize available resources to scale experiments, it still imposes baseline hardware and software requirements.

Minimum System Requirements

  • RAM: 8GB for basic experiments
  • Storage: 20GB free space for dependencies and models
  • CPU: Multi-core processor (4+ cores recommended, Intel 8th generation or equivalent AMD CPU)
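
As a quick sanity check, the minimum requirements above can be probed from Python with only the standard library. This is an illustrative sketch, not part of FedPilot itself; RAM detection via os.sysconf works on Linux and macOS, and reports None elsewhere:

```python
import os
import shutil

def preflight(min_ram_gb=8, min_disk_gb=20, min_cores=4, path="."):
    """Compare this machine against FedPilot's minimum requirements."""
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage(path).free / 1024**3
    try:  # SC_PHYS_PAGES is POSIX-only (Linux/macOS)
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (ValueError, OSError, AttributeError):
        ram_gb = None  # not available on this platform; check manually
    return {
        "cores_ok": cores >= min_cores,
        "disk_ok": free_gb >= min_disk_gb,
        "ram_ok": None if ram_gb is None else ram_gb >= min_ram_gb,
    }

print(preflight())
```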

Operating System Support

  • Linux
  • macOS
  • Windows (WSL2 recommended)

Optional Requirements

  • Docker: For containerized deployment
  • Ray Cluster: For distributed training
  • Prometheus + Grafana: For monitoring
  • tmux: For running training in a detached session (the make run target uses tmux if available)

Python Environment

Required Python Version

  • Python 3.12

FedPilot is developed and tested against Python 3.12. Other versions may work but are not officially supported.

Package Managers

  • uv (recommended; used by the Makefile)
  • pip 20.0+ (underlying installer used by uv)
  • conda (optional, for advanced environment management)

Core Dependencies



Core Libraries

Package        Version   Purpose
torch          2.7.1     Deep learning framework
torchvision    0.22.1    Vision models and datasets
transformers   4.53.0    Pre-trained models (BERT, etc.)
ray            2.47.1    Distributed computing
numpy          1.26.4    Numerical computing
scipy          1.16.0    Scientific computing
pandas         2.2.3     Data analysis
scikit-learn   1.6.1     Machine learning utilities

Monitoring & Tracing

Package             Version   Purpose
opentelemetry-api   1.25.0+   Distributed tracing
opentelemetry-sdk   1.25.0+   Tracing implementation
prometheus_client   0.22.1    Metrics collection
tensorboardX        2.6.4     Experiment tracking

These versions are pinned in pyproject.toml and installed via uv sync (typically through make setup).
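
To spot drift between the pins above and what is actually installed, a small stdlib-only check can compare versions. The pins below are a subset copied from the table; extend the dict to match pyproject.toml:

```python
from importlib.metadata import version, PackageNotFoundError

# Subset of the pins from the table above; extend to match pyproject.toml.
PINNED = {
    "torch": "2.7.1",
    "ray": "2.47.1",
    "numpy": "1.26.4",
}

def check_pins(pins):
    """Map each package name to (installed_version, matches_pin)."""
    report = {}
    for pkg, pin in pins.items():
        try:
            installed = version(pkg)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        report[pkg] = (installed, installed == pin)
    return report

for pkg, (installed, ok) in check_pins(PINNED).items():
    print(f"{pkg}: installed={installed} matches_pin={ok}")
```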


GPU Acceleration



CUDA Requirements

  • CUDA 11.8 or CUDA 12.1 (compatible with PyTorch 2.7.1)
  • cuDNN 8.9.0+ for optimized deep learning operations
  • NVIDIA drivers:

    • 525.60.13+ for CUDA 11.8
    • 535.86.10+ for CUDA 12.1
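
To confirm what CUDA stack PyTorch itself can see (as opposed to what the driver reports via nvidia-smi), a check like the following is safe to run even on CPU-only machines. This is a generic sketch using standard PyTorch APIs:

```python
def cuda_report():
    """Summarize the CUDA stack visible to PyTorch, or mark it absent."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "cuda": None, "cudnn": None, "gpus": 0}
    return {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,                # toolkit torch was built against
        "cudnn": torch.backends.cudnn.version(),   # None if cuDNN is unavailable
        "gpus": torch.cuda.device_count(),
    }

print(cuda_report())
```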

Installation

FedPilot ships with a Make-based CLI. The commands shown below assume that:

  • make is available in your shell
  • uv is installed (or can be installed) for Python dependency management

Clone the Repository

git clone https://github.com/fedpilot/core
cd core

Installing Dependencies Using uv

uv is a modern package and project manager that replaces many existing Python tools (such as pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and others).

To install uv, you can run:

pip install uv

or follow the binary installation guide at:

https://docs.astral.sh/uv/getting-started/installation/

Once uv is installed, the recommended way to install dependencies is:

make setup

This will:

  • Check that uv is available
  • Run uv sync to install and lock Python dependencies

Example output:
FedPilot setup
Checking for uv...
uv detected: /home/Disquiet/.local/bin/uv

Running uv sync to install and lock Python dependencies...
Using CPython 3.12.8
Creating virtual environment at: .venv
Resolved 131 packages in 0.80ms
Installed 129 packages in 199ms
 + accelerate==1.11.0
 + aiohappyeyeballs==2.6.1
 + aiohttp==3.13.2
 + aiohttp-cors==0.8.1
 + aiosignal==1.4.0
 + annotated-types==0.7.0
 + attrs==23.2.0
 + cachetools==6.2.1
 + certifi==2025.6.15
 + cffi==1.17.1
 + charset-normalizer==3.4.2
 + click==8.1.6
 + colorful==0.5.8
 + contourpy==1.3.2
 + cryptography==41.0.7
 + cycler==0.12.1
 + datasets==2.14.4
 + dill==0.3.7
 + distlib==0.4.0
 + distro==1.9.0
 + docutils==0.21.2
 + filelock==3.13.1
 + fonttools==4.58.5
 + frozenlist==1.8.0
 + fsspec==2025.5.1
 + google-api-core==2.28.1
 + google-auth==2.43.0
 + googleapis-common-protos==1.72.0
 + grpcio==1.74.0
 + hf-xet==1.1.5
 + huggingface-hub==0.33.2
 + idna==3.10
 + importlib-metadata==8.7.0
 + jinja2==3.1.2
 + joblib==1.5.1
 + jsonschema==4.24.0
 + jsonschema-specifications==2025.4.1
 + kiwisolver==1.4.8
 + markdown-it-py==3.0.0
 + markupsafe==3.0.2
 + matplotlib==3.10.3
 + mdurl==0.1.2
 + mpmath==1.3.0
 + msgpack==1.1.1
 + multidict==6.7.0
 + multiprocess==0.70.15
 + networkx==3.5
 + numpy==1.26.4
 + nvidia-cublas-cu12==12.8.4.1
 + nvidia-cuda-cupti-cu12==12.8.90
 + nvidia-cuda-nvrtc-cu12==12.8.93
 + nvidia-cuda-runtime-cu12==12.8.90
 + nvidia-cudnn-cu12==9.10.2.21
 + nvidia-cufft-cu12==11.3.3.83
 + nvidia-cufile-cu12==1.13.1.3
 + nvidia-curand-cu12==10.3.9.90
 + nvidia-cusolver-cu12==11.7.3.90
 + nvidia-cusparse-cu12==12.5.8.93
 + nvidia-cusparselt-cu12==0.7.1
 + nvidia-nccl-cu12==2.27.5
 + nvidia-nvjitlink-cu12==12.8.93
 + nvidia-nvshmem-cu12==3.3.20
 + nvidia-nvtx-cu12==12.8.90
 + opencensus==0.11.4
 + opencensus-context==0.1.3
 + opentelemetry-api==1.38.0
 + opentelemetry-exporter-otlp-proto-common==1.38.0
 + opentelemetry-exporter-otlp-proto-grpc==1.38.0
 + opentelemetry-exporter-prometheus==0.59b0
 + opentelemetry-proto==1.38.0
 + opentelemetry-sdk==1.38.0
 + opentelemetry-semantic-conventions==0.59b0
 + packaging==25.0
 + pandas==2.2.3
 + peft==0.17.1
 + pillow==10.2.0
 + platformdirs==4.5.0
 + prometheus-client==0.22.1
 + propcache==0.4.1
 + proto-plus==1.26.1
 + protobuf==5.29.3
 + psutil==7.1.3
 + py-spy==0.4.1
 + pyarrow==20.0.0
 + pyasn1==0.6.1
 + pyasn1-modules==0.4.2
 + pycparser==2.22
 + pydantic==2.7.4
 + pydantic-core==2.18.4
 + pygments==2.17.2
 + pyopenssl==23.2.0
 + pyparsing==3.2.3
 + python-dateutil==2.9.0.post0
 + pytz==2025.2
 + pyyaml==6.0.1
 + ray==2.51.1
 + referencing==0.36.2
 + regex==2024.11.6
 + requests==2.32.4
 + rich==13.7.1
 + rpds-py==0.26.0
 + rsa==4.9.1
 + safetensors==0.5.3
 + scikit-learn==1.6.1
 + scipy==1.16.0
 + setuptools==80.9.0
 + shellingham==1.5.4
 + six==1.17.0
 + smart-open==7.4.4
 + sympy==1.14.0
 + tensorboardx==2.6.4
 + threadpoolctl==3.6.0
 + timm==1.0.15
 + tokenizers==0.21.2
 + torch==2.9.0
 + torchvision==0.24.0
 + tqdm==4.67.1
 + transformers==4.53.0
 + triton==3.5.0
 + typer==0.15.3
 + typing-extensions==4.10.0
 + tzdata==2025.2
 + urllib3==2.5.0
 + virtualenv==20.35.4
 + wheel==0.42.0
 + wrapt==2.0.1
 + xxhash==3.6.0
 + yarl==1.22.0
 + zipp==3.23.0

uv sync completed successfully.

Virtual environment activation hints
  On Linux/macOS (bash/zsh), run:   source .venv/bin/activate

Use make validate-setup to make sure everything is installed correctly.

If you prefer to run uv directly, you can still do:

uv sync

from the project root. This creates a virtual environment (by default .venv/) and installs all dependencies listed in pyproject.toml. On some systems uv may use a different environment location; see the official uv documentation if .venv/ is not created.

Using a virtual environment is recommended to avoid dependency conflicts. If you used uv sync (either directly or via make setup), a virtual environment is already created.

You can activate it using:

# On Linux/macOS
source .venv/bin/activate

# On Windows (cmd)
.venv\Scripts\activate.bat

If uv created an environment in a different location, consult:

https://docs.astral.sh/uv/guides/environments/

for how to inspect and activate that environment.

Before running any experiments, it is recommended to validate your setup:

make validate-setup

This command:

  • Ensures core directories (logs, models, results, experiments) exist
  • Checks for key tools (uv, tmux, ray)
  • Reports Python version and core package availability (torch, numpy, yaml)
  • Summarizes basic GPU / CUDA visibility (if available)

Resolve any errors or warnings here before proceeding to training.
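
If you want to script similar checks yourself, the directory and package checks described above can be approximated in plain Python. This stdlib-only sketch mirrors the list (directory and package names are taken from it); it is not the actual make validate-setup implementation:

```python
import importlib.util
from pathlib import Path

def validate_setup(root=".",
                   dirs=("logs", "models", "results", "experiments"),
                   packages=("torch", "numpy", "yaml")):
    """Report missing project directories and unimportable core packages."""
    root = Path(root)
    return {
        "missing_dirs": [d for d in dirs if not (root / d).is_dir()],
        "missing_packages": [p for p in packages
                             if importlib.util.find_spec(p) is None],
    }

print(validate_setup())
```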

Run the Project

The Make-based CLI is the preferred entry point.

After installation and environment validation, you can launch a full training workflow with:

make train

This will:

  • Guide you through interactive configuration selection (make config internally)
  • Start training using python main.py (wrapped in tmux if available)

If you already have a config.yaml in the project root and just want to run it:

make run

For direct execution without the Makefile helpers, you can also run:

python main.py
# or, if you prefer using uv:
uv run main.py

Verify Installation



First, use the built-in environment check:

# Validate tools, directories, Python, and GPU visibility
make validate-setup

You can additionally verify core Python dependencies manually:

# Check Python version
python --version

# Check core dependencies
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import ray; ray.init(); print('Ray initialized successfully')"
python -c "import transformers; print(f'Transformers: {transformers.__version__}')"

# Check CUDA availability (if GPU)
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"
python -c "import torch; print(f'GPU Count: {torch.cuda.device_count()}')"

If you have a config.yaml and want to validate it end-to-end:

make validate-config

This runs YAML parsing, semantic validation, and a model/dataset modality check.
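
As an illustration of what "semantic validation" can mean here, a minimal hand-rolled check might enforce consistency between fields such as device and gpu_index. The field names follow the config examples in this guide, but the rules below are assumptions, not FedPilot's actual validator:

```python
def semantic_check(cfg):
    """Return a list of human-readable problems; an empty list means OK.

    Illustrative rules only; the real validator covers far more fields.
    """
    problems = []
    if cfg.get("device") not in ("cpu", "cuda"):
        problems.append("device must be 'cpu' or 'cuda'")
    elif cfg["device"] == "cpu" and cfg.get("gpu_index") is not None:
        problems.append("gpu_index should be null when device is 'cpu'")
    return problems

print(semantic_check({"device": "cpu", "gpu_index": None}))
```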


GPU Setup (Nvidia)



# Make sure your GPU is detected
nvidia-smi

# Using apt (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install nvidia-cuda-toolkit

# Using pacman (Arch)
sudo pacman -Syu
sudo pacman -S cuda

You can also install the Nvidia CUDA Toolkit using the official Nvidia download page.

PyTorch with CUDA Support

# Install PyTorch with CUDA 12.1 support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Or for CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Verify CUDA
python -c "import torch; print(torch.cuda.get_device_name(0))"

CPU-Only Installation



# Install PyTorch CPU version
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Update config to use CPU
# In your config.yaml:
device: "cpu"
gpu_index: null
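
A training script might consume these two config fields along the following lines. This is an illustrative sketch (not FedPilot's actual device-selection code) that falls back to CPU whenever CUDA is unavailable:

```python
def select_device(device, gpu_index=None):
    """Map config fields to a torch device string, defaulting to CPU."""
    if device == "cuda":
        try:
            import torch
            if torch.cuda.is_available():
                return f"cuda:{gpu_index if gpu_index is not None else 0}"
        except ImportError:
            pass  # torch missing: fall through to CPU
    return "cpu"

print(select_device("cpu"))  # the CPU-only config above always yields "cpu"
```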

Version Compatibility



All dependency versions are pinned to ensure:

  • Reproducible experiments across different systems
  • Compatibility between Ray, PyTorch, and supporting libraries
  • Stable performance and known behavior

Whenever you upgrade dependencies in pyproject.toml, re-run:

make setup
make validate-setup

to refresh the environment and verify compatibility.


Troubleshooting



Issue: “ModuleNotFoundError: No module named ‘torch’”

Solution:

# Reinstall PyTorch (inside your uv / virtual environment)
pip uninstall torch torchvision torchaudio -y
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Or verify which Python is being used
which python
echo $VIRTUAL_ENV

Make sure you have activated the same environment that uv sync created (for example, .venv/).
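
When the error persists after reinstalling, the cause is usually an interpreter mismatch. This stdlib snippet reports which interpreter and virtual environment are actually active, and where torch (if any) resolves from:

```python
import os
import sys

def env_report():
    """Show the active interpreter/venv and where torch is imported from."""
    info = {
        "python": sys.executable,
        "virtualenv": os.environ.get("VIRTUAL_ENV"),  # None if no venv is active
    }
    try:
        import torch
        info["torch"] = torch.__file__
    except ImportError:
        info["torch"] = None  # torch is not importable from this interpreter
    return info

for key, value in env_report().items():
    print(f"{key}: {value}")
```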


Issue: “ImportError: cannot import name ‘ConfigValidator’”

This usually indicates that:

  • The project is not being run from the repository root, or
  • The environment does not include the FedPilot source tree.

Solution:

# From the project root
pwd  # should end with /core or your FedPilot repo root

# Ensure uv environment is up to date
make setup

# Validate that Python can see the project
python -c "import src.validators.config_validator as cv; print(cv.__name__)"

If you are using a custom environment or IDE, ensure that the project root is added to PYTHONPATH:

export PYTHONPATH="/path/to/FedPilot/core:$PYTHONPATH"

You can also run:

make validate-config

to confirm that the configuration and imports are working as expected.


Issue: Python version mismatch

Solution:

# Check Python version
python --version

# Use specific Python 3.12 interpreter
python3.12 --version
python3.12 -m venv fedpilot_env
source fedpilot_env/bin/activate

# Reinstall dependencies in the new environment
pip install uv
uv sync

Make sure make setup and make train are executed using the same Python environment.


Installation complete? Check out the Getting Started Guide to start using FedPilot.