This repository demonstrates a minimal pipeline for decentralized model training using the BOINC server tools. Volunteers download tasks, perform local fine‑tuning on small quantized models and send weight updates back to the server.
```
router.py -> scheduler.py -> generate_wu.py -> BOINC -> volunteer.py -> validator.py -> fed_avg.py
```
- Install the BOINC server tools. See the BOINC wiki for complete instructions.

  ```bash
  sudo apt-get install boinc-server-maker boinc-server-binaries boinc-server-tools
  ```

- Create a project directory:

  ```bash
  make_project agiNet
  cd agiNet
  ```

- Add the skill apps under `apps/`. Example subfolders: `apps/vision`, `apps/language`, `apps/memory`, `apps/planner`. Each folder should contain a small quantized model file and a `train_local.py` runner.
- Prepare the training script. `train_local.py` loads the weights, fine‑tunes on the provided dataset, and writes weight deltas and a reward score (a minimal sketch follows this list).
- Create work-unit templates using `wu_template.xml` and `result_template.xml`. Each work unit contains the skill name, current weights, a mini dataset or prompts, and the training script.
- Generate work units with `sched_create_work` so volunteers can download tasks.
- Automate generation with `scheduler.py`, which selects the best skill and creates the next work unit (sketch below).
- Route tasks to the best skill with `router.py` (sketch below).
- Volunteers train locally using `client/volunteer.py`, which encrypts updates before upload (sketch below).
- Validate returned work using `validator.py`. Check hashes, run a quick evaluation, and reject results that score below a threshold (sketch below).
- Aggregate accepted updates nightly with `fed_avg.py`, which now uses the FedOpt algorithm with momentum and writes the encrypted global weights (sketch below).
- Grant BOINC credit only when a node's update passed validation, and log the result via `scoreboard.py`.
- Publish snapshots and metrics using `nightly_snapshot.sh` so others can track progress.
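For illustration, here is a minimal sketch of what `train_local.py` could look like. The file names (`weights.npz`, `dataset.npz`, `delta.npz`, `reward.json`) and the toy least-squares objective are assumptions for the example; the real runner in each skill folder will differ.

```python
# train_local.py -- minimal sketch, not the repository's actual runner.
# Assumed inputs: weights.npz (parameters) and dataset.npz (arrays X, y).
import json

import numpy as np


def main():
    weights = dict(np.load("weights.npz"))
    data = np.load("dataset.npz")
    X, y = data["X"], data["y"]
    old = {k: v.copy() for k, v in weights.items()}

    # Toy fine-tuning: a few SGD steps on a linear least-squares objective.
    lr = 0.01
    for _ in range(100):
        err = X @ weights["w"] + weights["b"] - y
        weights["w"] -= lr * X.T @ err / len(y)
        weights["b"] -= lr * err.mean()

    # Reward = negative mean squared error after training (higher is better).
    reward = float(-np.mean((X @ weights["w"] + weights["b"] - y) ** 2))

    # Write the weight deltas and the reward score for upload.
    np.savez("delta.npz", **{k: weights[k] - old[k] for k in weights})
    with open("reward.json", "w") as f:
        json.dump({"reward": reward}, f)


if __name__ == "__main__":
    main()
```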
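`scheduler.py` could pick the skill that currently performs worst and hand it to the work-unit generator. The reward-log format and the `generate_wu.py --skill` flag below are assumptions, not the repository's actual interface.

```python
# scheduler.py -- sketch only; rewards.json and the generate_wu.py flags are assumed.
import json
import subprocess
from collections import defaultdict


def pick_next_skill(reward_log="rewards.json"):
    """Return the skill with the lowest average recent reward (it needs training most)."""
    with open(reward_log) as f:
        entries = json.load(f)  # e.g. [{"skill": "vision", "reward": -0.42}, ...]
    totals, counts = defaultdict(float), defaultdict(int)
    for entry in entries:
        totals[entry["skill"]] += entry["reward"]
        counts[entry["skill"]] += 1
    return min(totals, key=lambda skill: totals[skill] / counts[skill])


if __name__ == "__main__":
    skill = pick_next_skill()
    subprocess.run(["python", "generate_wu.py", "--skill", skill], check=True)
```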
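The routing policy in `router.py` is not described beyond "route tasks to the best skill"; the plain keyword match below is only a placeholder to show the shape of such a router, with invented keyword lists.

```python
# router.py -- placeholder routing logic; the keyword lists are invented for the example.
KEYWORDS = {
    "vision": ("image", "photo", "pixel"),
    "language": ("translate", "summarize", "write"),
    "memory": ("recall", "remember", "lookup"),
    "planner": ("plan", "schedule", "steps"),
}


def route(task_text: str) -> str:
    """Return the skill whose keywords best match the task description."""
    text = task_text.lower()
    scores = {skill: sum(word in text for word in words) for skill, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "language"  # fall back to a default skill


if __name__ == "__main__":
    print(route("Summarize this report and plan the next steps"))
```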
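On the client side, the upload encryption in `client/volunteer.py` might look roughly like this. Symmetric Fernet encryption with a key file shipped alongside the work unit is an assumption; the real client may use a different scheme entirely.

```python
# Upload-side encryption sketch (assumes the `cryptography` package and a
# pre-shared key file named upload.key; the real scheme may differ).
from cryptography.fernet import Fernet


def encrypt_update(delta_path="delta.npz", key_path="upload.key"):
    """Encrypt the weight-delta file before it is uploaded to the server."""
    with open(key_path, "rb") as f:
        key = f.read()
    with open(delta_path, "rb") as f:
        token = Fernet(key).encrypt(f.read())
    out_path = delta_path + ".enc"
    with open(out_path, "wb") as f:
        f.write(token)
    return out_path


if __name__ == "__main__":
    print("encrypted update written to", encrypt_update())
```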
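The validation step boils down to two checks: an integrity hash and a reward threshold. A sketch, with the file names and the 0.0 cut-off assumed:

```python
# validator.py -- sketch of the described checks; threshold and file names are assumed.
import hashlib
import json

REWARD_THRESHOLD = 0.0  # assumed cut-off; tune per skill


def is_valid(delta_path: str, reported_sha256: str, reward_path: str) -> bool:
    # 1. Integrity: the uploaded delta must match the hash reported with the result.
    with open(delta_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != reported_sha256:
        return False
    # 2. Quick evaluation: reject results whose reward falls below the threshold.
    with open(reward_path) as f:
        reward = json.load(f)["reward"]
    return reward >= REWARD_THRESHOLD
```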
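The nightly aggregation can be read as a FedOpt-style server update: average the accepted deltas into a pseudo-gradient, apply server-side momentum, then step the global weights. The sketch below assumes NumPy archives under `accepted/`, omits the final encryption step, and uses an assumed learning rate and momentum constant.

```python
# fed_avg.py -- FedOpt-style aggregation sketch; paths and constants are assumed.
import glob

import numpy as np

SERVER_LR = 1.0   # server-side learning rate (assumed)
BETA = 0.9        # server momentum coefficient (assumed)


def aggregate(weights_path="global.npz", momentum_path="momentum.npz"):
    weights = dict(np.load(weights_path))
    try:
        momentum = dict(np.load(momentum_path))
    except FileNotFoundError:
        momentum = {k: np.zeros_like(v) for k, v in weights.items()}

    # Average the accepted client deltas into a pseudo-gradient.
    delta_files = glob.glob("accepted/*.npz")
    pseudo_grad = {k: np.zeros_like(v) for k, v in weights.items()}
    for path in delta_files:
        delta = np.load(path)
        for k in pseudo_grad:
            pseudo_grad[k] += delta[k] / len(delta_files)

    # Server update with momentum, then write the new global weights
    # (the real script encrypts them before publishing).
    for k in weights:
        momentum[k] = BETA * momentum[k] + pseudo_grad[k]
        weights[k] = weights[k] + SERVER_LR * momentum[k]

    np.savez(momentum_path, **momentum)
    np.savez(weights_path, **weights)


if __name__ == "__main__":
    aggregate()
```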
The `server/` directory of this repository provides example scripts and templates to get started.

To tailor work units to different volunteer tiers, use `server/resource_sharder.py` with a JSON resource description. The tool splits your dataset, scales the per-shard hyperparameters, and calls `generate_wu.py` so each host receives a fitting task.
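As a rough illustration of that flow, the sketch below slices a NumPy dataset in proportion to each host's RAM and scales the batch size before emitting a work unit per shard. The `resources.json` schema, the scaling rule, and the `generate_wu.py` flags are all assumptions, not the tool's real interface.

```python
# resource_sharder.py -- illustrative sketch; the resources.json schema and
# the generate_wu.py command line are assumed.
import json
import subprocess

import numpy as np


def shard(dataset_path="dataset.npz", resources_path="resources.json"):
    data = np.load(dataset_path)
    X, y = data["X"], data["y"]
    with open(resources_path) as f:
        tiers = json.load(f)  # e.g. [{"host": "node-a", "ram_gb": 4}, ...]

    total_ram = sum(t["ram_gb"] for t in tiers)
    start = 0
    for tier in tiers:
        # Slice the dataset in proportion to this host's share of total RAM.
        size = max(1, int(len(X) * tier["ram_gb"] / total_ram))
        shard_path = f"shard_{tier['host']}.npz"
        np.savez(shard_path, X=X[start:start + size], y=y[start:start + size])
        start += size

        # Scale a per-shard hyperparameter with the host's memory, then emit a WU.
        batch_size = 8 * tier["ram_gb"]
        subprocess.run(
            ["python", "generate_wu.py", "--dataset", shard_path,
             "--batch-size", str(batch_size)],
            check=True,
        )


if __name__ == "__main__":
    shard()
```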