@BjarniHaukur
Collaborator

I had to set up a GitHub auth token to avoid rate limits.

It should suffice to export it as GITHUB_TOKEN in your shell rc file.
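
For reference, a minimal sketch of the rc entry, assuming a bash- or zsh-style shell. The value is a placeholder, not a real token, so substitute your own:

```shell
# Goes in ~/.bashrc or ~/.zshrc (assumption: a POSIX-style shell).
# "ghp_replace_me" is a placeholder; paste your own personal access token here.
export GITHUB_TOKEN="ghp_replace_me"
```

Open a new shell (or source the rc file) so the variable is visible to later commands.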

It's worth checking your ~/.gitconfig for any overly broad token rules, though I don't believe any are set by default.
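
One way to do that check is sketched below. It assumes "token rules" means URL rewrites (`insteadOf`), credential helpers, or lines that mention a token; `scan_gitconfig` is a hypothetical helper name, not something from this repo:

```shell
# Hypothetical helper: list lines in a git config file that commonly carry
# tokens or URL rewrites. The keyword list is an assumption, not exhaustive.
scan_gitconfig() {
  cfg="${1:-$HOME/.gitconfig}"
  if [ ! -f "$cfg" ]; then
    echo "no config at $cfg"
    return 0
  fi
  # -n prints line numbers, -i ignores case, -E enables the alternation
  grep -nEi 'insteadof|helper|token' "$cfg" || echo "nothing matched in $cfg"
}
```

Running `scan_gitconfig` with no argument inspects ~/.gitconfig. Anything it prints still needs a manual look, since a match is not necessarily a problem.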

Then, to replicate the training runs, run:

# 1.) build crrl.sif
apptainer build crrl.sif scripts/train_container.def

# 2.) run the training job (needs env variables such as an HF token for pushing, and CRRL_WORKDIR pointing to your berzelius-2024-336/ filesystem)
sbatch scripts/grpo/large_grpo_lora_train_job.sh grpo.run_name="Qwen3-32B-GSPO-Nano-Throughput-Overnight"
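
Since the job depends on several environment variables, a small pre-flight check before submitting may save a wasted queue slot. This is only a sketch: `require_env` is a hypothetical helper, and `HF_TOKEN` is an assumed name for the HF token variable (only `GITHUB_TOKEN` and `CRRL_WORKDIR` are named above):

```shell
# Hypothetical pre-flight helper: returns 1 if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this portable across POSIX shells.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env GITHUB_TOKEN HF_TOKEN CRRL_WORKDIR` could be run right before the sbatch line, submitting only if it succeeds.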

@andre15silva
Member

LGTM. I'll merge this one and then rebase my branch so that we can have it merged too.

@andre15silva andre15silva merged commit 911429d into master Dec 31, 2025