Fix KeyError in GPT-OSS weight conversion script #42007
Conversation
zucchini-nlp
left a comment
I am quite lost here since I didn't work with GPT-OSS. To make sure: the original_config is an HF config object, iiuc, and we need to make sure it is YaRN if no params are saved in the config?
# "beta_fast": float(original_config.pop("rope_ntk_beta")),
# "beta_slow": float(original_config.pop("rope_ntk_alpha")),
# "factor": float(original_config.pop("rope_parameters_factor")),
Can we hardcode the values safely? I think it would be better to obtain them from the config.
Thanks for reviewing @zucchini-nlp! Let me clarify:
You're absolutely correct. We should try to obtain the values from the config, with defaults as a fallback. Updating now.
[For maintainers] Suggested jobs to run (before merge)
run-slow: gpt_oss
Updated! Changes made as discussed above. Ready for another look!
Thanks @Aznix07 for making the effort to solve the issue raised. I noticed that the issue is mostly because the model's config.json contains "rope_scaling_factor" as the key instead of "rope_parameters_factor", which the script is looking for.
zucchini-nlp
left a comment
It's still not clear why the original_config has rope parameters if it is not in HF format yet. I believe there are some configs already with "rope_parameters" to be converted.
Lgtm then!
Thank you @zucchini-nlp for the review and approval! @shubhagr-qc Thanks for identifying the key mismatch in config.json.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Fixes a KeyError: 'rope_parameters_factor' in the GPT-OSS weight conversion script that was preventing users from converting GPT-OSS model weights to HuggingFace format.
Fixes #42003
Problem
After PR #39847 standardized RoPE parameter handling across models, the GPT-OSS conversion script was still trying to access individual parameter keys that no longer exist in the original config:
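For context, this is roughly the pattern the script relied on (reconstructed from the commented-out lines quoted in the review above). The sample dict below is only a stand-in for the published config.json; the values are illustrative and only the key names matter:

```python
# Sketch of the old behaviour, not the actual script code. The published GPT-OSS
# config.json ships "rope_scaling_factor" (see the discussion above) rather than
# "rope_parameters_factor", so the last pop raises the reported KeyError.
original_config = {
    "rope_scaling_factor": 32.0,  # illustrative value; the key name is what matters
    "rope_ntk_beta": 32.0,
    "rope_ntk_alpha": 1.0,
}

rope_scaling = {
    "beta_fast": float(original_config.pop("rope_ntk_beta")),
    "beta_slow": float(original_config.pop("rope_ntk_alpha")),
    "factor": float(original_config.pop("rope_parameters_factor")),  # KeyError: 'rope_parameters_factor'
}
```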
Reproduction
Solution
Added proper handling for both config formats:
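A minimal sketch of the approach, assuming the original config behaves like a plain dict; build_rope_scaling is a hypothetical helper (not the actual script function), and the default values are placeholders that should mirror configuration_gpt_oss.py:

```python
def build_rope_scaling(original_config: dict) -> dict:
    # Prefer the key shipped in the published config.json, then the older key,
    # then a default (placeholder values; check configuration_gpt_oss.py).
    if "rope_scaling_factor" in original_config:
        factor = original_config.pop("rope_scaling_factor")
    else:
        factor = original_config.pop("rope_parameters_factor", 32.0)

    return {
        "rope_type": "yarn",  # GPT-OSS uses YaRN scaling per the review discussion
        "factor": float(factor),
        "beta_fast": float(original_config.pop("rope_ntk_beta", 32.0)),
        "beta_slow": float(original_config.pop("rope_ntk_alpha", 1.0)),
    }
```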
Testing
✅ Verified conversion script no longer throws KeyError
✅ Tested fallback logic with both config formats (see the sketch below)
✅ Default values match configuration_gpt_oss.py
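As a quick illustration of the "both config formats" check, the sketch above can be exercised with either key layout (build_rope_scaling and all values come from the hypothetical sketch, not the actual script):

```python
# Neither layout should raise; values are illustrative.
new_style = {"rope_scaling_factor": 32.0, "rope_ntk_beta": 32.0, "rope_ntk_alpha": 1.0}
old_style = {"rope_parameters_factor": 32.0, "rope_ntk_beta": 32.0, "rope_ntk_alpha": 1.0}

assert build_rope_scaling(new_style)["factor"] == 32.0
assert build_rope_scaling(old_style)["factor"] == 32.0
```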
Before submitting
Who can review?
@zucchini-nlp