Add AfmoeForCausalLM support #16477
Co-authored-by: Sigbjørn Skjæret <[email protected]>
inpL = ggml_scale(ctx0, inpL, sqrtf(float(n_embd)));
cb(inpL, "inp_embd_scaled", -1);
Not very important to fix right now, but if the model supports multimodal input in the future, you may need to skip the scaling when the input is not text:
llama.cpp/src/models/gemma3-iswa.cpp
Lines 11 to 15 in 45c6ef7
// important: do not normalize weights for raw embeddings input (i.e. encoded image embeddings)
if (ubatch.token) {
    inpL = ggml_scale(ctx0, inpL, sqrtf(n_embd));
    cb(inpL, "inp_scaled", -1);
}
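To illustrate the point of the snippet above outside of ggml, here is a minimal, self-contained sketch of the same guard. The `Batch` struct, its `has_tokens` flag, and `scale_input_embeddings` are hypothetical names invented for this example and are not part of llama.cpp; `has_tokens` only stands in for the `ubatch.token` check.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for a micro-batch (not a llama.cpp type).
// `has_tokens` plays the role of `ubatch.token` being non-null:
// true  -> the embeddings were looked up from text tokens,
// false -> the embeddings were supplied directly (e.g. encoded image embeddings).
struct Batch {
    bool has_tokens;
    std::vector<float> embd; // flattened input embeddings
};

// Scale the input embeddings by sqrt(n_embd), but only for token input.
// Raw embeddings are returned unchanged.
std::vector<float> scale_input_embeddings(const Batch & batch, int n_embd) {
    std::vector<float> out = batch.embd;
    if (batch.has_tokens) {
        const float scale = std::sqrt(static_cast<float>(n_embd));
        for (float & v : out) {
            v *= scale;
        }
    }
    return out;
}
```

With this guard, a text batch is multiplied by `sqrt(n_embd)` while a batch of precomputed embeddings passes through untouched, which is the behavior the review comment suggests for a possible multimodal variant.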
Interesting, good to know, thanks!
@bartowski1182 @ggerganov Merging in a little while unless you have anything more to add.
* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>
* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>
* Update AFMoE tokenizer class identification to be more unique

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
Adds support for the upcoming AfmoeForCausalLM model.
The tokenizer is public ahead of the model launch to avoid breaking the conversion code.