Add AfmoeForCausalLM support #16477

bartowski1182 · 2025-10-08T20:58:40Z

Adds support for the upcoming AfmoeForCausalLM architecture.

The tokenizer is public ahead of the model launch to avoid breaking the conversion code.

convert_hf_to_gguf.py

models/ggml-vocab-afmoe.gguf

models/ggml-vocab-afmoe.gguf.inp

models/ggml-vocab-afmoe.gguf.out

src/llama-model.cpp

src/models/afmoe.cpp

convert_hf_to_gguf.py

src/models/afmoe.cpp

convert_hf_to_gguf_update.py

src/unicode.cpp

src/llama-vocab.cpp

convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

ngxson · 2025-11-14T09:16:09Z

src/models/afmoe.cpp

+    inpL = ggml_scale(ctx0, inpL, sqrtf(float(n_embd)));
+    cb(inpL, "inp_embd_scaled", -1);


Not very important to fix right now, but if the model supports multimodal input in the future, you may need to skip the scaling when the input is non-text:

llama.cpp/src/models/gemma3-iswa.cpp, lines 11 to 15 in 45c6ef7:

    // important: do not normalize weights for raw embeddings input (i.e. encoded image emdeddings)
    if (ubatch.token) {
        inpL = ggml_scale(ctx0, inpL, sqrtf(n_embd));
        cb(inpL, "inp_scaled", -1);
    }
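To make the suggestion concrete, here is a minimal standalone sketch of the same guard. It is not the llama.cpp implementation: `ubatch_sketch` and `scale_input_embd` are hypothetical names, and plain `std::vector<float>` stands in for the ggml tensor that `ggml_scale` operates on in the real graph. The only idea carried over from the quoted snippet is that scaling by `sqrt(n_embd)` happens when token ids are present and is skipped for raw embeddings.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a micro-batch: `token` is non-null for text
// input and null when raw embeddings (e.g. from an image encoder) are passed.
struct ubatch_sketch {
    const int32_t * token;
};

// Scale the embeddings by sqrt(n_embd) only when the input came from token ids.
// Raw embeddings are returned unchanged.
std::vector<float> scale_input_embd(const ubatch_sketch & ub,
                                    std::vector<float> embd,
                                    int n_embd) {
    if (ub.token) {
        const float s = std::sqrt(static_cast<float>(n_embd));
        for (float & v : embd) {
            v *= s;
        }
    }
    return embd;
}
```

In the actual model code the equivalent change would presumably wrap the existing `ggml_scale` and `cb` calls in `if (ubatch.token) { ... }`, as the gemma3 snippet does.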

bartowski1182 · Nov 14, 2025

Interesting, good to know, thanks!

CISC · 2025-11-14T12:18:16Z

@bartowski1182 @ggerganov Merging in a little while unless you have anything more to add.

* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <[email protected]>
* Update convert_hf_to_gguf.py
  Co-authored-by: Sigbjørn Skjæret <[email protected]>
* Update AFMoE tokenizer class identification to be more unique

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>

github-actions bot added the "python" (python script changes) label Oct 8, 2025

DajanaV mentioned this pull request Nov 5, 2025

UPSTREAM PR #16477: Add AfmoeForCausalLM support auroralabs-loci/llama.cpp#95

Open

bartowski1182 force-pushed the master branch from ca2e99c to 763e822 on November 5, 2025 19:17

github-actions bot added the "model" (Model specific) label Nov 5, 2025

Add AFMOE model support

3fd69c5

bartowski1182 force-pushed the master branch from 763e822 to 3fd69c5 on November 6, 2025 16:58

bartowski1182 added 2 commits November 6, 2025 12:29

Update to vocab

93a2fb4

Add model sizing

7c5d718

bartowski1182 marked this pull request as ready for review November 13, 2025 17:36

bartowski1182 requested review from CISC and ggerganov as code owners November 13, 2025 17:36

Undo Rope change for ARCEE model

13aaafe

CISC reviewed Nov 13, 2025

Address review comments

34dc2a3

bartowski1182 commented Nov 13, 2025

convert_hf_to_gguf.py (outdated, resolved)

ngxson reviewed Nov 13, 2025

src/models/afmoe.cpp (outdated, resolved)

bartowski1182 added 2 commits November 13, 2025 16:37

Update modeling code is_sliding -> use_rope, replace hard-coded logic

1b9558f

Fix AFMOE tokenizer

e41a5bd

CISC reviewed Nov 13, 2025

convert_hf_to_gguf_update.py (resolved)

bartowski1182 commented Nov 13, 2025

src/unicode.cpp (outdated, resolved)

CISC reviewed Nov 13, 2025

src/llama-vocab.cpp (outdated, resolved)

CISC reviewed Nov 13, 2025

convert_hf_to_gguf.py (outdated, resolved)

CISC reviewed Nov 13, 2025

convert_hf_to_gguf.py (resolved)

bartowski1182 and others added 3 commits November 13, 2025 18:03

Update convert_hf_to_gguf.py

e27c3f1

Co-authored-by: Sigbjørn Skjæret <[email protected]>

Update convert_hf_to_gguf.py

684aead

Co-authored-by: Sigbjørn Skjæret <[email protected]>

Update AFMoE tokenizer class identification to be more unique

ddddf8d

CISC approved these changes Nov 14, 2025

ngxson approved these changes Nov 14, 2025

CISC merged commit e1fcf8b into ggml-org:master Nov 14, 2025
76 checks passed

3 participants
