Adding ccl_enabled flag during model loading and passing CCL lists during compilation process #623
Conversation
@vjanfaza, can you please resolve the conflicts on the PR and run the lint/format checks?
I resolved the conflicts and pushed the changes.
…ring compilation process Signed-off-by: Vahid Janfaza <[email protected]>
…27b.yaml Signed-off-by: vjanfaza <[email protected]>
…4b.yaml Signed-off-by: vjanfaza <[email protected]>
```python
comp_ctx_lengths_prefill = [256, 512, ctx_len]
comp_ctx_lengths_decode = [256, 512, ctx_len]
# In moe models when compiling with prefill_seq_len=1 and non-continuous-batching mode, prefill and decode will share the same ccl specializations.
comp_ctx_lengths_prefill = [256, 512, ctx_len]  # None #
```
nit: please remove the `# None #` at the end of this line, and from the other places/files as well.
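In other words, the cleaned-up lines would read as follows (a sketch; `ctx_len` is assumed to be defined earlier in the example file):

```python
# In MoE models, when compiling with prefill_seq_len=1 in non-continuous-batching
# mode, prefill and decode share the same CCL specializations, so the same
# compute-context-length lists can be used for both.
comp_ctx_lengths_prefill = [256, 512, ctx_len]
comp_ctx_lengths_decode = [256, 512, ctx_len]
```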
```python
model_name = "Qwen/Qwen3-30B-A3B-Instruct-2507"
"""
# For CB inference, set continuous_batching to True and add full_batch_size,mxfp6,mint8 argument in compile function
```
nit: this should be mxint8, not mint8.
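For reference, a minimal sketch of the CB flow that comment line describes, with the corrected `mxint8` spelling. The argument names (`continuous_batching`, `full_batch_size`, `mxfp6`, `mxint8`) are taken from the comment itself and assumed to match the QEfficient `compile` API; the values are illustrative:

```python
from QEfficient import QEFFAutoModelForCausalLM

model_name = "Qwen/Qwen3-30B-A3B-Instruct-2507"

# For CB inference, continuous batching is requested at load time ...
qeff_model = QEFFAutoModelForCausalLM.from_pretrained(model_name, continuous_batching=True)

# ... and full_batch_size plus the quantization flags are passed to compile
# (note the corrected spelling: mxint8, not mint8).
qeff_model.compile(
    prefill_seq_len=128,
    ctx_len=4096,
    full_batch_size=4,
    mxfp6=True,
    mxint8=True,
)
```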
```python
    comp_ctx_lengths_prefill=comp_ctx_lengths_prefill,
    comp_ctx_lengths_decode=comp_ctx_lengths_decode,
)
# mos=1,
```
please remove this line.
```python
    processor=processor,
    images=image_urls,
    generation_len=100,
    device_ids=[28, 29, 30, 31],
```
make these [0, 1, 2, 3]
```diff
  inputs["pixel_values"] = inputs["pixel_values"].to(torch.float32)
  streamer = TextStreamer(tokenizer)
- output = qeff_model.generate(inputs=inputs, device_ids=[0, 1, 2, 3], generation_len=100)
+ output = qeff_model.generate(inputs=inputs, device_ids=[8, 9, 10, 11], generation_len=100)
```
this should be kept as in the original (device_ids=[0, 1, 2, 3]).
In these changes, instead of passing the CCL lists during model loading, I pass a flag called ccl_enabled that specifies whether the CCL feature is enabled, and moved passing the CCL lists to the compilation step.
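A minimal sketch of the resulting flow, assuming the QEfficient auto-class API and the parameter names introduced in this PR (`ccl_enabled` at load time, `comp_ctx_lengths_prefill`/`comp_ctx_lengths_decode` at compile time); the model name and the remaining arguments are illustrative:

```python
from transformers import AutoTokenizer

from QEfficient import QEFFAutoModelForCausalLM

model_name = "Qwen/Qwen3-30B-A3B-Instruct-2507"
ctx_len = 1024

# Load time: only the flag saying that the CCL feature is enabled is passed.
qeff_model = QEFFAutoModelForCausalLM.from_pretrained(model_name, ccl_enabled=True)

# Compile time: the actual CCL (compute-context-length) lists are passed here.
comp_ctx_lengths_prefill = [256, 512, ctx_len]
comp_ctx_lengths_decode = [256, 512, ctx_len]
qeff_model.compile(
    prefill_seq_len=128,
    ctx_len=ctx_len,
    num_devices=4,
    comp_ctx_lengths_prefill=comp_ctx_lengths_prefill,
    comp_ctx_lengths_decode=comp_ctx_lengths_decode,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
qeff_model.generate(prompts=["Hello, world"], tokenizer=tokenizer, generation_len=100)
```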