Skip to content

Conversation

@Pr0Wh1teGivee
Copy link
Contributor

@Pr0Wh1teGivee Pr0Wh1teGivee commented Nov 24, 2025

What this PR does / why we need it?

Enable force_load_balance in aclgraph, solving OOM issues.

Does this PR introduce any user-facing change?

No

How was this patch tested?

e2e & ut

@github-actions
Copy link

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enables force_load_balance for MoE layers when using ACL graph by removing a condition that previously disabled it. This allows torch.randint_like to be called within a graph context. My main concern is that this change contradicts an existing comment in the code which states that torch.randint_like is not supported in ACL graph, posing a critical risk of failure during profiling runs.

# to avoid accumulating too much tokens on a single rank.
# currently it is only activated when doing profile runs.
if enable_force_load_balance and not self.use_aclgraph:
if enable_force_load_balance:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This change enables torch.randint_like to be executed within an ACL graph context. This contradicts the comment on lines 55-58, which states that torch.randint_like is unsupported in ACL graph and was intentionally disabled. Executing an unsupported operation within a graph will cause failures during profiling runs. This introduces a critical risk of breaking profiling with ACL graph.

@yiz-liu yiz-liu added ready read for review ready-for-test start test by label for PR labels Nov 24, 2025
@Pr0Wh1teGivee Pr0Wh1teGivee force-pushed the aclgraph_force_lb_011 branch 2 times, most recently from 1bcd112 to a1bc7cd Compare November 24, 2025 09:14
@weijinqian0 weijinqian0 added ready read for review ready-for-test start test by label for PR and removed ready read for review ready-for-test start test by label for PR labels Nov 24, 2025
@wangxiyuan wangxiyuan merged commit a3164ac into vllm-project:v0.11.0-dev Nov 25, 2025
15 of 16 checks passed
@MengqingCao
Copy link
Collaborator

This is a cherry-pick of #4366

henryxuxu0716 pushed a commit to henryxuxu0716/vllm-ascend that referenced this pull request Nov 27, 2025
…oject#4367)

### What this PR does / why we need it?
Enable force_load_balance in aclgraph, solving OOM issues.
pick from vllm-project#4366
### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
e2e & ut

Signed-off-by: Pr0Wh1teGivee <[email protected]>
Signed-off-by: 刘哲续 <[email protected]>
Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 9, 2025
…oject#4367)

### What this PR does / why we need it?
Enable force_load_balance in aclgraph, solving OOM issues.
pick from vllm-project#4366
### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
e2e & ut

Signed-off-by: Pr0Wh1teGivee <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:ops ready read for review ready-for-test start test by label for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants