-
Notifications
You must be signed in to change notification settings - Fork 645
[v0.11.0][Bugfix][MoE] enable force_load_balance in aclgraph #4367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v0.11.0][Bugfix][MoE] enable force_load_balance in aclgraph #4367
Conversation
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables force_load_balance for MoE layers when using ACL graph by removing a condition that previously disabled it. This allows torch.randint_like to be called within a graph context. My main concern is that this change contradicts an existing comment in the code which states that torch.randint_like is not supported in ACL graph, posing a critical risk of failure during profiling runs.
| # to avoid accumulating too much tokens on a single rank. | ||
| # currently it is only activated when doing profile runs. | ||
| if enable_force_load_balance and not self.use_aclgraph: | ||
| if enable_force_load_balance: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change enables torch.randint_like to be executed within an ACL graph context. This contradicts the comment on lines 55-58, which states that torch.randint_like is unsupported in ACL graph and was intentionally disabled. Executing an unsupported operation within a graph will cause failures during profiling runs. This introduces a critical risk of breaking profiling with ACL graph.
e0852e2 to
d62ec56
Compare
1bcd112 to
a1bc7cd
Compare
Signed-off-by: Pr0Wh1teGivee <[email protected]>
fe3cdfd to
242572d
Compare
|
This is a cherry-pick of #4366 |
…oject#4367) ### What this PR does / why we need it? Enable force_load_balance in aclgraph, solving OOM issues. pick from vllm-project#4366 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? e2e & ut Signed-off-by: Pr0Wh1teGivee <[email protected]> Signed-off-by: 刘哲续 <[email protected]>
…oject#4367) ### What this PR does / why we need it? Enable force_load_balance in aclgraph, solving OOM issues. pick from vllm-project#4366 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? e2e & ut Signed-off-by: Pr0Wh1teGivee <[email protected]>
What this PR does / why we need it?
Enable force_load_balance in aclgraph, solving OOM issues.
Does this PR introduce any user-facing change?
No
How was this patch tested?
e2e & ut