【EPLB】Eplb Redundant Experts Bugfix #4232
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request refactors the expert load balancing (EPLB) logic by removing the redundant global_redundant_expert_num parameter from determine_default_log2phy_map and its call sites. This change simplifies the code and centralizes the calculation of redundant experts. The modifications touch utility functions, core MoE layer initialization, and the corresponding tests. While the refactoring is generally sound, there is a critical issue in one of the updated function calls that will cause a runtime error.
vllm_ascend/ops/common_fused_moe.py
self.log2phy = determine_default_log2phy_map(
    self.global_num_experts, self.ep_size, self.ep_rank,
    self.global_redundant_expert_num).npu()
This call to determine_default_log2phy_map includes self.global_redundant_expert_num as a fourth argument. However, the function signature for determine_default_log2phy_map in vllm_ascend/eplb/core/eplb_utils.py was changed in this PR to only accept three arguments. This will cause a TypeError at runtime. Please remove the extra argument to match the updated function definition.
self.log2phy = determine_default_log2phy_map(
    self.global_num_experts, self.ep_size, self.ep_rank).npu()
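To make the failure concrete, here is a minimal reproduction under the review's description of the new three-parameter signature (the function body is a placeholder, not the PR's actual implementation):

```python
def determine_default_log2phy_map(global_expert_num, world_size, rank_id):
    """Placeholder standing in for the refactored three-parameter function."""
    ...

# The old four-argument call now fails before any work is done:
#   TypeError: determine_default_log2phy_map() takes 3 positional arguments
#   but 4 were given
determine_default_log2phy_map(64, 8, 0, 2)
```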
MengqingCao left a comment:
Please update your PR message and describe clearly the issue it fixes and how this PR addresses it.
Signed-off-by: shenchuxiaofugui <[email protected]>
What this PR does / why we need it?
Redundant experts bugfix
This PR fixes the calculation logic for redundant experts: the number of redundant experts is now derived directly from the expert map, so there is no longer a need to set the redundant-expert parameter when a map is supplied, as the sketch below illustrates.
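The gist of the fix can be sketched as follows. This is an illustrative helper, not the PR's actual code: it assumes the expert map enumerates the logical expert hosted in each physical slot across all EP ranks, so the redundancy count is simply the surplus of physical slots over logical experts.

```python
import torch

def count_redundant_experts(expert_map: torch.Tensor,
                            num_logical_experts: int) -> int:
    """Hypothetical helper: derive the global redundant-expert count
    from the expert map instead of a user-supplied parameter.

    Assumes every entry of `expert_map` is the logical expert id hosted
    in one physical slot, across all EP ranks.
    """
    total_physical_slots = expert_map.numel()
    return total_physical_slots - num_logical_experts

# 8 logical experts spread over 10 physical slots -> 2 redundant copies.
expert_map = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 0, 1])
assert count_redundant_experts(expert_map, 8) == 2
```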
Does this PR introduce any user-facing change?
After configuring the path for the expert map, users no longer need to set init_redundancy_expert.
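For illustration, a launch under this change might look like the sketch below. The additional_config keys (expert_map_path, init_redundancy_expert) mirror the names used in this PR's description and should be treated as assumptions rather than a documented API:

```python
from vllm import LLM

# Hedged sketch: with this fix, pointing at an expert map is enough;
# the redundant-expert count is inferred from the map itself.
llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Lite",  # example MoE model, assumed
    additional_config={
        "expert_map_path": "/path/to/expert_map.json",  # assumed key name
        # "init_redundancy_expert": 2,  # no longer required after this PR
    },
)
```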
How was this patch tested?
The accuracy of EPLB was tested with and without the use of redundant experts.