[Bugfix] fix quant_apply_mlp w1_scale type error & fix getting num_local_expert #4632
Conversation
Code Review
This pull request introduces two important bug fixes. The first corrects how the number of local experts is determined in vllm_adaptor.py, making it more robust by using len() on what can be a list. The second fix addresses a type error in moe_mlp.py by correctly indexing into a list of tensors for a scale parameter. Both changes are correct and improve the stability of the code. I've added one suggestion to improve code readability in vllm_adaptor.py.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Signed-off-by: 白永斌 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: 欧派果奶我还要 <[email protected]>
Force-pushed from c2ba15d to 96f0c6e
[Bugfix] fix quant_apply_mlp w1_scale type error & fix getting num_local_expert (vllm-project#4632)

### What this PR does / why we need it?
Fix bugs introduced by vllm-project@bc67696:
1. Fix getting num_local_experts error in vllm_adaptor.
2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp's npu_dequant_swiglu_quant call in the w4a8 quantized scenario.

- vLLM version: v0.12.0

---------

Signed-off-by: 白永斌 <[email protected]>
Signed-off-by: 欧派果奶我还要 <[email protected]>
Co-authored-by: 白永斌 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: wangxiyuan <[email protected]>
What this PR does / why we need it?
Fix bugs introduced by bc67696:
1. Fix getting num_local_experts in vllm_adaptor by taking len() of the (possibly list-valued) expert mapping.
2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp's npu_dequant_swiglu_quant call in the w4a8 quantized scenario, as sketched below.
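Below is a minimal, self-contained sketch of the second fix, assuming the dequant kernel expects a single scale tensor while the w4a8 path supplies w1_scale as a list of tensors; the helper name and the choice of index [0] are illustrative, not the exact moe_mlp.py / torch_npu call.

```python
# Simplified stand-in for the w1_scale type fix; the real code passes the
# scale into torch_npu.npu_dequant_swiglu_quant inside quant_apply_mlp.
import torch


def apply_dequant(weight_scale) -> torch.Tensor:
    """Placeholder for the kernel call, which expects a single tensor."""
    if not isinstance(weight_scale, torch.Tensor):
        raise TypeError(f"expected a Tensor, got {type(weight_scale).__name__}")
    return weight_scale * 2.0  # placeholder math, not the real dequant


# In the w4a8 quantized path, w1_scale arrives as a list of scale tensors.
w1_scale = [torch.ones(4), torch.ones(4)]

# Buggy call: passing the whole list raises a type error.
# apply_dequant(w1_scale)      # TypeError

# Fixed call: index into the list so the kernel receives the tensor it
# expects (the exact index used in the PR may differ).
out = apply_dequant(w1_scale[0])
```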
Does this PR introduce any user-facing change?
How was this patch tested?