Conversation

@845473182 (Contributor) commented Dec 2, 2025

What this PR does / why we need it?

Fix bugs introduced by bc67696

  1. Fix the error in getting num_local_expert in vllm_adaptor.
  2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp.npu_dequant_swiglu_quant in the w4a8 quantized scenario.

Does this PR introduce any user-facing change?

How was this patch tested?

@gemini-code-assist bot left a comment


Code Review

This pull request introduces two important bug fixes. The first corrects how the number of local experts is determined in vllm_adaptor.py, making it more robust by using len() on what can be a list. The second fix addresses a type error in moe_mlp.py by correctly indexing into a list of tensors for a scale parameter. Both changes are correct and improve the stability of the code. I've added one suggestion to improve code readability in vllm_adaptor.py.
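
To make the shape of both fixes concrete, here is a minimal sketch of the pattern the review describes, assuming the relevant values can arrive either as a scalar/tensor or as a list. The function names `num_local_experts_from` and `pick_w1_scale` are illustrative only and are not the actual vllm-ascend symbols; the real changes live in vllm_adaptor.py and moe_mlp.py, and the `torch_npu.npu_dequant_swiglu_quant` call site is omitted because its exact signature is not shown in this PR.

```python
# Illustrative sketch only; names and structure are assumptions, not the
# actual vllm-ascend implementation.
import torch


def num_local_experts_from(local_experts):
    """First fix (vllm_adaptor): the value may be a list of local experts,
    so count it with len() instead of treating it as a scalar."""
    if isinstance(local_experts, (list, tuple)):
        return len(local_experts)
    return int(local_experts)


def pick_w1_scale(w1_scale):
    """Second fix (moe_mlp, w4a8 path): the dequant/swiglu/quant NPU op
    expects a tensor scale, but w1_scale can be a list of tensors, so
    pass the first entry instead of the list itself."""
    if isinstance(w1_scale, (list, tuple)):
        return w1_scale[0]
    return w1_scale


# Usage sketch:
print(num_local_experts_from(list(range(8))))    # 8
print(pick_w1_scale([torch.ones(16)]).shape)     # torch.Size([16])
```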

@github-actions bot commented Dec 2, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

白永斌 and others added 3 commits December 4, 2025 14:21
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: 欧派果奶我还要 <[email protected]>
@wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels Dec 4, 2025
@wangxiyuan merged commit a336543 into vllm-project:main Dec 5, 2025
15 of 17 checks passed
@845473182 deleted the gmm_swiglu_bugfix branch December 5, 2025 08:08
Meihan-chen pushed a commit to Meihan-chen/vllm-ascend that referenced this pull request Dec 5, 2025
…cal_expert (vllm-project#4632)

### What this PR does / why we need it?
Fix bugs introduced by vllm-project@bc67696
1. Fix the error in getting num_local_expert in vllm_adaptor.
2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp.npu_dequant_swiglu_quant in the w4a8 quantized scenario.

- vLLM version: v0.12.0

---------

Signed-off-by: 白永斌 <[email protected]>
Signed-off-by: 欧派果奶我还要 <[email protected]>
Co-authored-by: 白永斌 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: wangxiyuan <[email protected]>
realliujiaxu pushed a commit to realliujiaxu/vllm-ascend that referenced this pull request Dec 6, 2025
…cal_expert (vllm-project#4632)

### What this PR does / why we need it?
Fix bugs introduced by vllm-project@bc67696
1. Fix the error in getting num_local_expert in vllm_adaptor.
2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp.npu_dequant_swiglu_quant in the w4a8 quantized scenario.

- vLLM version: v0.12.0

---------

Signed-off-by: 白永斌 <[email protected]>
Signed-off-by: 欧派果奶我还要 <[email protected]>
Co-authored-by: 白永斌 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: wangxiyuan <[email protected]>

Labels

module:ops, module:quantization, ready (read for review), ready-for-test (start test by label for PR)


2 participants