[main][bugfix] bugfix for qwen3 moe quantization #4599
Conversation
Signed-off-by: Wang Kunpeng <[email protected]>
Code Review
This pull request addresses a bug in the Qwen3 MoE quantization by correctly initializing the base class for AscendFusedMoEMethod. The change ensures that the FusedMoEMethodBase constructor is called with the appropriate moe_config from the layer.
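As a rough sketch of what that change amounts to (FusedMoEMethodBase is stubbed here so the snippet is self-contained; the real base class comes from vLLM, and AscendQuantConfig from vllm-ascend, typed as Any to avoid that import):

```python
# Illustrative sketch only, not the actual vllm-ascend source.
from typing import Any, Dict

import torch


class FusedMoEMethodBase:  # stub standing in for vLLM's FusedMoEMethodBase
    def __init__(self, moe_config: Any):
        self.moe_config = moe_config


class AscendFusedMoEMethod(FusedMoEMethodBase):
    def __init__(self, quant_config: Any, prefix: str,
                 packed_modules_mapping: Dict[str, Any],
                 layer: torch.nn.Module):
        # The core of the fix: forward the layer's moe_config to the
        # base-class constructor instead of leaving the base uninitialized.
        super().__init__(layer.moe_config)
        self.quant_config = quant_config
        self.prefix = prefix
        self.packed_modules_mapping = packed_modules_mapping
```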
    quant_config: AscendQuantConfig,
    prefix: str,
    packed_modules_mapping: Dict[str, Any],
    layer: torch.nn.Module = None):
The layer parameter in the __init__ method is defined with a default value of None, but layer.moe_config is accessed unconditionally on the next line. This creates a potential AttributeError if AscendFusedMoEMethod is ever instantiated without providing a layer argument. Given that layer is always expected to be a FusedMoE instance when this method is called (as seen in AscendQuantConfig.get_quant_method), it should be made a required argument to reflect its actual usage and prevent potential runtime errors.
Suggested change:

    - layer: torch.nn.Module = None):
    + layer: torch.nn.Module):
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Signed-off-by: Wang Kunpeng <[email protected]>
Signed-off-by: Wang Kunpeng <[email protected]>
This pull request has conflicts; please resolve them before we can evaluate the pull request.
…to qwen3-moe-bugfix
# Conflicts:
#   tests/e2e/multicard/test_data_parallel.py
### What this PR does / why we need it?
Fix the issue where the qwen3 MoE service cannot be started after upgrading the vLLM version.

Error info:
AttributeError: 'AscendFusedMoE' object has no attribute 'use_dp_chunking'

### Does this PR introduce _any_ user-facing change?
no

- vLLM version: v0.11.2

---------

Signed-off-by: Wang Kunpeng <[email protected]>
    layer: torch.nn.Module = None):
    def __init__(self, quant_config: AscendQuantConfig, prefix: str,
                 packed_modules_mapping: Dict[str,
                                              Any], layer: torch.nn.Module):
Suggest formatting this signature.
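One way that formatting could look, purely as a sketch (the actual wrapping in the repo may differ; AscendQuantConfig is referenced via a string annotation so the snippet stays self-contained):

```python
from typing import Any, Dict

import torch


class AscendFusedMoEMethod:
    # Body elided; only the signature layout matters for this sketch.
    def __init__(self,
                 quant_config: "AscendQuantConfig",
                 prefix: str,
                 packed_modules_mapping: Dict[str, Any],
                 layer: torch.nn.Module) -> None:
        ...
```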
Signed-off-by: Wang Kunpeng <[email protected]>
Signed-off-by: Che Ruan <[email protected]>
What this PR does / why we need it?
Fix the issue where the qwen3 MoE service cannot be started after upgrading the vLLM version.
Error info:
AttributeError: 'AscendFusedMoE' object has no attribute 'use_dp_chunking'
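For context, a minimal, generic illustration of this failure class (simplified stand-in names, not the actual vLLM code path): when an upstream upgrade starts setting new attributes in a base-class __init__ that a subclass never calls, later accesses to those attributes fail with exactly this kind of AttributeError.

```python
# Simplified stand-ins; not the real vLLM / vllm-ascend classes.
class UpstreamMoELayer:
    def __init__(self) -> None:
        # Newer upstream releases initialize additional attributes here.
        self.use_dp_chunking = False


class AscendMoELayerBroken(UpstreamMoELayer):
    def __init__(self) -> None:
        pass  # base __init__ never runs, so use_dp_chunking is never set


class AscendMoELayerFixed(UpstreamMoELayer):
    def __init__(self) -> None:
        super().__init__()  # attributes added upstream get initialized


print(AscendMoELayerFixed().use_dp_chunking)   # False
# AscendMoELayerBroken().use_dp_chunking       # raises AttributeError, like the report
```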
Does this PR introduce any user-facing change?
no
How was this patch tested?