Commit 08401b3
[Bugfix] fix quant_apply_mlp w1_scale type error & fix getting num_local_expert (vllm-project#4632)
### What this PR does / why we need it?
Fix bugs introduced by vllm-project@bc67696:
1. Fix the error when getting num_local_expert in vllm_adaptor.
2. Fix the w1_scale type error in moe_mlp.quant_apply_mlp.npu_dequant_swiglu_quant in the w4a8 quantized scenario.
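The commit message does not say whether the w1_scale "type error" concerns the Python container type or the tensor dtype. As a rough illustration only, the sketch below assumes the former: a call site that may receive either a single scale or a list of scales, and a kernel that accepts only a single one. The names `first_scale` and `fake_kernel` are hypothetical and are not part of vllm-ascend or torch_npu; plain floats stand in for tensors.

```python
from typing import Sequence, TypeVar, Union

T = TypeVar("T")


def first_scale(w1_scale: Union[T, Sequence[T]]) -> T:
    """Return one scale object from either a bare value or a list/tuple.

    Hypothetical helper for illustration; not taken from the actual fix.
    """
    if isinstance(w1_scale, (list, tuple)):
        if not w1_scale:
            raise ValueError("w1_scale is an empty sequence")
        return w1_scale[0]
    return w1_scale


def fake_kernel(x: float, weight_scale: float) -> float:
    # Stand-in for a kernel that rejects anything but a single scale.
    if isinstance(weight_scale, (list, tuple)):
        raise TypeError("weight_scale must be a single value, not a list")
    return x * weight_scale


# Both call styles reach the kernel with a single scale.
result_from_list = fake_kernel(4.0, first_scale([0.5]))
result_from_value = fake_kernel(4.0, first_scale(0.5))
```

Normalizing at the call site keeps the kernel's contract strict while tolerating both argument shapes upstream. Whether the real change normalizes, indexes, or casts is not recoverable from this page.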
- vLLM version: v0.12.0
---------
Signed-off-by: 白永斌 <[email protected]>
Signed-off-by: 欧派果奶我还要 <[email protected]>
Co-authored-by: 白永斌 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: wangxiyuan <[email protected]>
1 parent 2d08a8e commit 08401b3
File tree
3 files changed, +4 -4 lines changed
- vllm_ascend
  - eplb/adaptor
  - ops/fused_moe
  - quantization
| File diff | Original lines removed | New lines added | Context lines shown |
|---|---|---|---|
| 1 | 110-111 | 110-111 | 107-114 |
| 2 | 132 | 132 | 129-135 |
| 3 | 292 | 292 | 289-295 |