Commit eb4c08f
[bugfix] fix mtp accept rate (#5093)
### What this PR does / why we need it?
1. npu_model_runner now reuses gpu_model_runner, so this PR deletes some
attrs that are already defined in gpu_model_runner.
2. Fix the MTP accept rate by disabling in_profile_run.
3. Remove redundant MoE method selection logic.
4. Revert #5082, which broke CI in
https://github.com/vllm-project/vllm-ascend/actions/runs/20266314048/job/58190426832?pr=5088
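Items 1 and 2 can be illustrated with a minimal sketch. It is not the vllm-ascend code: `BaseModelRunner`, `DeviceModelRunner`, `execute_step`, `max_num_tokens`, and `device_name` are hypothetical names, and the description does not say how `in_profile_run` is disabled. The sketch only shows a subclass relying on attributes set by its base class, and one generic way to make sure a profiling flag is cleared once the dummy pass ends.

```python
class BaseModelRunner:
    """Hypothetical stand-in for a shared base runner (not the vLLM class)."""

    def __init__(self, max_num_tokens: int) -> None:
        # Attributes are defined once, in the base class.
        self.max_num_tokens = max_num_tokens
        self.in_profile_run = False

    def profile_run(self) -> None:
        # Mark the dummy pass, and always clear the flag afterwards so that
        # later real steps do not take the profiling-only code path.
        self.in_profile_run = True
        try:
            self.execute_step()
        finally:
            self.in_profile_run = False

    def execute_step(self) -> str:
        # Report which path a step would take for the current flag value.
        return "profile" if self.in_profile_run else "normal"


class DeviceModelRunner(BaseModelRunner):
    """Hypothetical device-specific runner that reuses the base attributes."""

    def __init__(self, max_num_tokens: int) -> None:
        super().__init__(max_num_tokens)
        # max_num_tokens and in_profile_run are not re-declared here;
        # duplicating them would only risk drifting from the base class.
        self.device_name = "npu"  # illustrative device-only attribute


runner = DeviceModelRunner(max_num_tokens=8192)
runner.profile_run()
```

After `profile_run()` returns, the flag is back to `False`, so a subsequent `execute_step()` follows the normal path, and the subclass still sees every attribute the base class defined.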
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
vLLM version: v0.12.0
vLLM main:
vllm-project/vllm@ad32e3e
---------
Signed-off-by: zhenwenqi2024 <[email protected]>
Signed-off-by: Mengqing Cao <[email protected]>
Co-authored-by: Mengqing Cao <[email protected]>
1 parent 5b1da4e commit eb4c08f
File tree
5 files changed, +9 -35 lines changed
- csrc/matmul_allreduce_add_rmsnorm/op_host
- vllm_ascend
- spec_decode
- worker
| File (in diff order) | Removed lines (original numbering) | Added lines (new numbering) | Additions | Deletions |
|---|---|---|---|---|
| 1 | 29-32 | 51-54 | 4 | 4 |
| 2 | 67, 301-302 | 67 | 1 | 3 |
| 3 | 148 | none | 0 | 1 |
| 4 | 296 | none | 0 | 1 |
| 5 | 247-248, 341-358, 389-390, 400, 402, 404-405 | 371, 378-379, 3375 | 4 | 26 |