Commit 332b547
authored
[Bugfix] support mtp kv transfer and pp partition by hand in kv transfer (#4892)
### What this PR does / why we need it?
The current mooncake connector has the following problems when PP and MTP
are enabled:
1. The KV caches of MTP layers are not transferred, which may lower the
accept ratio. This PR adds the MTP layer indices for the last PP stage after
calculating `end_layer` in `transfer_kv_cache`.
2. With MTP enabled, the default division of layers across PP stages may be
imbalanced, so the `VLLM_PP_LAYER_PARTITION` environment variable is needed
to balance the stages by hand. However, in mooncake connector KV transfer,
the decode node does not know the partition used by the prefill node. This
PR adds a `pp_layer_partition` option to `kv_connector_extra_config` so that
the decode node can acquire the prefill node's partition information.
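
The two fixes can be illustrated with a minimal sketch. It is not the connector's actual code: the function name `stage_layer_indices`, its parameters, and the even-split fallback are assumptions made for illustration, and the sketch assumes that MTP layers are indexed directly after the main model layers.

```python
from typing import List, Optional


def stage_layer_indices(
    num_layers: int,
    pp_size: int,
    pp_rank: int,
    num_mtp_layers: int = 0,
    partition: Optional[str] = None,
) -> List[int]:
    """Return the layer indices whose KV caches one PP stage would transfer.

    Hypothetical helper for illustration only.
    """
    if partition is not None:
        # Manual partition in the same "33,28" form as VLLM_PP_LAYER_PARTITION.
        counts = [int(part) for part in partition.split(",")]
        if len(counts) != pp_size or sum(counts) != num_layers:
            raise ValueError(f"invalid partition {partition!r}")
    else:
        # Assumed fallback: a simple even split, remainder to the earliest
        # stages. The real default partition in vLLM may differ.
        base, extra = divmod(num_layers, pp_size)
        counts = [base + (1 if rank < extra else 0) for rank in range(pp_size)]

    start_layer = sum(counts[:pp_rank])
    end_layer = start_layer + counts[pp_rank]
    indices = list(range(start_layer, end_layer))

    # Only the last PP stage holds the MTP layers, so their indices are
    # appended after end_layer has been computed.
    if pp_rank == pp_size - 1:
        indices.extend(range(num_layers, num_layers + num_mtp_layers))
    return indices
```

With 61 main layers, one MTP layer, and the partition `33,28`, stage 0 would cover layers 0 to 32 and stage 1 would cover layers 33 to 60 plus the MTP layer at index 61.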
### Does this PR introduce _any_ user-facing change?
When the prefill node uses the `VLLM_PP_LAYER_PARTITION` environment
variable, add `pp_layer_partition` to `kv_connector_extra_config` as shown
below:
```
export VLLM_PP_LAYER_PARTITION=33,28

"kv_connector_extra_config": {
    "use_ascend_direct": true,
    "prefill": {
        "dp_size": 1,
        "tp_size": 8,
        "pp_size": 2,
        "pp_layer_partition": "33,28"
    },
    "decode": {
        "dp_size": 16,
        "tp_size": 1,
        "pp_size": 1
    }
}
```
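
As a rough sketch of how a decode node could read this setting, the snippet below parses the `prefill` section of the extra config. The helper name `prefill_layer_counts` and its validation rule are hypothetical and are not taken from the connector code.

```python
import json
from typing import List, Optional

# Same values as the example configuration above.
EXTRA_CONFIG = json.loads("""
{
    "use_ascend_direct": true,
    "prefill": {"dp_size": 1, "tp_size": 8, "pp_size": 2,
                "pp_layer_partition": "33,28"},
    "decode": {"dp_size": 16, "tp_size": 1, "pp_size": 1}
}
""")


def prefill_layer_counts(extra_config: dict) -> Optional[List[int]]:
    """Return the per-stage layer counts of the prefill node, if configured.

    Hypothetical helper for illustration only.
    """
    prefill = extra_config.get("prefill", {})
    partition = prefill.get("pp_layer_partition")
    if partition is None:
        # No manual partition configured on the prefill side.
        return None
    counts = [int(part) for part in partition.split(",")]
    pp_size = int(prefill.get("pp_size", 1))
    if len(counts) != pp_size:
        raise ValueError(
            f"pp_layer_partition has {len(counts)} entries, expected {pp_size}"
        )
    return counts


counts = prefill_layer_counts(EXTRA_CONFIG)
```

Checking the number of entries against `pp_size` catches a mismatch between the two settings early, before any KV transfer is attempted.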
### How was this patch tested?
- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e
---------
Signed-off-by: lidenghui <[email protected]>

1 parent a47aa4d commit 332b547
2 files changed (+71, -14):
- tests/ut/kv_connector
- vllm_ascend/distributed