Commit 9bd77b9

[v0.11.0][P/D]Set adxl as default backend and update readme (vllm-project#3771)
### What this PR does / why we need it?

Set the adxl engine as the default Mooncake backend, because Ascend Transport is no longer maintained. Update the README to include instructions for installing Mooncake with the adxl backend.

### Does this PR introduce _any_ user-facing change?

Users need to compile and install Mooncake with the adxl backend according to the revised README instructions.

### How was this patch tested?

By CI.

---------

Signed-off-by: nwpu-zxr <[email protected]>
1 parent 33bee2a commit 9bd77b9

3 files changed: 14 additions and 45 deletions


docs/source/tutorials/multi_node_pd_disaggregation_mooncake.md

Lines changed: 12 additions & 43 deletions
@@ -57,7 +57,7 @@ for i in {0..15}; do hccn_tool -i $i -ping -g address x.x.x.x;done
 Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. First, we need to obtain the Mooncake project. Refer to the following command:

 ```shell
-git clone -b pooling_async_memecpy_v1 https://github.com/AscendTransport/Mooncake
+git clone https://github.com/kvcache-ai/Mooncake.git
 ```

 Update and install Python.
@@ -67,22 +67,25 @@ apt-get update
 apt-get install python3
 ```

-Install the relevant dependencies. The installation of Go is not required.
+Modify Mooncake compilation option

 ```shell
 cd Mooncake
-bash dependencies.sh -y
+vi mooncake-common/common.cmake
+# find this row and set USE_ASCEND_DIRECT ON.
+option(USE_ASCEND_DIRECT "option for using ascend npu with adxl engine" ON)
 ```

 Install mpi

 ```shell
-apt purge mpich libmpich-dev -y
-apt purge openmpi-bin -y
-apt purge openmpi-bin libopenmpi-dev -y
-apt install mpich libmpich-dev -y
-export CPATH=/usr/lib/aarch64-linux-gnu/mpich/include/:$CPATH
-export CPATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$CPATH
+apt-get install mpich libmpich-dev -y
+```
+
+Install the relevant dependencies. The installation of Go is not required.
+
+```shell
+bash dependencies.sh -y
 ```

 Compile and install
@@ -93,8 +96,6 @@ cd build
 cmake ..
 make -j
 make install
-cp mooncake-transfer-engine/src/transport/ascend_transport/hccl_transport/ascend_transport_c/libascend_transport_mem.so /usr/local/Ascend/ascend-toolkit/latest/python/site-packages/
-cp mooncake-transfer-engine/src/libtransfer_engine.so /usr/local/Ascend/ascend-toolkit/latest/python/site-packages/
 ```

 ## Prefiller/Decoder Deployment
## Prefiller/Decoder Deployment
@@ -119,10 +120,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=1024
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1 # enable aggregated transmission
-export ASCEND_TRANSPORT_PRINT=0 # print ascend transport logs
-export ACL_OP_INIT_MODE=1 # acl op initialization mode to prevent device id acquisition failure
-export ASCEND_A3_ENABLE=1 # enable hccs transmission for A3; set to 0 for A2
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -178,10 +175,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=1024
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -237,10 +230,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=2048
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -298,10 +287,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=2048
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -366,10 +351,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=1024
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -425,10 +406,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=1024
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -484,10 +461,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=2048
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
@@ -545,10 +518,6 @@ export VLLM_USE_V1=1
 export HCCL_BUFFSIZE=2048
 export OMP_PROC_BIND=false
 export OMP_NUM_THREADS=10
-export ASCEND_AGGREGATE_ENABLE=1
-export ASCEND_TRANSPORT_PRINT=0
-export ACL_OP_INIT_MODE=1
-export ASCEND_A3_ENABLE=1
 export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:$LD_LIBRARY_PATH

 vllm serve /model/Qwen3-235B-A22B-W8A8 \
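
Reviewer note: every launch script in the tutorial drops the same four exports. The commit does not say whether leaving them set is harmful with the adxl backend, so the sketch below only reports which of them are still present in an environment. The helper name `leftover_vars` is hypothetical.

```python
import os
from typing import Mapping

# The four variables this commit removes from each launch script in the tutorial.
REMOVED_VARS = (
    "ASCEND_AGGREGATE_ENABLE",
    "ASCEND_TRANSPORT_PRINT",
    "ACL_OP_INIT_MODE",
    "ASCEND_A3_ENABLE",
)


def leftover_vars(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the removed variables that are still set in env, with their values."""
    return {name: env[name] for name in REMOVED_VARS if name in env}


# Example with an explicit mapping instead of the real process environment.
print(leftover_vars({"ASCEND_A3_ENABLE": "1", "HCCL_BUFFSIZE": "1024"}))
```

Calling `leftover_vars()` with no argument inspects the current process environment, which can help when migrating an existing launch script to the revised tutorial.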

vllm_ascend/distributed/mooncake_connector.py

Lines changed: 1 addition & 1 deletion
@@ -899,7 +899,7 @@ def __init__(self, vllm_config: VllmConfig, engine_id: str):
         self.device_id = device_ids[self.tp_rank]  # type: ignore

         if vllm_config.kv_transfer_config.get_from_extra_config(
-                'use_ascend_direct', False):
+                'use_ascend_direct', True):
             hostname = self.side_channel_host
         else:
             hostname = f"{self.side_channel_host}:0:npu_{self.device_id}"

vllm_ascend/distributed/mooncake_layerwise_connector.py

Lines changed: 1 addition & 1 deletion
@@ -656,7 +656,7 @@ def __init__(self, vllm_config: VllmConfig, engine_id: str):
         self.device_id = device_ids[self.tp_rank]  # type: ignore

         if vllm_config.kv_transfer_config.get_from_extra_config(
-                'use_ascend_direct', False):
+                'use_ascend_direct', True):
             hostname = self.side_channel_host
         else:
             hostname = f"{self.side_channel_host}:0:npu_{self.device_id}"
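
Reviewer note: both connector files make the same one-word change, flipping the fallback for `use_ascend_direct` from `False` to `True`. The sketch below illustrates the effect on the hostname, assuming `get_from_extra_config(key, default)` behaves like a dictionary lookup with a default on the connector's extra config. `FakeKVTransferConfig` and `pick_hostname` are hypothetical stand-ins written for this note, not vLLM classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeKVTransferConfig:
    """Hypothetical stand-in for the KV transfer config; not the real vLLM class."""

    kv_connector_extra_config: dict[str, Any] = field(default_factory=dict)

    def get_from_extra_config(self, key: str, default: Any) -> Any:
        # Assumption: a plain lookup with a fallback value.
        return self.kv_connector_extra_config.get(key, default)


def pick_hostname(cfg: FakeKVTransferConfig, side_channel_host: str, device_id: int) -> str:
    """Mirror the branch in the diff, using the new default of True."""
    if cfg.get_from_extra_config("use_ascend_direct", True):
        return side_channel_host
    return f"{side_channel_host}:0:npu_{device_id}"


# Key absent: the new default selects the adxl-style hostname.
print(pick_hostname(FakeKVTransferConfig(), "10.0.0.1", 3))
# Key explicitly False: the previous hostname format is still reachable.
print(pick_hostname(FakeKVTransferConfig({"use_ascend_direct": False}), "10.0.0.1", 3))
```

Under this assumption, deployments that never set `use_ascend_direct` change behavior after this commit, while those that set it explicitly keep whatever value they chose.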
