-* Mooncake:[AscendTransport/Mooncake at pooling-async-memcpy](https://github.com/AscendTransport/Mooncake/tree/pooling-async-memcpy)(Currently available branch code, continuously updated.)
-Installation and Compilation Guide:https://github.com/AscendTransport/Mooncake/tree/pooling-async-memcpy?tab=readme-ov-file#build-and-use-binaries
+* Mooncake: main branch
+
+Installation and Compilation Guide: https://github.com/kvcache-ai/Mooncake?tab=readme-ov-file#build-and-use-binaries
+
+Make sure to build with `-DUSE_ASCEND_DIRECT=ON` to enable the ADXL engine.
+
+An example command for compiling ADXL:
+
+`rm -rf build && mkdir -p build && cd build && cmake .. -DCMAKE_INSTALL_PREFIX=/opt/transfer-engine/ -DCMAKE_POLICY_VERSION_MINIMUM=3.5 -DUSE_ASCEND_DIRECT=ON -DBUILD_SHARED_LIBS=ON -DBUILD_UNIT_TESTS=OFF && make -j && make install`
+
+After compilation, either add the directory containing the built .so files to the library path, for example `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib64/python3.11/site-packages/mooncake`, or copy the .so files to the `/usr/local/lib64` directory.

 ### KV Pooling Parameter Description
-**kv_connector_extra_config**:Additional Configurable Parameters for Pooling
-**mooncake_rpc_port**:Port for RPC Communication Between Pooling Scheduler Process and Worker Process: Each Instance Requires a Unique Port Configuration.
-**load_async**:Whether to Enable Asynchronous Loading. The default value is false.
-**register_buffer**:Whether to Register Video Memory with the Backend. Registration is Not Required When Used with MooncakeConnectorV1; It is Required in All Other Cases. The Default Value is false.
+**kv_connector_extra_config**: Additional configurable parameters for pooling.
+**mooncake_rpc_port**: Port for RPC communication between the pooling scheduler process and the worker process. Each instance requires a unique port.
+**load_async**: Whether to enable asynchronous loading. The default value is false.
+**register_buffer**: Whether to register video memory with the backend. Registration is not required when used with MooncakeConnectorV1; it is required in all other cases. The default value is false.
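
As a rough illustration of the parameters above, the sketch below builds one `kv_connector_extra_config` dictionary per instance and gives each a distinct `mooncake_rpc_port`. The helper name `make_extra_configs`, the base port 50051, the integer port type, and the scheme of adding the instance index to a base port are all assumptions made for this example; they are not part of Mooncake or of the connector.

```python
import json


def make_extra_configs(num_instances, base_port=50051,
                       load_async=False, register_buffer=False):
    """Return one kv_connector_extra_config dict per instance.

    Hypothetical helper: ports are base_port, base_port + 1, ... so that
    every instance ends up with a unique mooncake_rpc_port.
    """
    configs = []
    for index in range(num_instances):
        configs.append({
            # RPC port between the pooling scheduler and worker processes;
            # it must differ between instances.
            "mooncake_rpc_port": base_port + index,
            # Asynchronous loading; the documented default is false.
            "load_async": load_async,
            # Register video memory with the backend; the documented
            # default is false.
            "register_buffer": register_buffer,
        })
    return configs


if __name__ == "__main__":
    for config in make_extra_configs(2):
        print(json.dumps(config))
```

Deriving the port from an index is only one way to satisfy the uniqueness requirement; any allocation that never hands the same port to two instances works equally well.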

 ## run mooncake master

@@ -29,26 +38,32 @@ The environment variable **MOONCAKE_CONFIG_PATH** is configured to the full path
     "metadata_server": "P2PHANDSHAKE",
     "protocol": "ascend",
     "device_name": "",
+    "use_ascend_direct": true,
+    "alloc_in_same_node": true,
     "master_server_address": "xx.xx.xx.xx:50088",
     "global_segment_size": 30000000000
 }
 ```

-**local_hostname**: Configured as the IP address of the current master node,
-**metadata_server**: Configured as **P2PHANDSHAKE**,
-**protocol:** Configured for Ascend to use Mooncake's HCCL communication,
-**device_name**: ""
-**master_server_address**: Configured with the IP and port of the master service
-**global_segment_size**: Expands the kvcache size registered by the PD node to the master
+**local_hostname**: Configured as the IP address of the current master node.
+**metadata_server**: Configured as **P2PHANDSHAKE**.
+**protocol**: Configured as `ascend` to use Mooncake's HCCL communication.
+**device_name**: Left as an empty string ("").
+**use_ascend_direct**: Whether to use the ADXL engine.
+**alloc_in_same_node**: Whether to prefer allocating buffers on the local node.
+**master_server_address**: Configured with the IP and port of the master service.
+**global_segment_size**: Expands the KV cache size registered by the PD node to the master.
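
To make the field descriptions above concrete, here is a minimal sketch of a sanity check for the JSON file that `MOONCAKE_CONFIG_PATH` points to. The function name `check_mooncake_config`, the choice of which keys to treat as required, and the specific checks are assumptions for illustration; Mooncake itself may validate the file differently or not at all.

```python
import json

# Keys described in this guide; treating them all as required is an assumption.
REQUIRED_KEYS = (
    "local_hostname", "metadata_server", "protocol", "device_name",
    "master_server_address", "global_segment_size",
)
# Boolean switches added for the ADXL engine; treated as optional here.
OPTIONAL_BOOL_KEYS = ("use_ascend_direct", "alloc_in_same_node")


def check_mooncake_config(text):
    """Parse a JSON config string and return a list of problems found."""
    config = json.loads(text)
    problems = [f"missing key: {key}"
                for key in REQUIRED_KEYS if key not in config]
    if config.get("metadata_server") != "P2PHANDSHAKE":
        problems.append("metadata_server should be P2PHANDSHAKE")
    if config.get("protocol") != "ascend":
        problems.append("protocol should be ascend")
    # master_server_address is expected in the form IP:port.
    address = str(config.get("master_server_address", ""))
    host, sep, port = address.rpartition(":")
    if not (sep and host and port.isdigit()):
        problems.append("master_server_address should look like IP:port")
    size = config.get("global_segment_size")
    if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
        problems.append("global_segment_size should be a positive integer")
    for key in OPTIONAL_BOOL_KEYS:
        if key in config and not isinstance(config[key], bool):
            problems.append(f"{key} should be true or false")
    return problems


# Placeholder addresses are kept as in the guide; replace them with real ones.
EXAMPLE = json.dumps({
    "local_hostname": "xx.xx.xx.xx",
    "metadata_server": "P2PHANDSHAKE",
    "protocol": "ascend",
    "device_name": "",
    "use_ascend_direct": True,
    "alloc_in_same_node": True,
    "master_server_address": "xx.xx.xx.xx:50088",
    "global_segment_size": 30000000000,
})

if __name__ == "__main__":
    print(check_mooncake_config(EXAMPLE) or "config looks consistent")
```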
 `eviction_high_watermark_ratio` sets the watermark at which Mooncake Store performs eviction, and `eviction_ratio` sets the portion of stored objects that is evicted.
+
 ## Pooling and Prefill Decode Disaggregate Scenario
 # The environment variable above is the switch for mooncake memory swap logging, where 1 indicates enabled and 0 indicates disabled.
-export ASCEND_AGGREGATE_ENABLE=1
-# The upper-level environment variable is the switch for enabling the mooncake aggregation function, where 1 means on and 0 means off.
+export ASCEND_BUFFER_POOL=4:8
+# ASCEND_BUFFER_POOL configures the number and size of the buffers on the NPU device used for aggregation and KV transfer; the value 4:8 allocates 4 buffers of 8 MB each.
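
The `count:size` format of `ASCEND_BUFFER_POOL` can be checked before launch with a few lines of Python. This is a sketch only: the helper `parse_buffer_pool` is hypothetical, and the requirement that both parts be positive integers is an assumption inferred from the `4:8` example rather than a documented rule.

```python
def parse_buffer_pool(value):
    """Split a 'count:size_mb' string such as '4:8' into two positive ints."""
    count_text, sep, size_text = value.partition(":")
    if not sep:
        raise ValueError(f"expected count:size_mb, got {value!r}")
    # int() raises ValueError on non-numeric parts such as '8:1' or ''.
    count, size_mb = int(count_text), int(size_text)
    if count <= 0 or size_mb <= 0:
        raise ValueError("buffer count and size must both be positive")
    return count, size_mb


if __name__ == "__main__":
    count, size_mb = parse_buffer_pool("4:8")
    # 4 buffers of 8 MB each reserve 32 MB on the device in total.
    print(f"{count} buffers x {size_mb} MB = {count * size_mb} MB")
```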