Commit 20b8fa9

typo
Signed-off-by: Pz1116 <[email protected]>
1 parent c1ca254 commit 20b8fa9

File tree

2 files changed, 5 additions and 5 deletions

docs/source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md

Lines changed: 2 additions & 2 deletions
@@ -45,7 +45,7 @@ By introducing KV Connector V1, users can seamlessly combine HBM-based Prefix Ca
 
 When used together with Mooncake PD (Prefill-Decode) Disaggregation, the KV Cache Pool can further decouple prefill and decode stages across devices or nodes.
 
-Currently, we only perform put and get operation of KV Pool for **Prefiil Nodes**, and Decode Nodes get their KV Cache from Mooncake P2P KV Connector, i.e. MooncakeConnector.
+Currently, we only perform put and get operation of KV Pool for **Prefill Nodes**, and Decode Nodes get their KV Cache from Mooncake P2P KV Connector, i.e. MooncakeConnector.
 
 The key benefit of doing this is that we can keep the gain in performance by computing less with Prefix Caching from HBM and KV Pool for Prefill Nodes while not sacrificing the data transfer efficiency between Prefill and Decode nodes with P2P KV Connector that transfer KV Caches between NPU devices directly.
 
@@ -80,4 +80,4 @@ The KV Connector methods that need to be implemented can be categorized into sch
 
 1. Currently, Mooncake Store for vLLM-Ascend only supports DRAM as the storage for KV Cache pool.
 
-2. For now, if we successfully looked up a key and found it exists, but failed to get it when calling KV Pool's get function, we just output a log indicating the get operation failed and keep going; hence, the accuracy of that specific request may be affected. gWe will handle this situation by falling back the request and re-compute everything assuming there's no prefix cache hit (or even better, revert only one block and keep using the Prefix Caches before that).
+2. For now, if we successfully looked up a key and found it exists, but failed to get it when calling KV Pool's get function, we just output a log indicating the get operation failed and keep going; hence, the accuracy of that specific request may be affected. We will handle this situation by falling back the request and re-compute everything assuming there's no prefix cache hit (or even better, revert only one block and keep using the Prefix Caches before that).

docs/source/user_guide/feature_guide/kv_pool_mooncake.md

Lines changed: 3 additions & 3 deletions
@@ -108,14 +108,14 @@ python3 -m vllm.entrypoints.openai.api_server \
 }
 }
 },
-{
+{
 "kv_connector": "MooncakeConnectorStoreV1",
 "kv_role": "kv_producer",
 "mooncake_rpc_port":"0"
 }
 ]
 }
-}' > p.log 2>&1
+}' > p.log 2>&1
 ```
 
 `decode` Node:
@@ -156,7 +156,7 @@ python3 -m vllm.entrypoints.openai.api_server \
 "kv_connector_extra_config": {
 "use_layerwise": false,
 "connectors": [
-{
+{
 "kv_connector": "MooncakeConnectorV1",
 "kv_role": "kv_consumer",
 "kv_port": "20002",
