
Commit 749b183

[ModelRunner][Refactor] Refactor kv cache tensor initialization logic (vllm-project#3106)
### What this PR does / why we need it?
Refactor kv cache tensor initialization logic.
1. Unify the kv cache tensor initialization logic of deepseek and normal models
2. Split `initialize_kv_cache_tensors` into `_allocate_kv_cache_tensors` and `_reshape_kv_cache_tensors`, following the gpu modelrunner in vllm

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing tests.
1. prefill disaggregation scenario
4. deepseek + aclgraph/eager mode
5. qwen3 next

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

---------

Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: luolun <[email protected]>
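
The allocate/reshape split named in the commit message can be illustrated with a minimal, standard-library-only sketch. This is not the code from this commit or from vLLM: the helper names, the `(2, num_blocks, block_size, num_kv_heads, head_size)` layout, and the use of `bytearray`/`memoryview` in place of device tensors are all assumptions made for illustration.

```python
import struct
from math import prod

# Size of one element for the "f" (float32) format used by memoryview.cast.
ELEM_BYTES = struct.calcsize("f")


def allocate_kv_cache_buffers(layer_names, bytes_per_layer):
    """Step 1 (hypothetical helper): one flat, zeroed byte buffer per layer."""
    return {name: bytearray(bytes_per_layer) for name in layer_names}


def reshape_kv_cache_buffers(raw_buffers, shape):
    """Step 2 (hypothetical helper): view each flat buffer in the layout the
    attention code expects, without copying the underlying memory."""
    expected = prod(shape) * ELEM_BYTES
    views = {}
    for name, buf in raw_buffers.items():
        if len(buf) != expected:
            raise ValueError(
                f"{name}: buffer has {len(buf)} bytes, shape needs {expected}"
            )
        views[name] = memoryview(buf).cast("f", shape=list(shape))
    return views


def initialize_kv_cache(layer_names, shape):
    """Compose the two steps, mirroring the allocate-then-reshape structure."""
    raw = allocate_kv_cache_buffers(layer_names, prod(shape) * ELEM_BYTES)
    return reshape_kv_cache_buffers(raw, shape)


# Assumed layout: (key/value, num_blocks, block_size, num_kv_heads, head_size).
kv_shape = (2, 4, 16, 2, 8)
caches = initialize_kv_cache(["layer.0", "layer.1"], kv_shape)
```

Keeping allocation separate from reshaping means a model-specific layout only has to change the second step, which is one plausible reason a refactor like this makes it easier to share one code path across model families.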
1 parent 239774b commit 749b183

1 file changed: +146 -222 lines changed


0 commit comments
