Commit 5fed166

[ModelRunner][Refactor] Refactor kv cache tensor initialization logic (#3106)
### What this PR does / why we need it?

Refactor kv cache tensor initialization logic.

1. Unify the kv cache tensor initialization logic of deepseek and normal models.
2. Split `initialize_kv_cache_tensors` into `_allocate_kv_cache_tensors` and `_reshape_kv_cache_tensors`, following the GPU model runner in vLLM.

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

CI passed with existing tests.

1. prefill disaggregation scenario
4. deepseek + aclgraph/eager mode
5. qwen3 next

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

---------

Signed-off-by: MengqingCao <[email protected]>
1 parent bedf223 commit 5fed166

1 file changed: +146 -222 lines changed

