Commit 749b183

[ModelRunner][Refactor] Refactor kv cache tensor initialization logic (vllm-project#3106)

### What this PR does / why we need it?

Refactor the KV cache tensor initialization logic.

1. Unify the KV cache tensor initialization logic of DeepSeek and normal models.
2. Split `initialize_kv_cache_tensors` into `_allocate_kv_cache_tensors` and `_reshape_kv_cache_tensors`, following the GPU model runner in vLLM.

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

CI passed with the existing tests.

1. prefill disaggregation scenario
4. deepseek + aclgraph/eager mode
5. qwen3 next

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

---------

Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: luolun <[email protected]>
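The allocate-then-reshape split described above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this commit: the function names, the `layer_sizes` mapping, the float32 element type, and the `(2, num_blocks, block_size, num_kv_heads, head_size)` layout are all assumptions chosen for illustration, and `memoryview` stands in for a device tensor.

```python
# Hypothetical sketch of a two-phase KV cache setup; not the vllm-ascend API.

FLOAT32_BYTES = 4  # assumed element size for this illustration


def allocate_kv_cache_buffers(layer_sizes):
    """Phase 1: allocate one flat, zero-filled byte buffer per layer."""
    return {name: bytearray(size) for name, size in layer_sizes.items()}


def reshape_kv_cache_buffers(raw_buffers, block_size, num_kv_heads, head_size):
    """Phase 2: view each flat buffer with an assumed 5-D KV cache layout."""
    bytes_per_block = 2 * block_size * num_kv_heads * head_size * FLOAT32_BYTES
    views = {}
    for name, buf in raw_buffers.items():
        if len(buf) == 0 or len(buf) % bytes_per_block != 0:
            raise ValueError(f"{name}: buffer size is not a whole number of blocks")
        num_blocks = len(buf) // bytes_per_block
        shape = [2, num_blocks, block_size, num_kv_heads, head_size]
        # cast() creates a view over the same memory; nothing is copied.
        views[name] = memoryview(buf).cast("f", shape=shape)
    return views


def initialize_kv_cache(layer_sizes, block_size, num_kv_heads, head_size):
    """Compose the two phases, mirroring the split the commit message describes."""
    raw = allocate_kv_cache_buffers(layer_sizes)
    return reshape_kv_cache_buffers(raw, block_size, num_kv_heads, head_size)


if __name__ == "__main__":
    size = 2 * 3 * 4 * 2 * 8 * FLOAT32_BYTES  # room for 3 blocks
    kv = initialize_kv_cache({"layer.0": size}, block_size=4, num_kv_heads=2, head_size=8)
    print(kv["layer.0"].shape)
```

Keeping allocation separate from reshaping means the sizing decision (how many bytes each layer gets) does not need to know the layout, so a model family with a different cache layout would only need a different reshape step.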

1 parent 239774b, commit 749b183

1 file changed, +146 -222 lines changed

vllm_ascend/worker/model_runner_v1.py
