
Commit 06f6cc1

wujinyuan1 and wjy9595 authored
[Bugfix]Fix the hang issue of multimodal model when running with DP>1 (#4392)
### What this PR does / why we need it?

When `cudagraph_mode` is set to `FULL_DECODE_ONLY` and dp > 1, the dummy-run path is triggered. The `update_attn_params` function needs the `num_tokens` parameter, and its value was derived from `positions.shape[0]`. However, multimodal models use mRope (multi-dimensional rotary positional embeddings), which makes `positions` a 2-D tensor, so `positions.shape[0]` no longer equals the token count and yields an incorrect value. We fix this by passing `num_tokens` directly instead of `positions.shape[0]`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

vLLM version: v0.11.0rc3
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@2918c1b

---------

Signed-off-by: wujinyuan1 <[email protected]>
Co-authored-by: wujinyuan1 <[email protected]>
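A minimal sketch of the shape mismatch described above (the tensor shapes here are illustrative, not taken verbatim from the vLLM code): with standard 1-D rotary positions, `positions.shape[0]` happens to equal the token count, but with mRope the positions tensor carries one row per rotary section, so `shape[0]` reports the number of sections instead.

```python
# Illustrative sketch only: shows why positions.shape[0] breaks under mRope.
import torch

num_tokens = 8

# Standard rotary positions: a 1-D tensor with one entry per token.
positions_1d = torch.arange(num_tokens)
assert positions_1d.shape[0] == num_tokens  # shape[0] coincides with the token count

# mRope positions: a 2-D tensor with one row per rotary section
# (e.g. temporal/height/width), so shape[0] is the section count, not num_tokens.
positions_mrope = torch.zeros(3, num_tokens, dtype=torch.long)
assert positions_mrope.shape[0] == 3  # wrong value if used as the token count

# The fix in this PR: pass the known token count directly, e.g.
# update_attn_params(self.update_stream, forward_context, num_tokens)
# instead of inferring it from positions.shape[0].
```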
1 parent 84eae97 commit 06f6cc1

File tree

1 file changed: +2 −3 lines changed


vllm_ascend/worker/model_runner_v1.py

Lines changed: 2 additions & 3 deletions
```diff
@@ -2810,16 +2810,15 @@ def _generate_dummy_run_hidden_states(self, with_prefill,
                 else:
                     # FIXME: Try using `auto_dispatch_capture=True`
                     update_mla_attn_params(self.update_stream, forward_context,
-                                           positions.shape[0],
-                                           self.speculative_config)
+                                           num_tokens, self.speculative_config)
             else:
                 if self.pcp_size * self.dcp_size > 1:
                     update_attn_dcp_pcp_params(self.update_stream,
                                                forward_context,
                                                positions.shape[0])
                 else:
                     update_attn_params(self.update_stream, forward_context,
-                                       positions.shape[0])
+                                       num_tokens)
 
         if self.drafter and self.drafter.name == SpecDcodeType.EAGLE3:
             hidden_states, _ = hidden_states
```
