Commit dd4c177

update comment
Signed-off-by: jesse <[email protected]>
1 parent d81f665 commit dd4c177

1 file changed: +3 -3 lines changed

vllm_ascend/worker/model_runner_v1.py

Lines changed: 3 additions & 3 deletions
@@ -3065,9 +3065,9 @@ def _build_drafter_prepare_inputs_torchair_param(self):
     def _to_list(self, sampled_token_ids: torch.Tensor) -> list[list[int]]:
         # This is a short term mitigation for issue mentioned in
         # https://github.com/vllm-project/vllm/issues/22754.
-        # `tolist` would trigger a cuda wise stream sync, which
-        # would block other copy ops from other cuda streams.
-        # A cuda event sync would avoid such a situation. Since
+        # `tolist` would trigger a npu wise stream sync, which
+        # would block other copy ops from other npu streams.
+        # A npu event sync would avoid such a situation. Since
         # this is in the critical path of every single model
         # forward loop, this has caused perf issue for a disagg
         # setup.
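
The diff only touches the comment; the body of `_to_list` is not shown. For readers unfamiliar with the pattern the comment describes, here is a minimal sketch of replacing the implicit device-wide sync of `tolist` with an event sync. It is not the code from this commit. The function name `to_list_with_event_sync` and the parameters `pinned_cpu_buffer` and `transfer_event` are hypothetical, and the sketch assumes the event object exposes `record()` and `synchronize()` as `torch.cuda.Event` does; whether the NPU event type behaves identically is an assumption.

```python
from typing import Any


def to_list_with_event_sync(
    sampled_token_ids: Any,   # device tensor of sampled token ids (assumed)
    pinned_cpu_buffer: Any,   # preallocated pinned CPU tensor (hypothetical)
    transfer_event: Any,      # event with record()/synchronize() (assumed)
) -> list:
    """Copy sampled ids to the host and wait only on that copy."""
    # Take a view of the host buffer sized to the current batch.
    pinned = pinned_cpu_buffer[: sampled_token_ids.shape[0]]
    # Enqueue an asynchronous device-to-host copy on the current stream.
    pinned.copy_(sampled_token_ids, non_blocking=True)
    # Mark the point in the stream right after the copy was enqueued.
    transfer_event.record()
    # Block the host until that point is reached. Only this event is
    # awaited, so copies on other streams are not held up.
    transfer_event.synchronize()
    # The data is already on the host, so tolist() needs no device sync.
    return pinned.tolist()
```

Calling `tolist` on the device tensor directly would force a sync that, per the comment, blocks copies on other streams. Waiting on a single recorded event limits the wait to the one transfer that matters here.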
