1 file changed, +3 -3 lines changed
@@ -3065,9 +3065,9 @@ def _build_drafter_prepare_inputs_torchair_param(self):
     def _to_list(self, sampled_token_ids: torch.Tensor) -> list[list[int]]:
         # This is a short term mitigation for issue mentioned in
         # https://github.com/vllm-project/vllm/issues/22754.
-        # `tolist` would trigger a cuda wise stream sync, which
-        # would block other copy ops from other cuda streams.
-        # A cuda event sync would avoid such a situation. Since
+        # `tolist` would trigger a npu wise stream sync, which
+        # would block other copy ops from other npu streams.
+        # A npu event sync would avoid such a situation. Since
         # this is in the critical path of every single model
         # forward loop, this has caused perf issue for a disagg
         # setup.
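
The comment in this hunk describes replacing an implicit stream-wide sync with an event sync. Below is a minimal sketch of that pattern, not the code from this commit: the function name `to_list_with_event_sync` and its parameters are hypothetical. It assumes the caller has pre-allocated a pinned CPU buffer and a device event object that exposes `record()` and `synchronize()` (as `torch.cuda.Event` does; an NPU equivalent is assumed to behave the same way). The arguments are duck-typed, so the function itself does not import `torch`.

```python
from typing import Any


def to_list_with_event_sync(
    sampled_token_ids: Any,   # device tensor holding the sampled token ids
    pinned_cpu_buffer: Any,   # pre-allocated pinned host tensor (assumption)
    transfer_event: Any,      # device event with record()/synchronize() (assumption)
) -> list:
    """Copy sampled ids to the host and wait only on that one copy."""
    # Take a view of the host buffer that matches the current batch size.
    pinned = pinned_cpu_buffer[: sampled_token_ids.shape[0]]
    # Enqueue an asynchronous device-to-host copy on the current stream.
    pinned.copy_(sampled_token_ids, non_blocking=True)
    # Mark the point in the stream right after the copy was enqueued ...
    transfer_event.record()
    # ... and block the host until that point is reached, instead of
    # relying on the implicit sync that a device-side `tolist` performs.
    transfer_event.synchronize()
    # The data is now on the host, so this `tolist` needs no device sync.
    return pinned.tolist()
```

Waiting on the event makes the host block only until the recorded copy has finished, which is the property the comment relies on to avoid stalling copy operations on other streams.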