Commit 63561d6

[Fix] Sorts aclgraph batch sizes in ascending order (#4230)
### What this PR does / why we need it?

Sorts aclgraph batch sizes in ascending order, corresponding to vLLM [#26016](vllm-project/vllm#26016).

Ensures batch sizes for aclgraph are sorted ascending when aclgraph mode is enabled, improving consistency and compatibility with later logic that may depend on order.

### Does this PR introduce _any_ user-facing change?

None.

### How was this patch tested?

Waiting for #3886

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@2918c1b

Signed-off-by: Yizhou Liu <[email protected]>
1 parent e985432 commit 63561d6
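
The commit message says later logic may depend on the order of the batch sizes but does not name that logic. As a purely illustrative sketch, one kind of consumer that needs ascending order is a binary search for the smallest captured size that fits a request count. The helper `padded_batch_size` and the sample sizes are hypothetical and do not come from vllm-ascend.

```python
from bisect import bisect_left


def padded_batch_size(sorted_sizes, num_reqs):
    """Return the smallest size >= num_reqs, or None if nothing fits.

    Hypothetical helper: bisect_left is only correct when sorted_sizes
    is in ascending order, which is the property this commit establishes.
    """
    idx = bisect_left(sorted_sizes, num_reqs)
    return sorted_sizes[idx] if idx < len(sorted_sizes) else None


# Assumed example input; real capture sizes come from the compilation config.
capture_sizes = [8, 1, 4, 2]
aclgraph_batch_sizes = sorted(capture_sizes)
```

With an unsorted or descending list the same lookup could silently return a wrong size, which is why a fixed, documented order is useful.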

1 file changed: +8 -3 lines changed


vllm_ascend/worker/model_runner_v1.py

Lines changed: 8 additions & 3 deletions
@@ -521,8 +521,12 @@ def __init__(self, vllm_config: VllmConfig, device: torch.device):
             if self.speculative_config else 0)
 
         self.use_aclgraph = self._use_aclgraph()
-        self.aclgraph_batch_sizes = list(
-            reversed(self.compilation_config.cudagraph_capture_sizes))
+
+        # self.aclgraph_batch_sizes sorts in ascending order.
+        if (self.compilation_config.cudagraph_capture_sizes and
+                self.compilation_config.cudagraph_mode != CUDAGraphMode.NONE):
+            self.aclgraph_batch_sizes = sorted(
+                self.compilation_config.cudagraph_capture_sizes)
 
         self.uniform_decode_query_len = 1 if not self.speculative_config else \
             1 + self.speculative_config.num_speculative_tokens
@@ -4101,7 +4105,8 @@ def _capture_model(self):
             if aclgraph_mode.mixed_mode() != CUDAGraphMode.NONE:
                 aclgraph_runtime_mode = aclgraph_mode.mixed_mode()
 
-                compilation_cases = sorted(self.aclgraph_batch_sizes)
+                # make sure we capture the largest batch size first
+                compilation_cases = list(reversed(self.aclgraph_batch_sizes))
 
                 try:
                     self._capture_aclgraphs(
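
The two hunks together change where the ordering is decided: the stored list becomes ascending, and the capture loop reverses it so the largest batch size is still captured first. A minimal standalone sketch of that flow follows. `GraphMode` is a hypothetical stand-in for `CUDAGraphMode`, and the empty-list fallback is an assumption, since the diff does not show what happens when the condition is false.

```python
from enum import Enum


class GraphMode(Enum):
    """Hypothetical stand-in for CUDAGraphMode; only NONE matters here."""
    NONE = 0
    FULL = 1


def resolve_batch_sizes(capture_sizes, mode):
    """Mirror the guarded sort in __init__: ascending order when enabled."""
    if capture_sizes and mode != GraphMode.NONE:
        return sorted(capture_sizes)
    # Assumption: the diff does not show a fallback; an empty list is used
    # here only to keep the sketch self-contained.
    return []


def capture_order(batch_sizes):
    """Mirror _capture_model: iterate from the largest size to the smallest."""
    return list(reversed(batch_sizes))


sizes = resolve_batch_sizes([4, 16, 1, 8], GraphMode.FULL)
order = capture_order(sizes)
```

Keeping the stored list ascending and reversing only at capture time means the capture order no longer depends on how the configuration happened to list the sizes.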
