Commit 8fdb689
[BugFix] Refactor ACL graph size adjustment for speculative decoding (#4640)
### What this PR does / why we need it?
Move the logic for adjusting ACL graph capture sizes for speculative
decoding from the generic utility module into a dedicated method within
the compilation configuration.
This change improves code organization and encapsulation by making the
compilation configuration responsible for managing its own state. The
model runner now triggers this adjustment directly, providing the
necessary context.
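The diff body is not preserved on this page, so the following is only a minimal sketch of the shape of such a refactor, not the actual code from this commit. All names here (`CompilationConfigSketch`, `adjust_capture_sizes_for_spec_decode`, `ModelRunnerSketch`) are hypothetical, and the adjustment rule (scaling each capture size by the number of tokens per request) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CompilationConfigSketch:
    """Hypothetical stand-in for the compilation configuration."""

    # Illustrative default capture sizes; not taken from the commit.
    capture_sizes: list[int] = field(default_factory=lambda: [1, 2, 4, 8])

    def adjust_capture_sizes_for_spec_decode(self, tokens_per_request: int) -> None:
        """Adjust capture sizes in place so the config owns its own state.

        Assumed rule: with speculative decoding each request contributes
        `tokens_per_request` tokens, so every size is scaled by that factor.
        """
        if tokens_per_request <= 1:
            return  # no speculative tokens, nothing to adjust
        scaled = {size * tokens_per_request for size in self.capture_sizes}
        self.capture_sizes = sorted(scaled)


class ModelRunnerSketch:
    """Hypothetical model runner that supplies the context and triggers the adjustment."""

    def __init__(self, config: CompilationConfigSketch, num_speculative_tokens: int):
        self.config = config
        # One target token plus the speculative draft tokens per request.
        self.tokens_per_request = 1 + num_speculative_tokens

    def prepare_graph_capture(self) -> list[int]:
        # Previously a free utility function would mutate the config;
        # here the runner only passes context and the config updates itself.
        self.config.adjust_capture_sizes_for_spec_decode(self.tokens_per_request)
        return self.config.capture_sizes
```

In this sketch the runner never touches `capture_sizes` directly, which is the encapsulation benefit the description refers to.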
### Does this PR introduce _any_ user-facing change?
None.
### How was this patch tested?
None.
- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e
Signed-off-by: Yizhou Liu <[email protected]>
Co-authored-by: wangxiyuan <[email protected]>

1 parent 688b133, commit 8fdb689
2 files changed: +12 -31 lines changed

File 1 (file name and diff content not preserved):
- Original lines 574-593 removed (20 lines).

File 2 (file name and diff content not preserved):
- New lines 4030-4039 added (10 lines).
- Original lines 4125-4135 removed (11 lines) and replaced by new lines 4135-4136 (2 lines).