Commit a6ef3ac
[Performance] Pre-issued exponential distribution operator. (#4908)
Issue the exponential distribution operator in advance.
Result:
A single inference saves 200-300 microseconds.
Before:
<img width="2257" height="1058" alt="2"
src="https://github.com/user-attachments/assets/c1da19e2-a439-42cb-9d7c-c0218e61fd4c"
/>
After:
<img width="2211" height="342" alt="image"
src="https://github.com/user-attachments/assets/03c84292-c802-4755-949c-4266a9a72fc0"
/>
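For readers unfamiliar with the idea, here is a minimal sketch of why the exponential draw can be issued early. It assumes the operator in question is the Exp(1) noise used for exponential-race sampling, where the noise does not depend on the probabilities and so can be generated before they exist. The function names `pre_issue_exponential_noise` and `sample_with_noise` are hypothetical, and the sketch uses plain Python rather than the device streams or tensors the actual change would involve.

```python
import random


def pre_issue_exponential_noise(vocab_size, rng):
    """Draw Exp(1) noise ahead of time (hypothetical helper).

    The noise is independent of the probabilities, so it can be
    produced before the model output is available.
    """
    return [rng.expovariate(1.0) for _ in range(vocab_size)]


def sample_with_noise(probs, noise):
    """Exponential-race sampling with pre-drawn noise.

    Picks argmin_i noise[i] / probs[i] over entries with probs[i] > 0,
    which is equivalent to argmax_i probs[i] / noise[i] and returns
    index i with probability probs[i].
    """
    if len(probs) != len(noise):
        raise ValueError("probs and noise must have the same length")
    candidates = [i for i, p in enumerate(probs) if p > 0]
    if not candidates:
        raise ValueError("at least one probability must be positive")
    return min(candidates, key=lambda i: noise[i] / probs[i])


# Usage: the noise is drawn first, the probabilities arrive later.
rng = random.Random(0)
noise = pre_issue_exponential_noise(3, rng)   # issued early
probs = [0.2, 0.5, 0.3]                       # available only after the forward pass
token = sample_with_noise(probs, noise)
```

In this sketch the only work left on the critical path after the probabilities arrive is the division and the arg-min, which is presumably where the reported savings come from.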
- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e
---------
Signed-off-by: weijinqian_v1 <[email protected]>
Co-authored-by: weijinqian_v1 <[email protected]>
1 parent 0fbe083 commit a6ef3ac
File tree
3 files changed: +43 -3 lines changed
- tests/ut/sample
- vllm_ascend
  - sample