You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -344,7 +346,7 @@ The configuration compilation_config = { "cudagraph_mode": "FULL_DECODE_ONLY"} i
344
346
### 8. Asynchronous Scheduling
345
347
Asynchronous scheduling is a technique used to optimize inference efficiency. It allows non-blocking task scheduling to improve concurrency and throughput, especially when processing large-scale models.
346
348
347
-
This optimization is enabled by setting `--async-scheduling`.
349
+
This optimization is enabled by setting `--async-scheduling`.
0 commit comments