The vLLM variant supports the following build arguments:

- **VLLM_VERSION**: The vLLM version to install (default: `0.11.0`)
- **VLLM_COMMIT_SHA**: The git commit SHA corresponding to the vLLM version (default: `b8b302cde434df8c9289a2b465406b47ebab1c2d` for v0.11.0)
- **VLLM_CUDA_VERSION**: The CUDA version suffix for the wheel (default: `cu129`)
- **VLLM_PYTHON_TAG**: The Python compatibility tag (default: `cp38-abi3`, compatible with Python 3.8+)

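The build arguments listed above can be overridden at build time with Docker's standard `--build-arg NAME=VALUE` flag. The following is a minimal sketch, not part of the project itself, that assembles such a command line from the documented defaults. The helper name `docker_build_command` and the `.` build context are assumptions made for illustration, and any additional flags needed to select the vLLM variant are deliberately left out.

```python
import shlex

# Defaults copied from the build-argument list above.
VLLM_BUILD_ARG_DEFAULTS = {
    "VLLM_VERSION": "0.11.0",
    "VLLM_COMMIT_SHA": "b8b302cde434df8c9289a2b465406b47ebab1c2d",
    "VLLM_CUDA_VERSION": "cu129",
    "VLLM_PYTHON_TAG": "cp38-abi3",
}


def docker_build_command(overrides=None, context="."):
    """Return a `docker build` argv with one --build-arg per vLLM setting.

    Hypothetical helper: `context` (the build context path) is an assumption,
    and flags that select the vLLM variant itself are not included.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(VLLM_BUILD_ARG_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown build argument(s): {sorted(unknown)}")

    # Overrides win over the documented defaults.
    args = {**VLLM_BUILD_ARG_DEFAULTS, **overrides}

    command = ["docker", "build"]
    for name, value in args.items():
        command += ["--build-arg", f"{name}={value}"]
    command.append(context)
    return command


if __name__ == "__main__":
    # Print a shell-quoted command line using only the documented defaults.
    print(shlex.join(docker_build_command()))
```

When overriding `VLLM_VERSION`, the matching `VLLM_COMMIT_SHA` should be overridden as well, since the list above ties the two together.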
#### Multi-Architecture Support
The vLLM variant supports both x86_64 (amd64) and aarch64 (arm64) architectures. The build process automatically selects the appropriate prebuilt wheel:
The vLLM wheels are sourced from the official vLLM wheel repository at `https://wheels.vllm.ai/{commit_sha}/vllm/`, which provides prebuilt wheels for every commit.
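To make the URL pattern above concrete, here is a small sketch that fills in the `{commit_sha}` placeholder and normalizes a machine name to one of the two supported architectures. The function names and the alias table are illustrative assumptions. The sketch does not try to reproduce the wheel filenames or the selection logic used by the actual build.

```python
import platform

# URL pattern quoted in the text above.
WHEEL_INDEX_TEMPLATE = "https://wheels.vllm.ai/{commit_sha}/vllm/"

# Map common machine names onto the two architectures named in the text.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(machine=None):
    """Return 'amd64' or 'arm64' for a machine name (default: this host)."""
    machine = (machine or platform.machine()).lower()
    try:
        return ARCH_ALIASES[machine]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine!r}") from None


def wheel_index_url(commit_sha):
    """Fill the commit SHA into the wheel repository URL pattern."""
    if not commit_sha:
        raise ValueError("commit_sha must be a non-empty string")
    return WHEEL_INDEX_TEMPLATE.format(commit_sha=commit_sha)


if __name__ == "__main__":
    sha = "b8b302cde434df8c9289a2b465406b47ebab1c2d"  # documented default for v0.11.0
    print(normalize_arch(), wheel_index_url(sha))
```

Rejecting unknown architectures early keeps a build from silently falling back to a wheel that does not match the host.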
## API Examples
The Model Runner exposes a REST API over a TCP port. You can interact with it using `curl` commands.