The vLLM variant supports the following build arguments:
- **VLLM_VERSION**: The vLLM version to install (default: `0.11.0`)
- **VLLM_CUDA_VERSION**: The CUDA version suffix for the wheel (default: `cu129`)
- **VLLM_PYTHON_TAG**: The Python compatibility tag (default: `cp38-abi3`, compatible with Python 3.8+)
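
As a rough illustration of how these build arguments could be passed, the sketch below assembles a `docker build` command line that sets each one with `--build-arg`. The image tag `model-runner:vllm`, the build context `.`, and the helper name `docker_build_command` are hypothetical; any target or variant selection the Dockerfile may need is not shown here and would have to be added.

```python
import shlex

# Defaults as listed in the build arguments above.
DEFAULT_BUILD_ARGS = {
    "VLLM_VERSION": "0.11.0",
    "VLLM_CUDA_VERSION": "cu129",
    "VLLM_PYTHON_TAG": "cp38-abi3",
}


def docker_build_command(overrides=None, tag="model-runner:vllm", context="."):
    """Return a `docker build` argv that sets every vLLM build argument.

    `tag` and `context` are placeholder values, not names from the project.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_BUILD_ARGS)
    if unknown:
        raise ValueError(f"unknown build arguments: {sorted(unknown)}")

    args = {**DEFAULT_BUILD_ARGS, **overrides}
    cmd = ["docker", "build"]
    for name, value in args.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd += ["-t", tag, context]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_build_command({"VLLM_CUDA_VERSION": "cu129"})))
```

Printing the command rather than executing it keeps the sketch free of side effects; the resulting list could be handed to `subprocess.run` once the tag and context match the actual repository layout.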
#### Multi-Architecture Support
The vLLM variant supports both x86_64 (amd64) and aarch64 (arm64) architectures. The build process automatically selects the appropriate prebuilt wheel for the target architecture.

The vLLM wheels are sourced from the official vLLM GitHub Releases at `https://github.com/vllm-project/vllm/releases`, which provide prebuilt wheels for each release version.
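
To make the wheel selection concrete, here is a minimal sketch of how a download URL might be derived from the build arguments and the target architecture. The platform tags in `PLATFORM_TAGS`, the wheel filename pattern, and the `download/v<version>/` path are assumptions for illustration, not taken from the project's Dockerfile; check them against the assets listed on the releases page before relying on them.

```python
# Assumed mapping from Docker architecture name to wheel platform tag.
PLATFORM_TAGS = {
    "amd64": "manylinux1_x86_64",
    "arm64": "manylinux2014_aarch64",
}

RELEASES_URL = "https://github.com/vllm-project/vllm/releases"


def wheel_url(arch, version="0.11.0", cuda="cu129", python_tag="cp38-abi3"):
    """Build a candidate wheel URL for the given architecture.

    The filename layout below is an assumed pattern, not a documented one.
    """
    try:
        platform_tag = PLATFORM_TAGS[arch]
    except KeyError:
        raise ValueError(f"unsupported architecture: {arch!r}") from None

    filename = f"vllm-{version}+{cuda}-{python_tag}-{platform_tag}.whl"
    return f"{RELEASES_URL}/download/v{version}/{filename}"


if __name__ == "__main__":
    for arch in ("amd64", "arm64"):
        print(arch, wheel_url(arch))
```

Keeping the architecture mapping in one dictionary means an unsupported target fails early with a clear error instead of producing a URL that does not exist.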
## API Examples
The Model Runner exposes a REST API that is accessible over a TCP port. You can interact with it using curl commands.