vllm-stack-0.1.2
The stack deployment of vLLM
What's Changed
- [Feat] Add support to turn the engine deployment on/off by @dumb0002 #311 (see the values sketch after this list)
- [Feat] Add nodeSelectorTerms for router & cache servers by @kinoute #314 (see the sketch after this list)
- [Bugfix] Update the logger handler to handle stdout/stderr properly by @corona10 #320
- [CI] Always upload logs of the Helm functionality checks by @pwuersch #321
- [CI/Build] Remove the sudo requirement in CI/CD by @Shaoting-Feng #325
- [Feat] Create multiple Services when multiple models are specified by @lucas-tucker #326
- [CI] Add coverage tracking by @zhuohangu #330
- [CLI/Doc] Update the GKE deployment docs with GPU quota guidance by @EaminC #334
- [Bugfix] Fix thread creation to pass parameters properly by @corona10 #336
- [Feat] Add an OpenTelemetry support example by @lucas-tucker #346
- [Feat] Add tool-calling support for MCP client integration by @YuhanLiu11 #352
- [Benchmark] Add an API key option by @Kimdongui #354
- [Bugfix] Fix the init container PVC volume mount by @zerofishnoodles #359
- [Feat] Enable the latency monitor and add average-latency computation logic by @insukim1994 #362
- [Feat] Add a tutorial for deploying the production stack on AMD GPUs by @insukim1994 #364
- [Bugfix] Deprecate the least-loaded routing logic by @insukim1994 #366
- [Bugfix] Add the model name to the deployment selector by @TamKej #367
- [Feat] helm: add a routerSpec.serviceType value by @marquiz #368 (see the sketch after this list)
- [Feat] Support multi-model deployment with enhanced vLLM configurations by @haitwang-cloud #371 (see the multi-model sketch after this list)
- [Bugfix] Fix issues with the engine Service labels by @dumb0002 #376
- [Bugfix] Declare the logger properly in protocols.py by @corona10 #381
- [Feat] Add a tutorial for using vLLM v1 in the production stack by @YuhanLiu11 #390
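
For #311, here is a minimal values.yaml sketch of turning the engine deployment off while keeping the router. The `enableEngine` key is an assumption based on the PR title; check the chart's values.yaml for the exact flag name.

```yaml
# Sketch only: the exact key name under servingEngineSpec is assumed, not confirmed.
servingEngineSpec:
  enableEngine: false   # assumed flag from #311; skips rendering the engine Deployment
routerSpec:
  replicaCount: 1       # the router can still run on its own
```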
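
For #314, a sketch of steering the router onto specific nodes with nodeSelectorTerms. The key name comes from the PR title; its placement under routerSpec is an assumption.

```yaml
# Sketch: standard Kubernetes nodeSelectorTerms syntax; placement under routerSpec assumed.
routerSpec:
  nodeSelectorTerms:
    - matchExpressions:
        - key: kubernetes.io/arch   # well-known node label
          operator: In
          values: ["amd64"]
```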
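
#368 adds a routerSpec.serviceType value (the key is named in the PR title), which lets the router Service be exposed outside the cluster, for example:

```yaml
routerSpec:
  serviceType: LoadBalancer   # instead of the usual ClusterIP default; NodePort also works
```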
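
For the multi-model support in #371 (and the per-model Services from #326), a sketch of a two-model deployment. The modelSpec field names follow the chart's usual convention but should be verified against the released values.yaml; the model URLs are placeholders.

```yaml
# Sketch: field names assumed from the chart's modelSpec convention.
servingEngineSpec:
  modelSpec:
    - name: "llama3"                               # each entry gets its own Deployment and Service
      repository: "vllm/vllm-openai"
      tag: "latest"
      modelURL: "meta-llama/Llama-3.1-8B-Instruct" # placeholder model
      replicaCount: 1
      requestGPU: 1
    - name: "qwen"
      repository: "vllm/vllm-openai"
      tag: "latest"
      modelURL: "Qwen/Qwen2.5-7B-Instruct"         # placeholder model
      replicaCount: 1
      requestGPU: 1
```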