vllm-stack-0.1.3
The stack deployment of vLLM
Changes made
- [Feat] add extraVolumes and extraVolumeMounts options @BrianPark314 (#396)
- [Bugfix] fix(services): make post_request callback not dependent on semantic_cache @ant-ms (#399)
- [Feat] Support for manual scheduling of an engine pod @dumb0002 (#400)
- [Bugfix] set missing argument type @googs1025 (#401)
- [Feat] add sentry sdk and cli args @pwuersch (#395)
- [Doc] Added documentation about uninstalling a previous minikube installation @insukim1994 (#405)
- [Feat] KV cache aware routing @YuhanLiu11 (#403)
- [Feat] add event when Reconciling configmap failed @googs1025 (#402)
- [Misc] Update helm chart for v1 @YuhanLiu11 (#412)
- [Bugfix] fix(parser): fix dynamic config not working @max-wittig (#413)
- [Feat] add model aliases @max-wittig (#397)
- [Misc] use schema https://json-schema.org/draft/2020-12/schema @sh1ng (#423)
- [Feat] Add initial CRD support for production stack @royyhuang (#415)
- [Feat] Prefix aware routing implementation based on hash trie @KuntaiDu (#432)
- [Feat] Simple Gateway inference extension integration @YuhanLiu11 (#436)
- [Feat] Adding support for disaggregated prefill based on vLLM v1 @YuhanLiu11 (#435)
- refactor: Replace services list with a single service object @googs1025 (#409)
- [Feat][Router] add static-model-types argument @max-wittig (#430)
- [CI/CD] Adding CI/CD tests for CRDs @YuhanLiu11 (#452)
- Switch context in CI @Shaoting-Feng (#451)
- chore: add unittest coverage @max-wittig (#449)
- Feat/basic pipeline parallelism @insukim1994 (#422)
- feat: add endpoint health checks to static router @max-wittig (#428)
- [Feat][lora] add lora operator and modify vllm router to support @zerofishnoodles (#446)
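
For readers unfamiliar with the idea behind the prefix-aware routing entry (#432), the following is a minimal, illustrative sketch of routing over a hash trie. It is not the code from that PR: the names `TrieNode` and `PrefixRouter`, the character-based `chunk_size`, and the round-robin fallback are all assumptions chosen for brevity. Each prompt is split into fixed-size chunks, each chunk is hashed, and the trie records which endpoints have already served a given prefix, so that later requests sharing that prefix can be sent to the same endpoint.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Hypothetical node type: child nodes keyed by chunk hash, plus the
    # endpoints that have already served a prompt passing through this node.
    children: dict = field(default_factory=dict)
    endpoints: set = field(default_factory=set)


class PrefixRouter:
    """Illustrative router; not the production-stack implementation."""

    def __init__(self, endpoints, chunk_size=8):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self.endpoints = list(endpoints)
        self.chunk_size = chunk_size  # assumed: chunking by characters
        self.root = TrieNode()
        self._next = 0  # round-robin cursor used when no prefix matches

    def _chunk_hashes(self, prompt):
        # Hash each fixed-size chunk; the path through the trie encodes order.
        for i in range(0, len(prompt), self.chunk_size):
            chunk = prompt[i:i + self.chunk_size]
            yield hashlib.sha256(chunk.encode("utf-8")).hexdigest()

    def match(self, prompt):
        """Return (matched_chunks, endpoints) for the longest known prefix."""
        node, depth, best = self.root, 0, set()
        for h in self._chunk_hashes(prompt):
            node = node.children.get(h)
            if node is None or not node.endpoints:
                break
            depth += 1
            best = node.endpoints
        return depth, set(best)

    def insert(self, prompt, endpoint):
        # Record that `endpoint` has now seen every prefix of this prompt.
        node = self.root
        for h in self._chunk_hashes(prompt):
            node = node.children.setdefault(h, TrieNode())
            node.endpoints.add(endpoint)

    def route(self, prompt):
        _, candidates = self.match(prompt)
        if candidates:
            # Deterministic pick among endpoints sharing the longest prefix.
            endpoint = sorted(candidates)[0]
        else:
            # No shared prefix: fall back to round-robin (an assumption).
            endpoint = self.endpoints[self._next % len(self.endpoints)]
            self._next += 1
        self.insert(prompt, endpoint)
        return endpoint
```

With two endpoints `"a"` and `"b"` and `chunk_size=4`, a second request that starts with the same text as an earlier one is routed to the endpoint that served the earlier one, while an unrelated prompt falls through to the round-robin choice.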