[Feat] Add per-model runtimeClassName configuration support #755

HanFa · 2025-11-12T06:58:08Z

Enable model-specific runtimeClassName overrides with global fallback. This allows different models to use different deployment runtime classes while maintaining a default global setting.
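
As a rough illustration of the intended configuration, the values below sketch one global default with a single per-model override. The global key `servingEngineSpec.runtimeClassName` is named in this thread; placing the per-model `runtimeClassName` on each `modelSpec` entry is an assumption, and the model names and runtime class values (`nvidia`, `kata-gpu`) are hypothetical.

```yaml
# Hypothetical values.yaml fragment; model names and runtime classes are
# placeholders, and other required modelSpec fields are omitted for brevity.
servingEngineSpec:
  # Global default, used by any model that does not set its own value.
  runtimeClassName: "nvidia"
  modelSpec:
    - name: "model-a"
      # No runtimeClassName here, so this model is expected to fall back
      # to the global "nvidia" setting.
    - name: "model-b"
      # Per-model override; takes precedence over the global default.
      runtimeClassName: "kata-gpu"
```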

FIX #745

BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE

Make sure the code changes pass the pre-commit checks.
Sign off your commit by using -s with git commit.
Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].

Detailed Checklist

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

[Bugfix] for bug fixes.
[CI/Build] for build or continuous integration improvements.
[Doc] for documentation fixes and improvements.
[Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
[Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
[Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
The code needs to be well-documented so that future contributors can easily understand it.
Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng or ApostaC.

gemini-code-assist · 2025-11-12T06:58:25Z

Summary of Changes

Hello @HanFa, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the flexibility of model deployments by enabling per-model runtimeClassName configuration within the Helm charts. It allows users to define a default runtimeClassName that applies globally, while also providing the option to override this setting for specific models. This capability is crucial for scenarios requiring different execution environments or specialized runtimes for various models, improving resource management and isolation within the Kubernetes cluster.

Highlights

Per-Model RuntimeClass Configuration: Introduced the ability to specify a Kubernetes runtimeClassName for individual models, allowing for fine-grained control over their execution environments.
Global RuntimeClass Fallback: Implemented a fallback mechanism where models without a specific runtimeClassName will inherit the value from a global servingEngineSpec.runtimeClassName setting.
Helm Chart Updates: Modified the deployment-vllm-multi.yaml and ray-cluster.yaml templates to incorporate the new runtimeClassName logic for both standalone deployments and Ray cluster nodes.
New Unit Tests: Added a new Helm unit test file (runtimeClassName_test.yaml) to thoroughly validate the correct application of both global and model-specific runtimeClassName settings.
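
The two behaviours listed above (global fallback and per-model override) could be checked with a helm-unittest suite along the following lines. This is a sketch, not the contents of the PR's `runtimeClassName_test.yaml`: the model name and runtime class values are hypothetical, the per-model key is assumed to live on each `modelSpec` entry, and a real run would likely need the chart's other required `modelSpec` fields and possibly a `documentIndex` if the template renders more than one document.

```yaml
# Illustrative helm-unittest suite; values are placeholders.
suite: runtimeClassName resolution (illustrative)
templates:
  - deployment-vllm-multi.yaml
tests:
  - it: falls back to the global runtimeClassName
    set:
      servingEngineSpec:
        runtimeClassName: nvidia
        modelSpec:
          - name: model-a   # hypothetical; other required fields omitted
    asserts:
      - equal:
          path: spec.template.spec.runtimeClassName
          value: nvidia
  - it: prefers the model-specific runtimeClassName
    set:
      servingEngineSpec:
        runtimeClassName: nvidia
        modelSpec:
          - name: model-a
            runtimeClassName: kata-gpu
    asserts:
      - equal:
          path: spec.template.spec.runtimeClassName
          value: kata-gpu
```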

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/functionality-helm-chart.yml

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a valuable feature by allowing per-model runtimeClassName overrides with a global fallback. The implementation in the Helm templates is clean and effectively uses the default function. The addition of unit tests is also a great step. My review includes a couple of suggestions to enhance the clarity of the example values and to improve test coverage by adding checks for more edge cases. These changes will make the new feature more robust and easier for users to understand.
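
For readers unfamiliar with the `default` function mentioned in this review, a fallback of this kind is typically written as in the fragment below. This is a sketch under assumptions, not the PR's actual template code: the loop variable `$modelSpec` and the per-model `runtimeClassName` key are hypothetical names.

```yaml
{{- /* Assumed to sit inside a loop such as:
       range $modelSpec := .Values.servingEngineSpec.modelSpec */ -}}
{{- /* default returns the second argument unless it is empty, in which
       case it returns the first (here, the global setting). */ -}}
{{- $runtimeClassName := default $.Values.servingEngineSpec.runtimeClassName $modelSpec.runtimeClassName }}
{{- if $runtimeClassName }}
runtimeClassName: {{ $runtimeClassName | quote }}
{{- end }}
```

When neither the model nor the global setting provides a value, the `if` guard leaves the field out of the rendered pod spec entirely.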

helm/tests/runtimeClassName_test.yaml

helm/values-example.yaml

HanFa · 2025-11-12T07:06:35Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a valuable feature for per-model runtimeClassName configuration, allowing for more flexible deployments. The implementation is on the right track, but I've identified a subtle issue in the fallback logic. The use of the default function doesn't correctly handle cases where a model-specific runtimeClassName is explicitly set to an empty string to override a global setting. I've provided suggestions to use a more robust ternary and hasKey combination to fix this. Additionally, I've recommended adding a new test case to ensure this specific override scenario is covered. The documentation and example value changes are clear and well-executed.
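
The `ternary` and `hasKey` combination suggested in this review might look like the fragment below. It is a sketch only, not necessarily the code that was merged: `$modelSpec`, `$global`, and the per-model `runtimeClassName` key are assumed names.

```yaml
{{- /* Sketch only. $modelSpec is assumed to be the current entry of
       .Values.servingEngineSpec.modelSpec inside a range loop. */ -}}
{{- $global := $.Values.servingEngineSpec.runtimeClassName }}
{{- /* hasKey distinguishes "key absent" from "key present but empty". */ -}}
{{- $runtimeClassName := ternary $modelSpec.runtimeClassName $global (hasKey $modelSpec "runtimeClassName") }}
{{- if $runtimeClassName }}
runtimeClassName: {{ $runtimeClassName | quote }}
{{- end }}
```

With `runtimeClassName: ""` set on a model, this version keeps the empty value and the `if` guard omits the field, whereas `default` would substitute the global value because it treats an empty string as unset.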

helm/templates/deployment-vllm-multi.yaml

helm/templates/ray-cluster.yaml

helm/tests/runtimeClassName_test.yaml

Signed-off-by: Fang Han <[email protected]>

zerofishnoodles

LGTM

HanFa force-pushed the main branch from 027d5b0 to 30c1b8c Compare November 12, 2025 06:59

gemini-code-assist bot reviewed Nov 12, 2025

helm/tests/runtimeClassName_test.yaml (resolved)

helm/values-example.yaml (outdated, resolved)

HanFa force-pushed the main branch from 30c1b8c to 8938fbd Compare November 12, 2025 07:03

gemini-code-assist bot reviewed Nov 12, 2025

helm/templates/deployment-vllm-multi.yaml (outdated, resolved)

helm/templates/ray-cluster.yaml (outdated, resolved)

helm/templates/ray-cluster.yaml (outdated, resolved)

helm/tests/runtimeClassName_test.yaml (resolved)

HanFa force-pushed the main branch 2 times, most recently from 76ee68c to 2ded1d9 Compare November 12, 2025 21:17

HanFa mentioned this pull request Nov 12, 2025

feature: multiple runtime classes #745

Closed

HanFa force-pushed the main branch from 2ded1d9 to 29706d6 Compare November 12, 2025 22:03

[Feat] Add per-model runtimeClassName configuration support

5cf7974

Signed-off-by: Fang Han <[email protected]>

HanFa force-pushed the main branch from 29706d6 to 5cf7974 Compare November 12, 2025 23:44

zerofishnoodles approved these changes Nov 19, 2025

zerofishnoodles added 3 commits November 18, 2025 16:59

Merge branch 'main' into main

43d95c1

Merge branch 'main' into main

9ef4fc3

Merge branch 'main' into main

fc5afcc

zerofishnoodles merged commit 62cf67b into vllm-project:main Nov 19, 2025
10 checks passed
