-
-
Notifications
You must be signed in to change notification settings - Fork 12.1k
[Feature] Default EPLB num_redundant_experts to minimum valid value #30614
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Feature] Default EPLB num_redundant_experts to minimum valid value #30614
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a helpful feature by automatically calculating the minimum valid num_redundant_experts for EPLB when it's not specified. The changes to EPLBConfig and ParallelConfig are well-implemented, and the logic for the automatic computation in ModelConfig is sound. However, I've identified a gap in test coverage for the new computation logic, which is critical to ensure its correctness and prevent future regressions. Please see my detailed comment.
| def test_eplb_num_redundant_experts_default(): | ||
| """Test that num_redundant_experts defaults to None and can be set.""" | ||
| from vllm.config.parallel import EPLBConfig, ParallelConfig | ||
|
|
||
| # Test default is None | ||
| eplb_config = EPLBConfig() | ||
| assert eplb_config.num_redundant_experts is None | ||
|
|
||
| # Test explicit value | ||
| eplb_config_explicit = EPLBConfig(num_redundant_experts=4) | ||
| assert eplb_config_explicit.num_redundant_experts == 4 | ||
|
|
||
| # Test validation for negative value | ||
| with pytest.raises(ValueError, match="non-negative"): | ||
| EPLBConfig(num_redundant_experts=-1) | ||
|
|
||
| # Test ParallelConfig validation - EPLB disabled with None is OK | ||
| parallel_config = ParallelConfig( | ||
| enable_eplb=False, | ||
| enable_expert_parallel=False, | ||
| ) | ||
| # Should not raise - None is allowed when EPLB is disabled | ||
| assert parallel_config.eplb_config.num_redundant_experts is None | ||
|
|
||
| # Test ParallelConfig validation - EPLB disabled with non-zero value | ||
| with pytest.raises(ValueError, match="EPLB is not enabled"): | ||
| ParallelConfig( | ||
| enable_eplb=False, | ||
| enable_expert_parallel=False, | ||
| eplb_config=EPLBConfig(num_redundant_experts=4), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test function is a good start and covers the validation logic within EPLBConfig and ParallelConfig. However, it's missing tests for the core feature of this PR: the automatic computation of num_redundant_experts which happens in ModelConfig.verify_with_parallel_config.
To ensure the new logic is robust, please add test cases that cover the following scenarios:
- When
enable_eplbisTrueandnum_redundant_expertsisNone, verify that it's correctly computed asmax(0, ep_size - num_logical_experts). You'll need to mockModelConfig.get_num_experts()and settensor_parallel_sizeanddata_parallel_sizeinParallelConfigto test this. - When
enable_eplbisFalseandnum_redundant_expertsisNone, verify that it defaults to0.
Adding these tests will help prevent regressions and ensure the correctness of this important new feature.
|
Hi @majiayu000, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
efdf56b to
377b395
Compare
|
@WoosukKwon @youkaichao Can you help take a look at this PR? Thanks! |
ilmarkov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for you work! Left some comments
vllm/config/model.py
Outdated
| ) | ||
| # Minimum value ensures at least 1 local physical expert per rank: | ||
| # (num_logical_experts + num_redundant_experts) / ep_size >= 1 | ||
| min_redundant = max(0, ep_size - num_logical_experts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to set the min_redundant to a value closest divisible by ep_size. So that we would automatically could support unusual ep_sizes? Something like (ep_size - num_logical_experts % ep_size) % ep_size
vllm/config/parallel.py
Outdated
| If None (default), the minimum valid value will be computed automatically | ||
| based on the number of logical experts and the expert parallel size.""" | ||
|
|
||
| @model_validator(mode="after") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has to be either in _validate_parallel_config or _validate_eplb_config not in separate method just for one config parameter.
tests/test_config.py
Outdated
| """ | ||
|
|
||
| # Test the formula logic directly | ||
| def compute_min_redundant(num_logical_experts: int, ep_size: int) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we want to test the executed code in the main codebase not the replicated formula.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Updated per your feedback:
- Formula now uses (ep_size - num_experts % ep_size) % ep_size for non-divisible cases
- Moved validation to _validate_parallel_config
- Tests now use parametrize without duplicating formula
|
Hi @majiayu000, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
tests/test_config.py
Outdated
| (7, 4, 1), # Non-divisible: (4 - 7%4) % 4 = (4-3)%4 = 1 | ||
| (1, 4, 3), # Single expert: (4 - 1%4) % 4 = 3 | ||
| ]) | ||
| def test_eplb_num_redundant_experts_auto_computation(num_experts, ep_size, expected): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test you want to check parallel_config.eplb_config.num_redundant_experts is computed correctly at the initialization not the formula that you replicate here.
vllm/config/model.py
Outdated
| if parallel_config.enable_expert_parallel: | ||
| self._verify_with_expert_parallelism() | ||
|
|
||
| # Compute num_redundant_experts if not specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the updating config logic shouldn't be in the function named verify_. I would put it in post init of eplb_config.
ilmarkov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update! Added short comments on the test and logic location. Also, please, fix the pre-commit.
@tlrmchlsmth I recall, you mentioned that such automatic choice of num_redundant_experts would be helpful
|
Thanks! Updated per your feedback:
|
When EPLB is enabled but num_redundant_experts is not specified, automatically compute and use the minimum valid value based on: - Number of logical experts in the model - Expert parallel size (TP * DP) The minimum valid value ensures at least 1 local physical expert per rank: min_redundant = max(0, ep_size - num_logical_experts) This reduces friction when enabling EPLB for the first time and allows the same configuration to work across multiple EP sizes. Changes: - EPLBConfig.num_redundant_experts now defaults to None instead of 0 - ModelConfig.verify_with_parallel_config() computes the minimum value when num_redundant_experts is None and EPLB is enabled - Added validation that num_redundant_experts must be non-negative Fixes vllm-project#30075 Signed-off-by: majiayu000 <[email protected]> Signed-off-by: lif <[email protected]>
- Use formula (ep_size - num_experts % ep_size) % ep_size to support non-standard ep_size values where num_experts is not divisible by ep_size - Move validation from EPLBConfig._validate_num_redundant_experts to ParallelConfig._validate_parallel_config (consolidate validation logic) - Update tests to use @pytest.mark.parametrize and test actual formula instead of duplicating the computation logic 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> Signed-off-by: lif <[email protected]>
Address reviewer feedback: - Move num_redundant_experts computation logic from verify_with_parallel_config to a dedicated method compute_eplb_num_redundant_experts in ParallelConfig - Call the computation method in VllmConfig.__post_init__ before verification - Update tests to verify actual computed values instead of replicating formula - Add tests for EPLB disabled case and explicit value preservation - Fix nested if statement lint warning (SIM102) Signed-off-by: lif <[email protected]> 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]> Signed-off-by: lif <[email protected]>
be9971a to
0d58dcf
Compare
Summary
EPLB requires the number of redundant experts to be chosen up front. This PR changes
num_redundant_expertsto default toNone, which triggers automatic computation of the minimum valid value based on the model's configuration.Changes
EPLBConfig.num_redundant_expertsnow defaults toNoneinstead of0num_redundant_expertsisNone, the minimum valid value is computed as:num_redundant_expertsmust be non-negative when explicitly setBenefits
Test Plan
test_eplb_num_redundant_experts_defaultFixes #30075