-
Notifications
You must be signed in to change notification settings - Fork 624
drop ascend scheduler #4498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
drop ascend scheduler #4498
Conversation
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the ascend_scheduler feature. The changes are consistent across documentation, examples, tests, and source code, reflecting the removal of this functionality. However, I've identified a critical issue in one of the test files where the modification of server arguments has resulted in a malformed command, which will likely cause the test to fail. I've provided a suggestion to fix it.
d896281 to
153f779
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
153f779 to
dc5637d
Compare
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the custom Ascend scheduler and related configurations, falling back to the default vLLM scheduler. This is a significant cleanup, touching documentation, tests, examples, and source code. The changes are mostly deletions and appear correct and consistent with the PR's goal. However, I found an issue in the documentation for TorchAirGraph. The examples are now incorrect because they enable TorchAirGraph without disabling chunked prefill, which is not a supported combination. This will lead to runtime errors for users following the guide. I've provided suggestions to fix the examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the ascend_scheduler and related configurations, aiming to use the default vLLM scheduler. The changes are extensive, touching documentation, examples, tests, and core logic to remove all traces of the ascend_scheduler. The refactoring is clean and aligns with the stated objective.
My main concerns are a potential change in default behavior for chunked prefill, and the removal of tests for SchedulerDynamicBatch which appears to still be in use. I've added critical and high severity comments regarding these issues.
|
CI passed here: https://github.com/vllm-project/vllm-ascend/actions/runs/19735428628/job/56546065316?pr=4498 qwen3-next test failed due to chunked prefill is enabled by default, now qwen3-next with mtp doesn't work. Let's disable it first. |
dc5637d to
9b3d1b5
Compare
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
9b3d1b5 to
38eb2e9
Compare
whx-sjtu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have confirmed that no multi-modal models depend on AscendScheduler anymore. LGTM.
Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: wangxiyuan <[email protected]>
38eb2e9 to
8db5756
Compare
This reverts commit f10acdd.
Reverts #4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <[email protected]>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <[email protected]> Signed-off-by: Che Ruan <[email protected]>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: Che Ruan <[email protected]>
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill. Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler. - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <[email protected]> Signed-off-by: Che Ruan <[email protected]>
Reverts vllm-project#4498 - vLLM version: v0.11.2 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2 Signed-off-by: Che Ruan <[email protected]>
Ascend scheduler was added for non chunk prefill case before, since that the npu ops didn't work well with chunked prefill.
Now the ops with chunked prefill work better, it's time to remove the ascend scheduler to use vLLM default scheduler.