Collaborator

@linfeng-yuan linfeng-yuan commented Dec 10, 2025

What this PR does / why we need it?

Currently, the initialization and fundamental functions of RecomputeScheduler are broken with vLLM v0.12.0. This PR fixes the conflicts in RecomputeScheduler and refactors its implementation to inherit from vLLM's original Scheduler. This PR also supports async scheduling with the recompute scheduler by implementing AsyncRecomputeScheduler, which simply inherits from vLLM's AsyncScheduler and vLLM-Ascend's RecomputeScheduler and relies on Python's MRO.
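To illustrate the MRO-based composition described above, here is a minimal sketch using stand-in classes. None of these bodies come from vLLM or vLLM-Ascend; the class names only mirror the ones mentioned in this PR, the `schedule` method is a simplified placeholder, and the base-class order (AsyncScheduler first, then RecomputeScheduler) is an assumption.

```python
# Stand-in classes only: these are NOT the real vLLM / vLLM-Ascend schedulers.
class Scheduler:
    """Hypothetical simplified base scheduler."""

    def schedule(self):
        return ["base"]


class AsyncScheduler(Scheduler):
    """Hypothetical async variant that extends the base behaviour."""

    def schedule(self):
        return ["async"] + super().schedule()


class RecomputeScheduler(Scheduler):
    """Hypothetical recompute variant that extends the base behaviour."""

    def schedule(self):
        return ["recompute"] + super().schedule()


class AsyncRecomputeScheduler(AsyncScheduler, RecomputeScheduler):
    """Combines both variants; no body is needed because cooperative
    super() calls walk the MRO through every parent exactly once."""


# C3 linearization places both parents ahead of the shared base class.
mro_names = [cls.__name__ for cls in AsyncRecomputeScheduler.__mro__]
call_order = AsyncRecomputeScheduler().schedule()

print(mro_names)
print(call_order)
```

The combined class works without overriding anything only if each parent calls `super()` cooperatively; a parent that calls `Scheduler.schedule(self)` directly would skip the sibling in the chain.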

Does this PR introduce any user-facing change?

No. The switch name is the same as in v0.11.0: recompute_scheduler_enable

How was this patch tested?

E2E serving with 2P1D dsv3.1 passed. Performance matched the original vLLM scheduler with async_scheduling, and preempted requests on D nodes were successfully transferred to the Proxy and then on to a P node. This significantly improves the performance and robustness of PD disaggregation deployments.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces significant refactoring to the scheduler components, aiming to fix recompute issues, support async scheduling, and align with vLLM 0.12.0. A new AsyncRecomputeScheduler is added, and the core RecomputeScheduler is refactored to inherit from the base vLLM Scheduler class. However, there are critical issues related to class definitions and naming consistency that need to be addressed to ensure the code is functional.

scheduler_cls: Union[str, Type[object]] = (
"vllm_ascend.core.recompute_scheduler.RecomputeScheduler")

@classmethod
Contributor

critical

The @dataclass decorator and the class definition for RecomputeSchedulerConfig appear to be missing. This will lead to a syntax error or NameError when the initialize_from_config method is called, as it's currently defined as a standalone method without a class context. The class definition, likely inheriting from SchedulerConfig, needs to be restored.

@dataclass
class RecomputeSchedulerConfig(SchedulerConfig):
    pass

    @classmethod

@linfeng-yuan linfeng-yuan force-pushed the fix_and_refactor_recompute_scheduler branch 3 times, most recently from 08c4581 to b968591 on December 10, 2025 15:28
@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
  • Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

…upport async scheduling & refactor recompute_scheduler.py

Signed-off-by: linfeng-yuan <[email protected]>
@linfeng-yuan linfeng-yuan force-pushed the fix_and_refactor_recompute_scheduler branch from b968591 to fce5b5d on December 10, 2025 18:31
Signed-off-by: linfeng-yuan <[email protected]>
@linfeng-yuan
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an AsyncRecomputeScheduler by inheriting from AsyncScheduler and RecomputeScheduler, and refactors the RecomputeScheduler to align with upstream vLLM changes. The RecomputeScheduler now inherits from vllm.v1.core.sched.scheduler.Scheduler, significantly reducing its __init__ method and moving much of its core logic to the base class. Key changes include updating RecomputeSchedulerConfig to accept VllmConfig and dynamically select between synchronous and asynchronous recompute schedulers, optimizing scheduling for async operations by avoiding extra steps, refining preemption logic to restore token budgets and related data, and adjusting speculative decoding token calculations. Additionally, the handling of encoder inputs and KV connector metadata is updated, and KV cache event collection and publishing are moved to the update_from_output method. The RecomputeSchedulerOutput dataclass now inherits from SchedulerOutput, simplifying its structure.
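The dynamic selection between the synchronous and asynchronous recompute schedulers mentioned above could look roughly like the sketch below. It is not the PR's actual code: `ToySchedulerConfig` and `select_recompute_scheduler` are hypothetical names, and the dotted path for the async class is an assumption modelled on the RecomputeScheduler path quoted earlier in this review.

```python
from dataclasses import dataclass

# Path quoted in the reviewed diff.
SYNC_PATH = "vllm_ascend.core.recompute_scheduler.RecomputeScheduler"
# Assumed path: the PR text does not state where AsyncRecomputeScheduler lives.
ASYNC_PATH = "vllm_ascend.core.recompute_scheduler.AsyncRecomputeScheduler"


def select_recompute_scheduler(async_scheduling: bool) -> str:
    """Return the dotted path of the scheduler class to load (hypothetical helper)."""
    return ASYNC_PATH if async_scheduling else SYNC_PATH


@dataclass
class ToySchedulerConfig:
    """Hypothetical stand-in for a scheduler config; not vLLM's SchedulerConfig."""

    async_scheduling: bool = False
    scheduler_cls: str = SYNC_PATH

    @classmethod
    def initialize_from_config(cls, async_scheduling: bool) -> "ToySchedulerConfig":
        # Pick the scheduler class once, at config construction time.
        return cls(
            async_scheduling=async_scheduling,
            scheduler_cls=select_recompute_scheduler(async_scheduling),
        )


sync_cfg = ToySchedulerConfig.initialize_from_config(async_scheduling=False)
async_cfg = ToySchedulerConfig.initialize_from_config(async_scheduling=True)
print(sync_cfg.scheduler_cls)
print(async_cfg.scheduler_cls)
```

Resolving the class path in the config keeps the user-facing switch unchanged while letting the async flag alone decide which scheduler is instantiated.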

Signed-off-by: linfeng-yuan <[email protected]>
@linfeng-yuan linfeng-yuan force-pushed the fix_and_refactor_recompute_scheduler branch from ad094a0 to a78a835 on December 11, 2025 09:07
Signed-off-by: linfeng-yuan <[email protected]>
@linfeng-yuan linfeng-yuan force-pushed the fix_and_refactor_recompute_scheduler branch from adcc6cf to 90b77d3 on December 11, 2025 11:30
@wangxiyuan wangxiyuan merged commit 0fbe083 into vllm-project:main Dec 11, 2025
25 checks passed