Conversation

@njhill njhill commented Nov 1, 2025

#26467 fixed compatibility of the penalties sampling parameters with async scheduling, but it had a flaw: it breaks when the batch contains a mix of requests with and without penalties, specifically when a request with a penalties parameter starts while a batch without penalties is already running.

This is a fix for that case.

@njhill njhill added this to the v0.11.1 milestone Nov 1, 2025
@njhill njhill added the bug Something isn't working label Nov 1, 2025
@mergify mergify bot added the v1 label Nov 1, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a bug in async scheduling where a batch containing a mix of requests with and without penalties could fail. The fix replaces placeholder -1 token IDs with a valid token ID so that downstream operations do not error on them. The approach is sound. I've suggested a minor improvement to make the fix more robust: use vocab_size as the replacement value, which is already used as a padding/ignore value, instead of 0.
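To illustrate why remapping the -1 placeholders to vocab_size works, here is a minimal toy sketch (not vLLM's actual code; the names `VOCAB_SIZE`, `PLACEHOLDER`, and `count_output_tokens` are illustrative). Penalty computation typically needs per-token occurrence counts over each request's output IDs; under async scheduling, slots for tokens that have not yet arrived hold -1, which would corrupt or crash a bincount. Remapping them into one extra bin that is then sliced off keeps the real counts clean:

```python
import numpy as np

VOCAB_SIZE = 8    # toy vocabulary; real models use tens of thousands
PLACEHOLDER = -1  # async scheduling fills not-yet-available token slots with -1

def count_output_tokens(output_ids: np.ndarray) -> np.ndarray:
    """Count occurrences of each token ID, ignoring -1 placeholders.

    Placeholders are remapped to VOCAB_SIZE, counted into one extra
    bin, and that bin is sliced off, so they never contribute to the
    per-token counts that repetition/frequency penalties consume.
    """
    ids = np.where(output_ids == PLACEHOLDER, VOCAB_SIZE, output_ids)
    counts = np.bincount(ids.ravel(), minlength=VOCAB_SIZE + 1)
    return counts[:VOCAB_SIZE]

# Mixed batch: one request with real output tokens, one whose slots
# are still all placeholders (the failing case this PR fixes).
batch = np.array([[3, 3, 5, PLACEHOLDER],
                  [PLACEHOLDER] * 4])
counts = count_output_tokens(batch)
# token 3 counted twice, token 5 once; placeholders contribute nothing
```

Replacing placeholders with 0 would instead count them as occurrences of token 0, wrongly penalizing it, which is why the review suggests vocab_size as the replacement value.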

@mgoin mgoin added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 1, 2025
@njhill njhill merged commit c2ed069 into vllm-project:main Nov 1, 2025
46 checks passed
@njhill njhill deleted the fix-async-penalties branch November 1, 2025 17:51
zhaozuy pushed a commit to zhaozuy/vllm that referenced this pull request Nov 4, 2025