Conversation

Member

@njhill njhill commented Oct 30, 2025

It is tricky to test this from the outside in a deterministic way that doesn't rely on timing.

Example occurrence: https://buildkite.com/vllm/ci/builds/36825#019a3209-d6cb-4ae2-94a2-4aab39752d23

Signed-off-by: Nick Hill <[email protected]>
@mergify mergify bot added the v1 label Oct 30, 2025
@njhill njhill added the ready label Oct 30, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to fix a flaky test by increasing a sleep duration. While this may reduce the frequency of the flake, it's not a deterministic solution and can slow down the test suite. My review provides a suggestion for a more robust, event-based synchronization mechanism to replace the fixed sleep, which would eliminate the race condition and improve test reliability. This approach could also be applied to other tests in the file that rely on asyncio.sleep for synchronization.


  # Let requests start
- await asyncio.sleep(0.5)
+ await asyncio.sleep(1.0)
Contributor

Severity: high

While increasing the sleep duration might mitigate the flakiness, it's not a deterministic fix and can make the test suite slower. A more robust approach would be to wait for a signal that the requests to be aborted have actually started processing in the engine. This would avoid race conditions and make the test more reliable.

One way to achieve this is to modify the generate test helper to signal an asyncio.Event after it receives the first output chunk for a request. The test can then wait on these events before calling abort.

Here's an example of how you could modify the generate function and this test:

1. Modify generate to accept and signal an event:

async def generate(
    # ... other args
    started_event: asyncio.Event | None = None,
) -> tuple[int, str]:
    # ...
    first_chunk = True
    async for out in engine.generate(...):
        if first_chunk:
            if started_event:
                started_event.set()
            first_chunk = False
        # ... rest of the function

2. Use the event in test_multi_abort:

        # ...
        tasks: list[asyncio.Task] = []
        started_events: list[asyncio.Event] = []
        abort_request_indices = set(REQUEST_IDS_TO_ABORT)
        for idx, request_id in enumerate(request_ids):
            # ...
            event = None
            if idx in abort_request_indices:
                event = asyncio.Event()
                started_events.append(event)

            tasks.append(
                asyncio.create_task(
                    generate(
                        # ... args
                        started_event=event
                    )
                )
            )

        # Wait for requests to start instead of sleeping
        await asyncio.gather(*(event.wait() for event in started_events))

        # Use multi-abort to abort multiple requests at once
        # ...

This approach ensures that we only attempt to abort requests that are confirmed to be running, making the test more reliable and independent of timing. Since this is a common pattern in the test file, applying this change to other tests that use asyncio.sleep for synchronization (like test_abort_final_output) would also improve their robustness.
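
For reference, here is a minimal, self-contained sketch of the same idea that runs outside the vllm test helpers (the worker, started, and cancelled names are illustrative, not from the repo). The aborting side waits on an asyncio.Event set by the task instead of sleeping for a fixed duration and hoping the task has started:

import asyncio
import contextlib


async def worker(started: asyncio.Event, cancelled: asyncio.Event) -> None:
    # Signal that this "request" is running, then block until aborted.
    started.set()
    try:
        await asyncio.sleep(3600)  # stand-in for streaming generation
    except asyncio.CancelledError:
        cancelled.set()
        raise


async def main() -> None:
    started = asyncio.Event()
    cancelled = asyncio.Event()
    task = asyncio.create_task(worker(started, cancelled))

    # Deterministic: wait for the worker to report that it has started,
    # rather than sleeping for a fixed duration.
    await started.wait()

    task.cancel()  # stand-in for aborting the request
    with contextlib.suppress(asyncio.CancelledError):
        await task
    assert cancelled.is_set()


asyncio.run(main())

Applied to the actual tests, the Event would be set inside the generate helper on the first output chunk, as sketched above, rather than in a standalone worker coroutine.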

@njhill njhill marked this pull request as draft October 31, 2025 04:33
