Conversation

Contributor

@ck-hw-1018 commented Oct 25, 2025

What this PR does / why we need it?

This PR adds a QwQ case to the nightly test suite for Qwen/QwQ-32B on A3; we need to run it daily.

Does this PR introduce any user-facing change?

no

How was this patch tested?

By running the added end-to-end test.

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist bot left a comment

Code Review

This PR adds a new end-to-end test for the Qwen/QwQ-32B model. The overall structure is good, but I've found an area for improvement in how the server arguments are constructed. The current implementation is brittle and hard to maintain. I've provided a suggestion to make it more robust and readable.

Comment on lines +84 to +101
server_args = [
    "--tensor-parallel-size",
    str(tp_size), "--port",
    str(port), "--max-model-len", "36864", "--max-num-batched-tokens",
    "36864", "--block-size", "128", "--trust-remote-code",
    "--gpu-memory-utilization", "0.9", "--compilation_config",
    '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes": [1, 8, 24, 48, 60]}',
    "--reasoning-parser", "deepseek_r1", "--distributed_executor_backend",
    "mp"
]
if mode == "single":
    server_args.remove("--compilation_config")
    server_args.remove(
        '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes": [1, 8, 24, 48, 60]}'
    )
    server_args.append("--additional-config")
    server_args.append('{"ascend_scheduler_config":{"enabled":true}}')
    server_args.append("--enforce-eager")
Contributor

high

The current method of constructing server_args by defining a default list and then modifying it with list.remove() is brittle and can lead to runtime errors. If the initial list is changed, the remove() calls might fail with a ValueError. Additionally, the long JSON string for compilation_config is duplicated, making the code harder to maintain.
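
To illustrate the failure mode with a minimal, self-contained sketch (the flag values here are hypothetical), list.remove() raises ValueError as soon as the string in the default list drifts:

    # Minimal sketch of the failure mode (hypothetical flag values):
    # list.remove() raises ValueError when the exact string is absent.
    server_args = ["--port", "8000", "--compilation-config"]  # dash, not underscore

    try:
        # Matches character-for-character only; renaming the flag in the
        # default list silently breaks this call at runtime.
        server_args.remove("--compilation_config")
    except ValueError as err:
        print(f"remove() failed: {err}")  # -> list.remove(x): x not in list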

It's better to build the argument list conditionally from common and mode-specific parts. This approach is more robust, readable, and avoids duplicating configuration strings.

    server_args = [
        "--tensor-parallel-size",
        str(tp_size), "--port",
        str(port), "--max-model-len", "36864", "--max-num-batched-tokens",
        "36864", "--block-size", "128", "--trust-remote-code",
        "--gpu-memory-utilization", "0.9",
    ]
    if mode == "single":
        server_args.extend([
            "--additional-config",
            '{"ascend_scheduler_config":{"enabled":true}}',
            "--enforce-eager",
        ])
    else:  # aclgraph
        server_args.extend([
            "--compilation_config",
            '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes": [1, 8, 24, 48, 60]}',
        ])
    server_args.extend([
        "--reasoning-parser", "deepseek_r1", "--distributed_executor_backend",
        "mp"
    ])
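
An optional further refinement, not part of the suggestion above (names here are illustrative): generating the JSON-valued flags with json.dumps removes the hand-quoted literals entirely, so quoting mistakes surface as Python errors instead of at server start:

    import json

    # Sketch only: build JSON flag values programmatically rather than as
    # hand-written string literals.
    compilation_config = json.dumps({
        "cudagraph_mode": "FULL_DECODE_ONLY",
        "cudagraph_capture_sizes": [1, 8, 24, 48, 60],
    })
    additional_config = json.dumps({"ascend_scheduler_config": {"enabled": True}})
    # json.dumps renders Python True as JSON true, matching the strings above.

    server_args = []  # stand-in for the common-args list built above
    server_args.extend(["--compilation_config", compilation_config])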

@ck-hw-1018
Contributor Author

@jiangyunfan1
Contributor

LGTM

@wangxiyuan merged commit 7572939 into vllm-project:main Oct 25, 2025
6 checks passed
luolun pushed a commit to luolun/vllm-ascend that referenced this pull request Nov 19, 2025

- vLLM version: v0.11.0rc3
- vLLM main: vllm-project/vllm@c9461e0

Signed-off-by: ckhw <[email protected]>
Signed-off-by: luolun <[email protected]>

hwhaokun pushed a commit to hwhaokun/vllm-ascend that referenced this pull request Nov 19, 2025

NSDie pushed a commit to NSDie/vllm-ascend that referenced this pull request Nov 24, 2025

Clorist33 pushed a commit to Clorist33/vllm-ascend that referenced this pull request Dec 10, 2025