[CustomOp] Implement ApplyRotaryEmb CustomOp and register it #4667

shen-shanshan · 2025-12-03T08:37:27Z

What this PR does / why we need it?

Following vllm-project/vllm#29873, implement ApplyRotaryEmb CustomOp and register it.
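As a rough illustration of what "implement and register" means here, the sketch below shows a generic registry pattern in which a platform-specific class replaces a default op registered under the same name. Everything in it (`OP_REGISTRY`, `register_op`, `BaseApplyRotaryEmb`, `resolve_op`) is hypothetical and is not the actual vLLM or vllm-ascend API; `AscendApplyRotaryEmb` is borrowed from the review below only as a class name.

```python
from typing import Callable, Dict, Type

# Hypothetical registry: maps an op name to the class that implements it.
OP_REGISTRY: Dict[str, Type["BaseApplyRotaryEmb"]] = {}


def register_op(name: str) -> Callable[[type], type]:
    """Class decorator that records (or overrides) the implementation for `name`."""
    def decorator(cls: type) -> type:
        OP_REGISTRY[name] = cls
        return cls
    return decorator


@register_op("apply_rotary_emb")
class BaseApplyRotaryEmb:
    """Stand-in for a default implementation (does no real math)."""

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        return ("native", x)


@register_op("apply_rotary_emb")  # registering later replaces the default
class AscendApplyRotaryEmb(BaseApplyRotaryEmb):
    """Stand-in for a platform-specific override (does no real math)."""

    def forward(self, x):
        return ("ascend", x)


def resolve_op(name: str) -> BaseApplyRotaryEmb:
    """Instantiate whichever class is currently registered under `name`."""
    return OP_REGISTRY[name]()
```

With this pattern, callers ask for the op by name and get the override without knowing which platform they run on.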

Does this PR introduce any user-facing change?

How was this patch tested?

Run:

vllm serve /root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct \
--max_model_len 16384 \
--max-num-batched-tokens 16384 \
--tensor-parallel-size 2 \
--enforce-eager
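The request that produced the output below is not shown in the PR. A minimal sketch of one, assuming the server listens on vLLM's default port 8000 and using a placeholder image URL (`IMAGE_URL` and the helper names are hypothetical), could look like this:

```python
import json
import urllib.request

# Assumptions: default vLLM port, and a placeholder image (the real one is not given).
BASE_URL = "http://localhost:8000"
MODEL = "/root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct"
IMAGE_URL = "https://example.com/logo.png"  # hypothetical placeholder


def build_payload(model: str, image_url: str, question: str) -> dict:
    """Build an OpenAI-style chat completion body with one image and one text part."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": question},
            ],
        }],
    }


def send_chat_request(payload: dict, base_url: str = BASE_URL) -> dict:
    """POST the payload to the chat completions endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `send_chat_request(build_payload(MODEL, IMAGE_URL, "What is the text in the image?"))` against a running server would return a `chat.completion` object of the kind shown under Output; the question text here is also an assumption.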

Output:

{"id":"chatcmpl-9ab4de23690c85aa","object":"chat.completion","created":1764748509,"model":"/root/.cache/modelscope/hub/models/Qwen/Qwen2.5-VL-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The text in the image reads \"TONGYI Qwen.\" The word \"TONGYI\" is written in blue, and \"Qwen\" is written in gray. The font appears to be modern and clean, with \"TONGYI\" being slightly larger than \"Qwen.\" The design includes a geometric, abstract shape on the left side of the logo, which complements the text.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null,"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null,"token_ids":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":78,"total_tokens":162,"completion_tokens":84,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

vLLM version: 86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24
vLLM main: vllm-project/vllm@86e178f

Signed-off-by: shen-shanshan <[email protected]>

github-actions · 2025-12-03T08:37:36Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description, to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request introduces a custom operator AscendApplyRotaryEmb for applying rotary embeddings on Ascend hardware and registers it. The implementation refactors existing logic from patch_qwen2_5_vl.py. However, I've found a critical bug in the new AscendApplyRotaryEmb implementation where incorrect tensor shape manipulation will lead to a runtime error. The logic for preparing cos and sin tensors was copied from an implementation for a different model and is not compatible with the input tensor shapes for Qwen2.5-VL.
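To make the shape concern concrete, here is a minimal pure-Python reference for a rotate-half style rotary embedding. It assumes `x` is laid out as `[seq_len][num_heads][head_dim]` and `cos`/`sin` as `[seq_len][head_dim // 2]`. These layouts and the name `apply_rotary_emb_ref` are assumptions for illustration, not the shapes or code used in this PR; the point is only that `cos` and `sin` must line up with the input's sequence and head dimensions before they are applied.

```python
from typing import List

Vector = List[float]


def apply_rotary_emb_ref(
    x: List[List[Vector]],   # assumed layout: [seq_len][num_heads][head_dim]
    cos: List[Vector],       # assumed layout: [seq_len][head_dim // 2]
    sin: List[Vector],       # assumed layout: [seq_len][head_dim // 2]
) -> List[List[Vector]]:
    """Rotate-half rotary embedding; cos/sin are shared across heads."""
    if not (len(x) == len(cos) == len(sin)):
        raise ValueError("x, cos and sin must agree on seq_len")
    out = []
    for tok, c, s in zip(x, cos, sin):
        rotated_heads = []
        for head in tok:
            half = len(head) // 2
            if len(head) % 2 or len(c) != half or len(s) != half:
                raise ValueError("cos/sin must have head_dim // 2 entries")
            x1, x2 = head[:half], head[half:]
            # Each pair (x1[i], x2[i]) is rotated by the angle encoded in cos/sin.
            o1 = [a * ci - b * si for a, b, ci, si in zip(x1, x2, c, s)]
            o2 = [b * ci + a * si for a, b, ci, si in zip(x1, x2, c, s)]
            rotated_heads.append(o1 + o2)
        out.append(rotated_heads)
    return out
```

A mismatch between the `cos`/`sin` layout and the input layout surfaces here as an explicit `ValueError`; in a tensor implementation the same mistake typically shows up as a broadcasting or reshape error at runtime.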

vllm_ascend/ops/rotary_embedding.py

Sparkheart · 2025-12-03T11:51:07Z

vllm_ascend/patch/worker/patch_qwen2_5_vl.py

        context_layer = einops.rearrange(context_layer,
                                         "(b s) h d -> s b (h d)",
                                         b=batch_size).contiguous()
+        # ----------------------------------------------------------------------


Meaningless comment; recommend deleting it.

shen-shanshan · 2025-12-04

OK. In fact, all of this code will be removed soon. 😀

github-actions · 2025-12-04T14:36:51Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

register apply_rotary_emb custom op

c94c66e

Signed-off-by: shen-shanshan <[email protected]>

github-actions bot added module:ops module:core labels Dec 3, 2025

shen-shanshan mentioned this pull request Dec 3, 2025

[RFC]: Remove VL Modeling Files #4084

Open

15 tasks

gemini-code-assist bot reviewed Dec 3, 2025

shen-shanshan mentioned this pull request Dec 3, 2025

[CustomOp] Extract apply_rotary_emb as CustomOp and unify the dispatch logic vllm-project/vllm#29873

Open

5 tasks

Sparkheart reviewed Dec 3, 2025

github-actions bot added the merge-conflicts label Dec 4, 2025
