[Fix] Fix FIA query and query_start_loc shape mismatch error
#4518
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request aims to fix a validation error in the FIA operator by adjusting the padding logic for queries in graph mode and multi-DP scenarios. The changes correctly generalize the padding logic and fix a bug in how sequence lengths and token locations were padded. However, I've identified a critical issue where the padding for block_table is inconsistent with the padding for seq_lens, which will likely lead to a shape mismatch and runtime errors. The proposed fix addresses this inconsistency.
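The consistency concern above can be made concrete with a small sketch. This is a hypothetical check, not the actual vLLM Ascend code: the names `block_table_rows`, `seq_lens`, and `query_start_loc_last` are illustrative stand-ins for the metadata the kernel receives. The point is that after padding, every per-request tensor must agree on the number of requests, and the query length must match the final cumulative sequence length.

```python
def check_padded_shapes(block_table_rows, seq_lens, query_len,
                        query_start_loc_last):
    """Illustrative invariants the padded attention metadata must satisfy.

    Hypothetical helper for explanation only; the real vLLM Ascend
    metadata classes and field names differ.
    """
    # FIA tiling validation: the padded query length must equal the last
    # entry of the cumulative sequence lengths.
    assert query_len == query_start_loc_last, (
        "FIA requires query.shape[0] == actual_seq_len[-1]")
    # If seq_lens gained an entry for the padded pseudo-request,
    # block_table needs a matching row, or the kernel indexes out of range.
    assert block_table_rows == len(seq_lens), (
        "block_table must be padded consistently with seq_lens")
    return True
```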
Please check whether padding block_table is necessary and whether it causes potential excessive memory consumption. cc @yiz-liu
…m-project#4518)

### What this PR does / why we need it?

Due to the requirement of the FIA operator that **query.shape[0]** must match **actual_seq_len[-1]**, in graph mode and multi-DP scenarios, the query is padded to the size of **num_input_token**. This leads to validation errors during tiling in the operator. However, since the padding is applied at the end of the query, it does not affect the actual execution result of the operator, and the precision remains unaffected.

<img width="2434" height="49" alt="image" src="https://github.com/user-attachments/assets/63520816-fbc3-4382-82b9-89dbb1492f6c" />

Our modification pads both **actual_seq_len** and **actual_seq_len_kv** to resolve the validation issue in the operator.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: wangxiaoxin-sherie <[email protected]>
Co-authored-by: wangxiaoxin-sherie <[email protected]>
Signed-off-by: Che Ruan <[email protected]>
What this PR does / why we need it?
Due to the requirement of the FIA operator that the query.shape[0] must match actual_seq_len[-1], in graph mode and multi-DP scenarios, the query is padded to the size of num_input_token. This leads to validation errors during tiling in the operator. However, since the padding is applied at the end of the query, it does not affect the actual execution result of the operator, and the precision remains unaffected.

Our modification pads both actual_seq_len and actual_seq_len_kv to resolve the validation issue in the operator.
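The fix described above can be sketched in plain Python. This is a simplified illustration, not the actual patch: the function name and the choice of pad value for the KV lengths are assumptions; the real code operates on torch tensors inside the attention metadata builder. The idea is that when the query is padded from `num_actual_tokens` to `num_input_tokens`, the trailing pad tokens are accounted for as one extra pseudo-request so that `actual_seq_len[-1]` again equals `query.shape[0]`.

```python
def pad_actual_seq_len(actual_seq_len, actual_seq_len_kv,
                       num_actual_tokens, num_input_tokens):
    """Extend cumulative sequence lengths to cover query padding.

    Hypothetical sketch: `actual_seq_len` is the cumulative per-request
    query lengths, `actual_seq_len_kv` the cumulative KV lengths. After
    padding, actual_seq_len[-1] must equal the padded query length so
    the FIA operator's tiling validation passes.
    """
    pad = num_input_tokens - num_actual_tokens
    if pad > 0:
        # Treat the trailing pad tokens as one extra pseudo-request whose
        # query span ends at num_input_tokens.
        actual_seq_len = actual_seq_len + [num_input_tokens]
        # Give the pseudo-request zero additional KV tokens by repeating
        # the last cumulative KV length (assumed choice; the padding does
        # not affect real results because it sits at the end).
        actual_seq_len_kv = actual_seq_len_kv + [actual_seq_len_kv[-1]]
    return actual_seq_len, actual_seq_len_kv
```

For example, with 5 real tokens padded to 8, cumulative query lengths `[2, 5]` become `[2, 5, 8]`, restoring the invariant against the padded query.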
Does this PR introduce any user-facing change?
How was this patch tested?