
Conversation

@pichangping (Contributor) commented Oct 27, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

This pull request has conflicts; please resolve them before we can evaluate the pull request.

@gemini-code-assist (Contributor, bot) left a comment

Code Review

This pull request refactors the attention mechanism to use the TND tensor format directly, removing the need for packing and unpacking between TND and BSND formats. It also replaces a manual attention update implementation with a more optimized npu_attention_update kernel.
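
For context, here is a minimal sketch (illustrative only; the tensor names and shapes are assumptions, not code from this PR) of the difference between the padded BSND layout and the packed TND layout this refactor adopts:

    # Hypothetical illustration of BSND vs. TND attention input layouts.
    import torch

    num_heads, head_dim = 8, 128
    seq_lens = [3, 5, 2]                    # tokens per request
    batch, max_seq = len(seq_lens), max(seq_lens)

    # BSND: padded per-request layout (batch, seq, num_heads, head_dim).
    bsnd = torch.zeros(batch, max_seq, num_heads, head_dim)

    # TND: packed layout (total_tokens, num_heads, head_dim). No padding,
    # so the pack/unpack step around the attention kernel goes away.
    tnd = torch.cat([bsnd[i, :n] for i, n in enumerate(seq_lens)], dim=0)
    assert tnd.shape == (sum(seq_lens), num_heads, head_dim)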

I've found a critical bug in the graph capture path for decoding, where several typos would lead to an AttributeError. Additionally, I've identified a potential issue with a hardcoded attention mask size that could cause problems for models with long context lengths.

Comment on lines +922 to +923
    self.key_cache.shape[1], attn_metadata.decode.
        num_computed_tokens_of_cp_dcp[:, self.cp_rank, self.dcp_rank],

critical

There are several typos on these lines within the graph capture logic that will cause an AttributeError at runtime:

  1. attn_metadata.decode should be attn_metadata.decode_meta.
  2. num_computed_tokens_of_cp_dcp should be num_computed_tokens_of_pcp_dcp.
  3. self.cp_rank should be self.pcp_rank.

These seem to be typos introduced during refactoring, as the previous version of the code used the correct names.

Suggested change

    - self.key_cache.shape[1], attn_metadata.decode.
    -     num_computed_tokens_of_cp_dcp[:, self.cp_rank, self.dcp_rank],
    + self.key_cache.shape[1], attn_metadata.decode_meta.
    +     num_computed_tokens_of_pcp_dcp[:, self.pcp_rank, self.dcp_rank],

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by fulfilling the PR description, to help reviewers and future developers understand.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
