Commit af59fa2

weijinqian_v1 committed
[Refactor] add fia_v3 attention & remove other attention operator.
Signed-off-by: weijinqian_v1 <[email protected]>
1 parent 8bd9477 commit af59fa2

1 file changed, +2 -1 lines changed

vllm_ascend/attention/attention_v1.py

@@ -560,8 +560,9 @@ def forward(
         value = value.contiguous()
 
         if self.attn_type == AttentionType.ENCODER_ONLY:
-            output = self._forward_encode(query, key, value, attn_metadata,
+            attn_output = self._forward_encode(query, key, value, attn_metadata,
                                                output)
+            output[:num_tokens] = attn_output[:num_tokens]
             return output
 
         if len(kv_cache) > 1:
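
The hunk above stops rebinding `output` to the return value of `_forward_encode` and instead copies the first `num_tokens` entries into the existing `output` buffer. The sketch below illustrates the difference between the two patterns. It rests on assumptions the commit does not state: that the caller preallocates `output` (possibly padded beyond `num_tokens`) and reads from that same buffer afterwards. Plain Python lists stand in for tensors, and the helper names `forward_rebind` and `forward_copy` are hypothetical.

```python
def forward_rebind(output, attn_output, num_tokens):
    """Pre-change pattern (hypothetical helper): rebind the local name.

    The caller's buffer is never written; a different object is returned.
    """
    output = attn_output
    return output


def forward_copy(output, attn_output, num_tokens):
    """Post-change pattern (hypothetical helper): copy into the caller's buffer.

    Only the first num_tokens entries are overwritten; any padding past
    num_tokens is left untouched, and the same object is returned.
    """
    output[:num_tokens] = attn_output[:num_tokens]
    return output


if __name__ == "__main__":
    num_tokens = 4
    attn_result = [1.0, 2.0, 3.0, 4.0]  # stand-in for the attention result

    buf_a = [0.0] * 6  # stand-in for a preallocated, padded output buffer
    out_a = forward_rebind(buf_a, attn_result, num_tokens)
    print(out_a is buf_a, buf_a)

    buf_b = [0.0] * 6
    out_b = forward_copy(buf_b, attn_result, num_tokens)
    print(out_b is buf_b, buf_b)
```

With the copy pattern, the returned object is the caller's own buffer and keeps its original length, which is what matters if downstream code holds a reference to `output` rather than to the return value.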
