
Commit 8169b7c

fix dsv32 bmm_ops

Signed-off-by: hust17yixuan <[email protected]>
1 parent: f0876b5

File tree: 1 file changed (+2 additions, -1 deletion)


vllm_ascend/attention/sfa_v1.py (2 additions, 1 deletion)

@@ -485,7 +485,8 @@ def get_and_maybe_dequant_weights(layer: LinearBase):

     def _v_up_proj(self, x):
         if x.dtype in [torch.float16, torch.bfloat16] \
-                and hasattr(torch.ops._C_ascend, "batch_matmul_transpose"):
+                and hasattr(torch.ops._C_ascend, "batch_matmul_transpose") \
+                and not self.enable_sfa_cp:
             x = x.view(-1, self.num_heads, self.kv_lora_rank)
             b, _, _ = x.shape
             res = torch.empty((b, self.num_heads, self.v_head_dim),
