Commit 5e1797e

fix attention for glm (#106)

1 parent 818c742

File tree: 1 file changed, +1 -0 lines changed

dlinfer/vendor/ascend/torch_npu_ops.py

Lines changed: 1 addition & 0 deletions
@@ -176,6 +176,7 @@ def paged_decode_attention(
 
     bs, _, dim = query.shape
     query = query.contiguous()
+    attn_output = attn_output.contiguous()
     query = query.view(bs, 1, num_q_heads * dim)
     scale_value = 1.0 / math.sqrt(dim)
 

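Note on the fix: paged_decode_attention writes its result into the preallocated attn_output tensor, and fused kernels that fill a buffer in place generally require it to be contiguous in memory. Below is a minimal sketch in plain PyTorch (not the torch_npu API; the shapes and the transpose origin of the buffer are illustrative assumptions) of what the added .contiguous() call guards against:

import torch

# Names mirror those in paged_decode_attention; the values are illustrative,
# not taken from the commit.
bs, num_q_heads, dim = 2, 8, 64

# A buffer produced by a transpose is a non-contiguous view of its storage.
attn_output = torch.empty(bs, dim, num_q_heads).transpose(1, 2)
print(attn_output.is_contiguous())  # False

# .contiguous() returns a densely laid-out (row-major) copy, which a kernel
# writing results in place can safely fill.
attn_output = attn_output.contiguous()
print(attn_output.is_contiguous())  # True
print(attn_output.shape)            # torch.Size([2, 8, 64])

The change mirrors the existing handling of query, which is already made contiguous one line earlier in the same function.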