Commit 8637f6e

Call AttentionMaskConverter._unmask_unattended for xpu device before sdpa (#42230)

Signed-off-by: Liu, Kaixuan <[email protected]>

1 parent 0e74a71 commit 8637f6e

File tree

1 file changed: +1 -1 lines changed

src/transformers/modeling_attn_mask_utils.py

Lines changed: 1 addition & 1 deletion
@@ -408,7 +408,7 @@ def _prepare_4d_causal_attention_mask_for_sdpa(
         # Attend to all tokens in masked rows from the causal_mask, for example the relevant first rows when
         # using left padding. This is required by F.scaled_dot_product_attention memory-efficient attention path.
         # Details: https://github.com/pytorch/pytorch/issues/110213
-        if not is_tracing_ and expanded_4d_mask.device.type == "cuda":
+        if not is_tracing_ and expanded_4d_mask.device.type in ["cuda", "xpu"]:
             expanded_4d_mask = AttentionMaskConverter._unmask_unattended(
                 expanded_4d_mask, min_dtype=torch.finfo(inputs_embeds.dtype).min
             )
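For context, the behavior this patch extends to xpu devices can be sketched as follows. A fully masked row in the 4D additive mask (e.g. the leading rows of a left-padded sequence) makes F.scaled_dot_product_attention's memory-efficient kernel produce NaNs, so such rows are "unmasked" to attend everywhere; the result there is discarded anyway. This is a minimal standalone sketch, not the actual transformers implementation: the helper name `unmask_fully_masked_rows` is mine, while the real logic lives in `AttentionMaskConverter._unmask_unattended`.

```python
import torch

def unmask_fully_masked_rows(expanded_mask: torch.Tensor, min_dtype: float) -> torch.Tensor:
    # expanded_mask: additive mask of shape (batch, heads, q_len, kv_len),
    # with 0 at attended positions and min_dtype at masked positions.
    # A row where every entry equals min_dtype is a padding row; zero it out
    # so the memory-efficient SDPA path does not produce NaNs for it.
    fully_masked = (expanded_mask == min_dtype).all(dim=-1, keepdim=True)
    return expanded_mask * ~fully_masked

# Toy example: batch 1, 1 head, 2 query rows, 3 key positions.
min_dtype = torch.finfo(torch.float32).min
mask = torch.zeros(1, 1, 2, 3)
mask[0, 0, 0, :] = min_dtype   # row 0: fully masked (left padding)
mask[0, 0, 1, 0] = min_dtype   # row 1: only the first position masked
out = unmask_fully_masked_rows(mask, min_dtype)
# Row 0 becomes all zeros (attend everywhere); row 1 is unchanged.
```

The diff above only widens the device check: the unmasking was previously applied for "cuda" alone, and now also runs when the mask lives on an "xpu" device, where the same SDPA kernel issue applies.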

0 commit comments

Comments
 (0)