3 changes: 3 additions & 0 deletions tests/quantization/finegrained_fp8/test_fp8.py
@@ -46,6 +46,9 @@ def _patch_no_accelerator():
stack.enter_context(patch("torch.cuda.is_available", return_value=False))
if hasattr(torch, "xpu"):
stack.enter_context(patch("torch.xpu.is_available", return_value=False))
stack.enter_context(
patch("transformers.quantizers.quantizer_finegrained_fp8.is_torch_xpu_available", return_value=False)
)
Comment on lines +49 to +51
Contributor:

weird that we need to add it, but I guess it's fine

Contributor Author:

It's because is_torch_xpu_available is used here. During the setup phase, we call this function once via cls.quantized_model, and since it is decorated with @lru_cache, the value of is_torch_xpu_available read here is still True even after we enter _patch_no_accelerator.
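
For illustration, a minimal standalone sketch of the caching behavior described above; the _FakeTorchXpu object and the simplified is_torch_xpu_available body are hypothetical stand-ins for this sketch, not the real transformers helpers:

from functools import lru_cache
from unittest.mock import patch


class _FakeTorchXpu:
    # Hypothetical stand-in for torch.xpu in this sketch.
    @staticmethod
    def is_available():
        return True


@lru_cache
def is_torch_xpu_available():
    # Simplified stand-in: the result of the first call is cached by lru_cache.
    return _FakeTorchXpu.is_available()


# Setup-phase call (analogous to cls.quantized_model): True is now cached.
assert is_torch_xpu_available() is True

# Patching the underlying availability check afterwards has no effect, because
# lru_cache returns the cached value without re-running the function body.
with patch.object(_FakeTorchXpu, "is_available", return_value=False):
    assert is_torch_xpu_available() is True  # still the cached True

# Hence the test also patches the name the quantizer module actually reads:
# patch("transformers.quantizers.quantizer_finegrained_fp8.is_torch_xpu_available", return_value=False)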

    yield
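
Pulling the hunk together, a hedged reconstruction of what the full _patch_no_accelerator helper plausibly looks like after this change; the contextmanager decorator, the ExitStack, and the surrounding imports are assumptions inferred from the visible diff, not the actual file:

from contextlib import ExitStack, contextmanager
from unittest.mock import patch

import torch


@contextmanager
def _patch_no_accelerator():
    # Assumed structure: an ExitStack collects the patches so they all unwind on exit.
    with ExitStack() as stack:
        stack.enter_context(patch("torch.cuda.is_available", return_value=False))
        if hasattr(torch, "xpu"):
            stack.enter_context(patch("torch.xpu.is_available", return_value=False))
        # New in this diff: also patch the reference imported by the fp8 quantizer,
        # since the lru_cache on is_torch_xpu_available would otherwise keep serving
        # the value cached during setup.
        stack.enter_context(
            patch(
                "transformers.quantizers.quantizer_finegrained_fp8.is_torch_xpu_available",
                return_value=False,
            )
        )
        yield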

