
Conversation

@liangel-02 (Contributor) commented Dec 1, 2025

Summary
Updates the safetensors logic to support all tensor subclasses generically.

Test
Verified integration with vLLM.
Unit tests pass: python tests/quantization/torchao_integration/test_torchao.py -k TorchAoSafeSerializationTest

@liangel-02 force-pushed the ao_safetensors branch 5 times, most recently from 8ee8375 to 32df8ec on December 1, 2025 18:52
@liangel-02 (Contributor, Author)

cc @SunMarc @MekkCyber

@liangel-02 force-pushed the ao_safetensors branch 3 times, most recently from 5fa770a to 80af667 on December 1, 2025 21:41
@MekkCyber (Contributor) left a comment

Thanks a lot for working on this

Comment on lines 890 to 896
concrete_source_pattern = source_pattern
if isinstance(mapping, WeightConverter) and source_pattern is not None and "*" in source_pattern:
    pattern_with_captures = source_pattern.replace("*", r"(.*?)")
    pattern_regex = re.compile(f"^{pattern_with_captures}$")
    concrete_source_pattern = extract_concrete_key_from_regex_pattern(
        original_key, source_pattern, pattern_regex
    )
@MekkCyber (Contributor) Dec 2, 2025

Why do we need this? Could you explain a bit more why we need to change the regex handling here and in the other parts of the code?

@liangel-02 (Contributor, Author) Dec 2, 2025

For the torchao WeightConverter, the original code hardcodes the tensor data components (i.e. weight_qdata, weight_scale) that map to the consolidated weight. However, these components change depending on the config used, and some are optional. For maximum generality, I wanted to use wildcard matching (*_weight_* --> *weight*).

But with the current regex handling, the code was renaming the key with the literal "*weight*" rather than matching the regex. My changes extract the prefix and use it for the source/target. Let me know if this is an OK approach or if there's a better solution.
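
A minimal sketch of the wildcard-to-regex idea described above (the function and keys here are illustrative, not the actual transformers helpers):

import re

def concrete_key_for_pattern(original_key, source_pattern):
    # Turn a wildcard source pattern (e.g. "*_weight_qdata") into a regex with
    # captures, then rebuild the concrete key from the captured prefix.
    pattern_regex = re.compile("^" + source_pattern.replace("*", r"(.*?)") + "$")
    match = pattern_regex.match(original_key)
    if match is None:
        return None
    concrete = source_pattern
    for group in match.groups():
        concrete = concrete.replace("*", group, 1)
    return concrete

# Recovers the concrete key when the wildcard pattern matches:
# prints "model.layers.0.self_attn.k_proj_weight_qdata"
print(concrete_key_for_pattern("model.layers.0.self_attn.k_proj_weight_qdata", "*_weight_qdata"))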

@liangel-02 force-pushed the ao_safetensors branch 2 times, most recently from 13b1d0e to 5f770e1 on December 3, 2025 00:18
@liangel-02 (Contributor, Author) commented Dec 3, 2025

@MekkCyber To avoid adding torchao-specific logic to the core model loader and to unblock landing, I reverted the changes to the regex logic and added all the other tensor data names. In the future, maybe we can implement a strategy that generalizes this matching in a better way?

cc @jerryzh168

# check if the param_name is not in self.modules_to_not_convert
if any(key + "." in param_name or key == param_name for key in self.modules_to_not_convert):
    return False
elif any(param_name.endswith(f":{x}") for x in self.full_ao_keys):
Contributor

For this one, maybe just change it to f"_{x}" for x in self.full_ao_keys to be safe; at least with this we are confident it's correct, if there is no better general way to detect safetensors.

Contributor Author

I think this is achieving the same thing as L237; we only need one check.

@github-actions github-actions bot commented Dec 3, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: torchao_integration

@MekkCyber (Contributor) left a comment

Very nice alternative solution! Thanks a lot

Comment on lines +539 to +548
# TODO: increase flexibility by generalizing the source patterns to match the format of "_weight_"
# note that the matching logic is greedy, so, for example, if _weight_scale came before _weight_scale_and_zero in this list, it would always match _weight_scale (which is incorrect)
# thus, the order of source_patterns is intentional
source_patterns=[
    "_weight_qdata",
    "_weight_scale_and_zero",
    "_weight_scale",
    "_weight_zero_point",
    "_weight_act_pre_scale",
],
Contributor

Yes, it is greedy, but to match for example _weight_scale and not _weight_scale_and_zero you can just do something like _weight_scale$; ordering the keys works as well 👍
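
A small illustration of the two approaches (the key here is made up for the example):

import re

key = "model.layers.0.self_attn.k_proj._weight_scale_and_zero"

# Ordered, greedy suffix matching: the first pattern that matches wins, so
# "_weight_scale_and_zero" has to come before "_weight_scale" in the list.
ordered_patterns = ["_weight_qdata", "_weight_scale_and_zero", "_weight_scale"]
print(next(p for p in ordered_patterns if p in key))  # _weight_scale_and_zero

# Anchored alternative: "_weight_scale$" only matches keys that actually end
# with "_weight_scale", so ordering no longer matters.
print(bool(re.search(r"_weight_scale$", key)))                      # False
print(bool(re.search(r"_weight_scale$", "k_proj._weight_scale")))   # True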

Comment on lines -2107 to 2108
-    print(name)
    setattr(self, name, decoder)
Contributor

Thanks for catching this

else:
    param_data[f"{full_layer_name}:qdata"] = input_dict["weight:qdata"]
for suffix in input_dict.keys():
    assert len(input_dict[suffix]) == 1
Contributor

Let's do an if/else and raise an error to follow the same pattern we use in transformers, but I'm not sure it's necessary, since if we have the suffix in input_dict it means we collected at least one tensor for that suffix.
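
A minimal sketch of the if/raise version of that check, assuming input_dict maps each suffix to the list of tensors collected for it (the data here is illustrative):

# Illustrative data: each suffix should map to exactly one collected tensor.
input_dict = {"weight:qdata": ["qdata_tensor"], "weight:scale": ["scale_tensor"]}

for suffix, tensors in input_dict.items():
    if len(tensors) != 1:
        raise ValueError(
            f"Expected exactly one tensor collected for suffix '{suffix}', got {len(tensors)}"
        )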

Comment on lines +227 to +228
full_layer_name: "model.layers.0.self_attn.k_proj"
Contributor

Suggested change:
-    full_layer_name: "model.layers.0.self_attn.k_proj"
+    full_layer_name: "model.layers.0.self_attn.k_proj.weight"
