Commit b834f94
[tosa] : Add e2e support for quantized matmul. (#4371)
This PR enables e2e tests for quantized `torch.mm` and its other variants
through the `tosa` path.
The `torch` IR for a quantized matmul is shown in the following snippet:
```
%2 = torch.aten._make_per_tensor_quantized_tensor %0, %float2.150000e-02, %int-25 : !torch.vtensor<[3,4],si8>, !torch.float, !torch.int -> !torch.vtensor<[3,4],!torch.qint8>
%3 = torch.aten._make_per_tensor_quantized_tensor %1, %float1.760000e-02, %int18 : !torch.vtensor<[4,3],si8>, !torch.float, !torch.int -> !torch.vtensor<[4,3],!torch.qint8>
%4 = torch.aten.mm %2, %3 : !torch.vtensor<[3,4],!torch.qint8>, !torch.vtensor<[4,3],!torch.qint8> -> !torch.vtensor<[3,3],!torch.qint32>
%5 = torch.aten.int_repr %4 : !torch.vtensor<[3,3],!torch.qint32> -> !torch.vtensor<[3,3],si32>
%6 = torch.aten._make_per_tensor_quantized_tensor %5, %float3.784000e-04, %int0 : !torch.vtensor<[3,3],si32>, !torch.float, !torch.int -> !torch.vtensor<[3,3],!torch.qint32>
%7 = torch.aten.dequantize.tensor %6 : !torch.vtensor<[3,3],!torch.qint32> -> !torch.vtensor<[3,3],f32>
```
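As a reading aid for the snippet, here is a minimal Python sketch of the per-tensor affine convention the ops rely on, assuming the usual mapping `real = scale * (q - zero_point)`. The scales and zero points are copied from the IR above; the `si8` values and the `dequantize` helper are hypothetical and only illustrate the arithmetic, not the legalization code.

```python
import math


def dequantize(q, scale, zero_point):
    """Per-tensor affine dequantization: real = scale * (q - zero_point)."""
    return [[scale * (v - zero_point) for v in row] for row in q]


# Scales and zero points taken from the IR snippet.
lhs_scale, lhs_zp = 2.15e-02, -25
rhs_scale, rhs_zp = 1.76e-02, 18

# Hypothetical si8 storage values, chosen only for illustration.
lhs_q = [[-25, 0, 75]]
lhs_real = dequantize(lhs_q, lhs_scale, lhs_zp)

# The scale re-attached to the si32 matmul result (%float3.784000e-04)
# is the product of the two input scales; its zero point is 0.
out_scale = lhs_scale * rhs_scale
assert math.isclose(out_scale, 3.784e-04)
```

A stored value equal to the zero point dequantizes to exactly `0.0`, which is why the zero point has to be known before the integer matmul can be computed correctly.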
1. This change adds legalizations for
`_make_per_tensor_quantized_tensor` and `int_repr`, which are essentially cast
operations. The former op carries the zero-point/scale information for
(de)quantizing values.
2. A legalization for `dequantize.tensor`, the usual
dequantization op, is also added.
3. The legalization for `matmul` is fixed to infer the zero-point
information from the source `_make_per_tensor_quantized_tensor` ops for
the `matmul` operands. Scale doesn't need to be considered, as it is
handled at the output by the `FuseQuantizedOps`
transform.

1 parent 06f72e4 commit b834f94
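The reasoning in item 3 can be checked numerically. The following is a sketch of the arithmetic only, not of the legalization in this change: it assumes the usual affine convention `real = scale * (q - zero_point)`, reuses the scales and zero points from the IR snippet, and uses hypothetical integer inputs and helper names (`matmul`, `quantized_matmul`, `dequantize`). The integer matmul needs only the zero points; multiplying the accumulator by the product of the input scales afterwards gives the same result as a float matmul of the dequantized operands.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def quantized_matmul(a_q, a_zp, b_q, b_zp):
    """Integer matmul with zero points subtracted; no scale is applied here."""
    a_shift = [[v - a_zp for v in row] for row in a_q]
    b_shift = [[v - b_zp for v in row] for row in b_q]
    return matmul(a_shift, b_shift)


def dequantize(q, scale, zero_point):
    """Per-tensor affine dequantization: real = scale * (q - zero_point)."""
    return [[scale * (v - zero_point) for v in row] for row in q]


# Scales and zero points from the IR snippet; integer values are hypothetical.
a_scale, a_zp = 2.15e-02, -25
b_scale, b_zp = 1.76e-02, 18
a_q = [[-25, 15], [5, -5]]
b_q = [[18, 28], [8, 38]]

# Integer accumulator: depends on the zero points only.
acc = quantized_matmul(a_q, a_zp, b_q, b_zp)

# Scale applied once at the output, with output zero point 0.
out = dequantize(acc, a_scale * b_scale, 0)

# Reference: dequantize the operands first, then multiply in floating point.
ref = matmul(dequantize(a_q, a_scale, a_zp), dequantize(b_q, b_scale, b_zp))

for out_row, ref_row in zip(out, ref):
    for o, r in zip(out_row, ref_row):
        assert math.isclose(o, r, rel_tol=1e-9, abs_tol=1e-12)
```

Dropping the zero-point subtraction in `quantized_matmul` would change `acc` itself, whereas the scale is a single multiplicative factor that can be applied at any later point.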
File tree
7 files changed (+284, -92 lines changed)
- include/torch-mlir/Conversion/Utils
- lib/Conversion
  - TorchToLinalg
  - TorchToTosa
  - Utils
- projects/pt1/e2e_testing
- test/Conversion/TorchToTosa