Commit 8738b0c
authored
perf(dipu): faster aten::mul in cuda & muxi (#855)
* faster aten::mul in cuda
* improve the code format
* loose the check for scalar tensor
* improve the code
* let the logic of mul diff on different devices.
* Update autogen_diopi_wrapper.py
* Update diopi_functions.yaml
* Update OpUtils.hpp1 parent ee58ddc commit 8738b0c
File tree
5 files changed
+27
-8
lines changed- dipu
- scripts/autogen_diopi_wrapper
- torch_dipu/csrc_dipu/aten/ops
5 files changed
+27
-8
lines changedLines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
650 | 650 | | |
651 | 651 | | |
652 | 652 | | |
653 | | - | |
| 653 | + | |
654 | 654 | | |
655 | 655 | | |
656 | | - | |
| 656 | + | |
657 | 657 | | |
658 | 658 | | |
659 | 659 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
159 | 159 | | |
160 | 160 | | |
161 | 161 | | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
162 | 167 | | |
163 | 168 | | |
164 | 169 | | |
165 | 170 | | |
166 | 171 | | |
167 | 172 | | |
168 | 173 | | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
169 | 179 | | |
170 | 180 | | |
171 | 181 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
79 | 79 | | |
80 | 80 | | |
81 | 81 | | |
82 | | - | |
| 82 | + | |
83 | 83 | | |
84 | | - | |
85 | | - | |
86 | 84 | | |
87 | | - | |
| 85 | + | |
88 | 86 | | |
89 | | - | |
| 87 | + | |
90 | 88 | | |
91 | 89 | | |
92 | 90 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
254 | 254 | | |
255 | 255 | | |
256 | 256 | | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
257 | 268 | | |
258 | 269 | | |
0 commit comments