Summary:
Pull Request resolved: pytorch#5075
X-link: https://github.com/facebookresearch/FBGEMM/pull/2080
This diff generalizes the work in D85155388, building on Gefei's diff D85631781.
Compared to D85631781, we avoid register warp shuffling by using 32b TMEM atoms.
This diff supports:
1. Different dtypes (fp8, bf16)
2. Different mtiles (128, 64)
Reviewed By: v0i0
Differential Revision: D85893883
fbshipit-source-id: 25e93e627c573a120ab46336d3f234064c5ae066