Check for WMMA instead of MFMA when assigning datatypes for attention #2125
base: develop
Conversation
@dorde-antic CI has failed on Navi3x. It is still picking up f32 attention configs.
That's odd, since it didn't on Navi4x (and the logic we use is the same).
@dorde-antic This is blocking the weekly CI for the upstream merge. Please give priority to this PR.
@umangyadav I would rather focus on merging #2123 as a temporary solution than on merging this PR. #2123 solves both the f32 attention issue and the problem of MITuna trying every possible combination.
Motivation
Resolves https://github.com/ROCm/rocMLIR-internal/issues/2142
Technical Details
Checks for WMMA in perfRunner when assigning datatypes for attention.
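A minimal sketch of the idea (not the actual perfRunner code; the helper names `supports_mfma`, `supports_wmma`, and `attention_datatypes`, and the exact chip lists, are illustrative assumptions): gate the attention datatype list on the chip's matrix-core capability instead of assuming MFMA, so f32 attention configs are never generated for WMMA-only targets.

```python
# Hypothetical sketch of capability-gated datatype selection for
# attention configs. gfx9xx CDNA chips expose MFMA; gfx11xx RDNA3
# (Navi3x) chips expose WMMA. Chip lists here are illustrative.
MFMA_CHIPS = ("gfx908", "gfx90a", "gfx940", "gfx941", "gfx942")
WMMA_CHIPS = ("gfx1100", "gfx1101", "gfx1102")


def supports_mfma(arch: str) -> bool:
    # str.startswith accepts a tuple of prefixes.
    return arch.startswith(MFMA_CHIPS)


def supports_wmma(arch: str) -> bool:
    return arch.startswith(WMMA_CHIPS)


def attention_datatypes(arch: str) -> list[str]:
    # On MFMA hardware, f32 attention configs are worth tuning; on
    # WMMA-only hardware, restrict to f16 so the tuner does not waste
    # time on f32 configs it should never pick.
    if supports_mfma(arch):
        return ["f32", "f16"]
    if supports_wmma(arch):
        return ["f16"]
    return []
```

Under these assumptions, `attention_datatypes("gfx1100")` returns only `["f16"]`, so f32 attention configs never enter the tuning space on Navi3x.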
Test Plan
Weekly CI - Tuning phase
Test Result
CI RUN
Successfully filtered out f32 attention configs on the WMMA architecture (see the attention tuning results on gfx1100).
The run failed due to other, unrelated tuning issues.
Submission Checklist