CUDA: fix should_use_mmvf for ne11 == 1 #17085
Sorry, I just realized that I misdiagnosed the problem, I'll push another version.
I tested this on your server
    if (src0_ne[0] % 2 != 0) {
        return false;
    }
            return false;
        }
    }
Okay, so the problem should be fixed now. We were checking that the stride for dimension 0 is divisible by the sizes of e.g.
Co-authored-by: Aman Gupta <[email protected]>
See #16988 (comment).
The logic for preventing misaligned pointers on strides not divisible by the size of the data type is stricter than necessary:
for a single column we only need to check the stride of dimension 0. The strides in question are for src0 rather than src1; strictly speaking we would need to be checking that tensor too. @am17an can you share the model and command line you had used to test this?
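
To illustrate the kind of stride check being discussed, here is a minimal sketch. It is not the actual should_use_mmvf code: the function name rows_readable_in_pairs, the constant MAX_DIMS, and the assumption that elements are read in pairs (so pointers must be aligned to twice the element size) are all hypothetical and chosen only for illustration. The idea is that the start of each row is the base pointer plus a multiple of a byte stride, so it stays aligned only if that stride is itself a multiple of the load size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int MAX_DIMS = 4; // assumption: tensors have 4 dimensions

// Hypothetical helper (not the real should_use_mmvf): decides whether the rows
// of a tensor can be read two elements at a time without misaligned pointers.
//   ne: number of elements per dimension
//   nb: stride in bytes per dimension
//   ts: size in bytes of a single element
static bool rows_readable_in_pairs(const int64_t * ne, const size_t * nb, size_t ts) {
    assert(ts > 0);
    // A row with an odd number of elements cannot be split into pairs.
    if (ne[0] % 2 != 0) {
        return false;
    }
    // Elements within a row must be contiguous for a paired load.
    if (nb[0] != ts) {
        return false;
    }
    // A row starts at base + i*nb[d]; assuming an aligned base pointer, it
    // stays aligned to 2*ts bytes only if each stride is a multiple of 2*ts.
    for (int d = 1; d < MAX_DIMS; ++d) {
        if (nb[d] % (2*ts) != 0) {
            return false;
        }
    }
    return true;
}
```

With 2-byte elements, for example, a row stride of 16 bytes passes the check while a padded row stride of 18 bytes does not, because every second row would then start at an address that is not a multiple of 4.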