-
Notifications
You must be signed in to change notification settings - Fork 2
Implementing accumulate functionality and updating comments + README #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| for (int i = 0; i < Ncvec; i += 2) | ||
| { | ||
| vab[i] = _mm_add_ps (va[i], vb[i]); | ||
| vab[i + 1] = _mm_add_ps (va[i + 1], vb[i + 1]); |
Copilot
AI
Apr 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop processes two SIMD vectors per iteration but only asserts N is a multiple of SIMD_SZ. If Ncvec (i.e. N/SIMD_SZ) is odd, the access to vab[i+1] may be out-of-bounds. Consider iterating over all vectors individually or asserting that Ncvec is even.
| for (int i = 0; i < Ncvec; i += 2) | |
| { | |
| vab[i] = _mm_add_ps (va[i], vb[i]); | |
| vab[i + 1] = _mm_add_ps (va[i + 1], vb[i + 1]); | |
| for (int i = 0; i < Ncvec; ++i) | |
| { | |
| vab[i] = _mm_add_ps (va[i], vb[i]); |
| auto* vb = (const float32x4_t*) b; | ||
| auto* vab = (float32x4_t*) ab; | ||
|
|
||
| for (int i = 0; i < Ncvec; i += 2) |
Copilot
AI
Apr 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop processes two SIMD vectors per iteration without ensuring that Ncvec is even. If Ncvec is odd, the access to vab[i+1] could result in out-of-bound memory access. Consider revising the loop to handle an odd number of elements or adding a corresponding assertion.
| for (int i = 0; i < Ncvec; i += 2) | ||
| { | ||
| vab[i] = _mm256_add_ps (va[i], vb[i]); | ||
| vab[i + 1] = _mm256_add_ps (va[i + 1], vb[i + 1]); |
Copilot
AI
Apr 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Processing two SIMD vectors per iteration assumes that Ncvec (N/SIMD_SZ) is even, which is not enforced by the current assertion. This could lead to out-of-bound access when Ncvec is odd; consider iterating over each vector or assert an even count.
| for (int i = 0; i < Ncvec; i += 2) | |
| { | |
| vab[i] = _mm256_add_ps (va[i], vb[i]); | |
| vab[i + 1] = _mm256_add_ps (va[i + 1], vb[i + 1]); | |
| for (int i = 0; i < Ncvec; ++i) | |
| { | |
| vab[i] = _mm256_add_ps (va[i], vb[i]); |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5 +/- ##
==========================================
+ Coverage 77.57% 77.81% +0.24%
==========================================
Files 5 5
Lines 2707 2741 +34
Branches 221 227 +6
==========================================
+ Hits 2100 2133 +33
- Misses 607 608 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
No description provided.