gh-142884: Fix UAF in array.array.tofile with concurrent mutations
#143238
base: main
Conversation
This is not the first time you have re-created a PR because a force-push went wrong. Please, in the future, just don't force-push. If you want to update your branch, hit the "update branch" button and pull your changes.
array.array.tofile with concurrent mutations
After I rebased on upstream/main, there were a lot of commits, so I closed the old one and created a new one.
picnixz
left a comment
In all your PRs there are always unrelated changes. This costs me a lot of review time, so PLEASE avoid it in the future:
- Never remove comments unless they become outdated.
- Do not change unrelated code or try to change code that is not directly used.
After a better look at the code, I'm sorry, but I have to ask you to use the while-loop approach after all, because I didn't think about an array that could grow inside the writer (although that could give an infinite loop, I don't think it makes sense to prevent it). You'll need a test for that as well, independently of whether you use a for-loop or a while-loop approach.
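A possible shape for that growth test, sketched here only as an illustration (the function name, sizes, and the cap on how often the writer grows the array are assumptions, not the final test):

```python
import array

def test_tofile_concurrent_growth():
    # Sketch only: the writer grows the array while tofile() is still
    # iterating over it (follow-up case for gh-142884).
    victim = array.array('b', b'x' * 100_000)   # larger than one 64 KiB block
    grow_calls = 0

    class Writer:
        def write(self, data):
            nonlocal grow_calls
            if grow_calls < 3:      # bound the growth so the test terminates
                grow_calls += 1
                victim.extend(b'y' * 100_000)
            return len(data)

    victim.tofile(Writer())  # must not crash and must not loop forever
```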
Misc/NEWS.d/next/Library/2025-12-18-11-41-37.gh-issue-142884.kjgukd.rst
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase.
Don't rebase upstream/main, just merge main into your branch.
ok
    def test_tofile_concurrent_mutation(self):
        # Keep this test in sync with the implementation in
        # Modules/arraymodule.c:array_array_tofile_impl()
Suggested change:

    def test_tofile_concurrent_mutation(self):
        # Prevent crash when a writer concurrently mutates the array.
        # See https://github.com/python/cpython/issues/142884.
        # Keep 'BLOCKSIZE' in sync with the array.tofile() C implementation.
        cleared = False

        def write(self, data):
            if not self.cleared:
                self.cleared = True
                victim.clear()
            return 0
Suggested change:

        def write(self, data):
            victim.clear()
            return 0
We actually don't care about calling clear() multiple times.
            victim.clear()
            return 0

        victim.tofile(Writer())
Suggested change:

        victim.tofile(Writer())  # should not crash
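Putting the suggestions together, the test reduces to roughly the following sketch (assembled from the diff fragments above; the exact size and structure of the final test may differ):

```python
import array

def test_tofile_concurrent_mutation():
    # Prevent a crash when a writer concurrently mutates the array.
    # See https://github.com/python/cpython/issues/142884.
    # The array spans more than one 64 KiB block, so tofile() calls
    # write() again after the array has already been cleared.
    victim = array.array('b', b'x' * 100_000)

    class Writer:
        def write(self, data):
            victim.clear()   # reentrant mutation during tofile()
            return 0

    victim.tofile(Writer())  # should not crash
```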
        }

        if (total_size > max_items) {
            return PyErr_NoMemory();
Why raise MemoryError? Just break the loop. I think you can just do while (total_size <= max_items), since max_items doesn't change; this would simplify the loop.
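A sketch of the loop shape this suggests, inside array_array_tofile_impl() (assuming max_items is an upper bound computed once before the loop, as the comment states; this is not the actual patch):

```c
/* Sketch only: stop writing instead of raising MemoryError. */
Py_ssize_t total_size = Py_SIZE(self);
while (total_size > 0 && total_size <= max_items) {
    /* ... build and write the next block ... */
    total_size = Py_SIZE(self);  /* re-read: write() may have resized the array */
}
```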
    Py_ssize_t offset = 0;
    while (1) {
        Py_ssize_t total_size = Py_SIZE(self);
        if (self->ob_item == NULL || total_size == 0) {
I'd rather check nullity of ob_item at the end, or include it in the loop condition as well.
            break;
        }

        Py_ssize_t size = current_nbytes - offset;
I'd prefer reverting this computation: start with size = BLOCKSIZE and then cut it down to what remains.
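That is, something along these lines (using the names current_nbytes and offset from the diff above; a sketch, not the actual patch):

```c
Py_ssize_t size = BLOCKSIZE;            /* start from the block size ... */
if (size > current_nbytes - offset) {
    size = current_nbytes - offset;     /* ... and cut it to what remains */
}
```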
The original code precomputed nblocks at the beginning of the function, but when a reentrant writer cleared the array during the first write() callback, self->ob_item became NULL. The loop continued iterating based on the cached values and dereferenced the now-NULL pointer.
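For context, the pre-fix loop in Modules/arraymodule.c had roughly this shape (a paraphrase, not an exact quote), which is where the stale nbytes/nblocks and the NULL ob_item meet:

```c
/* Rough shape of the original array_array_tofile_impl() loop (paraphrase).
 * nbytes and nblocks are computed once, before any write() call is made. */
Py_ssize_t nbytes = Py_SIZE(self) * self->ob_descr->itemsize;
Py_ssize_t nblocks = (nbytes + BLOCKSIZE - 1) / BLOCKSIZE;
for (Py_ssize_t i = 0; i < nblocks; i++) {
    /* If a previous write() call cleared the array, self->ob_item is now
     * NULL and nbytes is stale, so ptr points at invalid memory. */
    char *ptr = self->ob_item + i * BLOCKSIZE;
    Py_ssize_t size = BLOCKSIZE;
    if (i * BLOCKSIZE + size > nbytes) {
        size = nbytes - i * BLOCKSIZE;
    }
    PyObject *bytes = PyBytes_FromStringAndSize(ptr, size);  /* reads bad memory */
    if (bytes == NULL) {
        return NULL;
    }
    /* The write() call may run arbitrary Python code and mutate the array. */
    PyObject *res = PyObject_CallMethod(f, "write", "O", bytes);
    Py_DECREF(bytes);
    if (res == NULL) {
        return NULL;
    }
    Py_DECREF(res);
}
```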
Linked issue: array.array.tofile via reentrant writer #142884