Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use PyList.get_item_unchecked() on free-threaded build #4539

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions src/types/list.rs
Original file line number Diff line number Diff line change
Expand Up @@ -175,7 +175,7 @@ pub trait PyListMethods<'py>: crate::sealed::Sealed {
/// # Safety
///
/// Caller must verify that the index is within the bounds of the list.
#[cfg(not(Py_LIMITED_API))]
#[cfg(not(any(Py_LIMITED_API, Py_GIL_DISABLED)))]
unsafe fn get_item_unchecked(&self, index: usize) -> Bound<'py, PyAny>;

/// Takes the slice `self[low:high]` and returns it as a new list.
Expand Down Expand Up @@ -298,7 +298,7 @@ impl<'py> PyListMethods<'py> for Bound<'py, PyList> {
/// # Safety
///
/// Caller must verify that the index is within the bounds of the list.
#[cfg(not(Py_LIMITED_API))]
#[cfg(not(any(Py_LIMITED_API, Py_GIL_DISABLED)))]
unsafe fn get_item_unchecked(&self, index: usize) -> Bound<'py, PyAny> {
// PyList_GET_ITEM return borrowed ptr; must make owned for safety (see #890).
ffi::PyList_GET_ITEM(self.as_ptr(), index as Py_ssize_t)
Expand Down Expand Up @@ -465,9 +465,9 @@ impl<'py> BoundListIterator<'py> {
}

unsafe fn get_item(&self, index: usize) -> Bound<'py, PyAny> {
#[cfg(any(Py_LIMITED_API, PyPy))]
#[cfg(any(Py_LIMITED_API, PyPy, Py_GIL_DISABLED))]
let item = self.list.get_item(index).expect("list.get failed");
#[cfg(not(any(Py_LIMITED_API, PyPy)))]
#[cfg(not(any(Py_LIMITED_API, PyPy, Py_GIL_DISABLED)))]
Comment on lines +468 to +470
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's probably also a question of concurrent additions (or removals) to the list during iteration; at the moment we can rely on re-checking the length on each call to .next() and the guarantee of the GIL to stop us doing an out-of-bounds read. With the freethreaded build, I assume it's possible to have a time-of-check to time-of-use error between length and .get_item() call here.

It seems possible that we can have panics on the freethreaded build from the "list.get failed" where they should be rarer (impossible?) on the GIL build.

I wonder, what happens on the freethreaded build if a list is modified (in another thread) during iteration in pure-python?

Should we use list.tp_iter rather than defining our own iterator? (i.e. are we back to similar questions as in #4439?)

Copy link
Contributor Author

@ngoldbaum ngoldbaum Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, what happens on the freethreaded build if a list is modified (in another thread) during iteration in pure-python?

Then you get a race condition. Try yourself with this script:

import threading

lis = [_ for _ in range(3)]
b = threading.Barrier(2)

def func1():
    b.wait()
    lis[0] = 7

def func2():
    waited = False
    for i in lis:
        if not waited:
            waited = True
            b.wait()
	print(lis)

threads = [threading.Thread(target=func1), threading.Thread(target=func2)]

[t.start() for t in threads]
[t.join() for t in threads]

There's also a race with the GIL, but I don't think you'll ever see this script print out

[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]

with the GIL enabled. With the GIL enabled you don't know when the next time the other thread will be able to run, so you might see

[0, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]

but also you might see

[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]

but you'll never see the change reflected in the first iteration like you can see in the free-threaded build.

The same is also true about getting random results on both builds (but not crashing) if you replace the list.setitem with e.g. lis.append(4). So I think it definitely is possible for the list size to change "underneath" a thread, on both builds if both threads release the GIL, but all the time on the free-threaded build. So you're right, runtime panics might be more likely.

Should we do anything special to account for that possibility?

let item = self.list.get_item_unchecked(index);
item
}
Expand Down Expand Up @@ -863,7 +863,7 @@ mod tests {
});
}

#[cfg(not(any(Py_LIMITED_API, PyPy)))]
#[cfg(not(any(Py_LIMITED_API, PyPy, Py_GIL_DISABLED)))]
#[test]
fn test_list_get_item_unchecked_sanity() {
Python::with_gil(|py| {
Expand Down
Loading