Don't use PyList.get_item_unchecked() on free-threaded build #4539

ngoldbaum
Contributor

Followup for #4410.

get_item_unchecked can access dangling pointers and is exposed to other data races because PyList_GET_ITEM returns a borrowed reference. I added bindings for PyList_GetItemRef in #4410 but missed that the list iterator uses get_item_unchecked.

I could leave the APIs visible since they're already marked as unsafe, but since this API is already cfg-ed out for the limited API, I thought it might make sense to disable it for the free-threaded build as well.

Will add a release note if we decide disabling it is the correct thing to do.

Member

@davidhewitt davidhewitt left a comment

Thanks, after a bit of examination here I think that disabling this API is the right approach for now.

There's no way to guarantee that the list length remains valid between the time of the check and the call to .get_item_unchecked, so the safety invariant cannot realistically be met.
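
The gap between the check and the use can be sketched in Python. This is only an illustration, not PyO3 code: the function names and the `between` hook are hypothetical, and the hook stands in for another thread running between the two steps. In Python a stale check surfaces as an IndexError; an unchecked access has no such safety net.

```python
def check_then_get(items, index, between=lambda: None):
    """Length check followed by an index: two separate steps."""
    if index < len(items):   # time of check
        between()            # hypothetical hook: another thread could run here
        return items[index]  # time of use: the earlier check may be stale
    return None


def checked_get(items, index, between=lambda: None):
    """One bounds-checked access; a stale index becomes None, not an error."""
    between()
    try:
        return items[index]
    except IndexError:
        return None


def demo():
    a = [0, 1, 2]
    try:
        check_then_get(a, 2, between=a.clear)  # list shrinks after the check
        stale_check_failed = False
    except IndexError:
        stale_check_failed = True
    b = [0, 1, 2]
    return stale_check_failed, checked_get(b, 2, between=b.clear)
```

Calling `demo()` returns `(True, None)`: the check-then-use version fails once the list shrinks after the check, while the single checked access reports the missing item.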

Comment on lines +468 to +470
+ #[cfg(any(Py_LIMITED_API, PyPy, Py_GIL_DISABLED))]
  let item = self.list.get_item(index).expect("list.get failed");
- #[cfg(not(any(Py_LIMITED_API, PyPy)))]
+ #[cfg(not(any(Py_LIMITED_API, PyPy, Py_GIL_DISABLED)))]
Member

I think there's probably also a question of concurrent additions (or removals) to the list during iteration; at the moment we can rely on re-checking the length on each call to .next() and the guarantee of the GIL to stop us doing an out-of-bounds read. With the freethreaded build, I assume it's possible to have a time-of-check to time-of-use error between the length check and the .get_item() call here.

It seems possible that we can have panics on the freethreaded build from the "list.get failed" expect, whereas they should be rarer (impossible?) on the GIL build.

I wonder, what happens on the freethreaded build if a list is modified (in another thread) during iteration in pure-python?

Should we use list.tp_iter rather than defining our own iterator? (i.e. are we back to similar questions as in #4439?)
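
For comparison, here is a small single-threaded sketch of how Python's built-in list iterator reacts when the list changes size mid-iteration. The in-loop mutation is only a stand-in for another thread; it says nothing about thread safety on the free-threaded build. The function names are illustrative.

```python
def iterate_while_shrinking():
    lis = [0, 1, 2, 3]
    seen = []
    for item in lis:
        seen.append(item)
        if item == 1:
            del lis[2:]  # stand-in for another thread removing items
    return seen


def iterate_while_growing():
    lis = [0, 1, 2, 3]
    seen = []
    for item in lis:
        seen.append(item)
        if item == 0:
            lis.append(4)  # stand-in for another thread appending
    return seen
```

`iterate_while_shrinking()` returns `[0, 1]` and `iterate_while_growing()` returns `[0, 1, 2, 3, 4]`: the built-in iterator ends quietly when its index runs past the current length, rather than raising.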

Contributor Author

@ngoldbaum ngoldbaum Sep 17, 2024

> I wonder, what happens on the freethreaded build if a list is modified (in another thread) during iteration in pure-python?

Then you get a race condition. Try it yourself with this script:

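import threading

lis = list(range(4))  # four items, matching the output below
b = threading.Barrier(2)

def func1():
    # Writer: wait until the reader is inside its loop, then mutate the list.
    b.wait()
    lis[0] = 7

def func2():
    # Reader: pause at the barrier on the first iteration only,
    # then print the whole list on every iteration.
    waited = False
    for i in lis:
        if not waited:
            waited = True
            b.wait()
        print(lis)

threads = [threading.Thread(target=func1), threading.Thread(target=func2)]

for t in threads:
    t.start()
for t in threads:
    t.join()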

There's also a race with the GIL, but I don't think you'll ever see this script print out

[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]

with the GIL enabled. With the GIL enabled you don't know when the other thread will next be able to run, so you might see

[0, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]
[7, 1, 2, 3]

but also you might see

[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]

but you'll never see the change reflected in the first iteration like you can see in the free-threaded build.

The same applies if you replace the `lis[0] = 7` assignment with e.g. lis.append(4): you get unpredictable results on both builds, but no crashes. So I think it definitely is possible for the list size to change "underneath" a thread: on the GIL build whenever the GIL is released between steps, and at any time on the free-threaded build. So you're right, runtime panics might be more likely.

Should we do anything special to account for that possibility?
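
One option on the pure-Python side is to iterate over a snapshot, so the size cannot change under the loop. The sketch below is a variant of the script above under stated assumptions: `run_once`, `writer`, and `reader` are illustrative names, and the snapshot is deliberately taken before the barrier releases the writer, so the copy itself is not racing with the append.

```python
import threading


def run_once():
    lis = list(range(4))
    barrier = threading.Barrier(2)
    seen = []

    def writer():
        # Released only after the reader has taken its snapshot.
        barrier.wait()
        lis.append(4)

    def reader():
        snapshot = lis.copy()  # private copy; later appends cannot resize it
        barrier.wait()
        for item in snapshot:
            seen.append(item)

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen, lis
```

`run_once()` returns `([0, 1, 2, 3], [0, 1, 2, 3, 4])`: the reader sees a stable four-item view while the shared list still receives the append. The cost is one copy per iteration, and the reader no longer observes concurrent updates.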
