Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtle issue with borrowed references in extensions. #95797

Closed
anamax22 opened this issue Aug 8, 2022 · 18 comments
Closed

Subtle issue with borrowed references in extensions. #95797

anamax22 opened this issue Aug 8, 2022 · 18 comments
Labels
docs Documentation in the Doc dir topic-C-API

Comments

@anamax22
Copy link

anamax22 commented Aug 8, 2022

I think that the discussion of borrowing in Section 1.10 makes an untrue assumption.

In specific, I think that "When you pass an object reference into another function, in general, the function borrows the reference from you" is incomplete/misleading.

More to the point, "When a C function is called from Python, it borrows references to its arguments from the caller. The caller owns a reference to the object, so the borrowed reference’s lifetime is guaranteed until the function returns." seems wrong.

I don't have a clean documentation fix but I do have an example that demonstrates the problem.

Section 1.10.3 of https://docs.python.org/3/extending/extending.html ends with:

void
bug(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}

It's a second example of "Could it perhaps do something to invalidate the reference to item in bug()? You bet! Assuming that the list passed into bug() is accessible to the del() method, it could execute a statement to the effect of del list[0], and assuming this was the last reference to that object, it would free the memory associated with it, thereby invalidating item." (from earlier in that section).

However, I'm reasonably certain that the following is also broken in a way that the discussion of borrowing (from the caller) doesn't address.

void
bug2(PyObject *list)
{
    Py_BEGIN_ALLOW_THREADS
    ...let other threads run during some blocking I/O call...
    Py_END_ALLOW_THREADS

    # Can the last reference to list be deleted by a thread that ran during
    # said blocking I/O call?
    # If so, this next line is likely to cause problems.
    PyObject *item = PyList_GetItem(list, 0);
}
@anamax22 anamax22 added the docs Documentation in the Doc dir label Aug 8, 2022
@ronaldoussoren
Copy link
Contributor

The issue you sketch is a reverse scenario from the documentation you quote.

The sentence When a C function is called from Python, it borrows references to its arguments from the caller. The caller owns a reference to the object, so the borrowed reference’s lifetime is guaranteed until the function returns. is about this:

PyObject* value = PyLong_FromLong(42); // *value* is a owned reference

SomeAPI(value);

Here SomeAPI "borrows" the reference from its caller, in that it assumes that the caller has an owning reference.

In value = PyList_GetItem(someList, 42) the value is a borrowed reference returned from an API, and that indeed has the problems you sketch: Some other location owns the reference (in this case the list object) and that reference get invalidated when you call back into the interpreter (which might resize the list or replace this item).

Note that the text, and other examples, in 1.10.3 more clearly explain the problem then I do here.

@erlend-aasland erlend-aasland added the pending The issue will be closed if no feedback is provided label Aug 8, 2022
@anamax22
Copy link
Author

anamax22 commented Aug 9, 2022

The examples in 1.10.3 are about list's elements. They point out that while list was borrowed, its elements weren't.

My question is whether the caller's reference to list is guaranteed to survive running other threads.

If not, "the borrowed reference’s lifetime is guaranteed until the function returns" is not true.

I don't see any text in 1.10 that addresses the possibility that a borrowed reference can go away, that its lifetime actually isn't guaranteed.

@encukou
Copy link
Member

encukou commented Aug 9, 2022

You are correct. Functions that return borrowed references are tricky to use correctly: you need to ensure that whatever you're borrowing from stays valid, and that its reference stays valid (which might involve relying on implementation details).
You don't even need threads for things to go wrong: even printing out a value can call custom __str__, which can clear a list and invalidate a borrowed reference.

Functions that return borrowed references have been available for a long time, are very widely used, and enable faster code than safer variants, so they're not going away. But we can definitely do better in documenting the issues and giving safer alternatives.

Things are different when getting a borrowed reference directly as a function argument (list, not item, in your example). There, the caller is responsible for keeping the reference valid for the entire call, and that is generally not hard to do.

@erlend-aasland erlend-aasland removed the pending The issue will be closed if no feedback is provided label Aug 9, 2022
@anamax22
Copy link
Author

anamax22 commented Aug 9, 2022

There seems to be some confusion. I'm not talking about functions that return borrowed references. I'm talking about a case when borrowing doesn't work.

Consider the following:

# change() is called continuously by another thread
_GLOBAL = list('first')
def change():
    global _GLOBAL
    _GLOBAL = list('new')

def caller():
    called(_GLOBAL)

while True:
    caller()

Everything is safe if called is implemented in python because initializing the parameter creates a new reference, even if the python called lets other threads run before accessing the said parameter's value.

However, if called is an extension function, its borrowed reference is probably invalid after it lets other threads run.

I think that the only safe thing for extension functions is to create a new reference to all parameters before letting other threads run.

Yes, another way to solve this problem is to implement caller as:

def caller():
    temp = _GLOBAL
    called(temp)

But, that assumes that caller knows a lot about called's implementation.

@encukou
Copy link
Member

encukou commented Aug 9, 2022

The safe thing is to create a new reference after calling PyList_GetItem (a function that is problematic because it returns a borrowed reference), or to instead call PySequence_GetItem (which returns a new reference). It's then necessary to decref the item when the called function no longer needs it.
Creating references to all arguments is not necessary.

@anamax22
Copy link
Author

anamax22 commented Aug 9, 2022

I'll it say again - I'm not talking about the elements/components of an argument. I'm talking about the argument as a whole, which is the actual borrowed reference.

The same problem happens with floating point numbers.

_GLOBAL = 100.9
def changer():
    global _GLOBAL
    _GLOBAL = 1.1+ _GLOBAL

arrange to have "changer" called continuously by a thread.

while True:
    reader(_GLOBAL)

If "reader" is written in python, everything is okay even if "reader" lets other threads, and thus "changer", run.

If "reader" is an extension, the borrowed reference to _GLOBAL's value is invalid after "changer" is called while "reader" lets other threads run.

@ronaldoussoren
Copy link
Contributor

I think we all agree here: If you get a borrowed reference as the result of calling a function you need to be very careful when using that value. In general you'll have to incref before using it with other APIs, and decref afterwards (although I've seen a lot of code that doesn't do this because the C extension is only used in a context where this is not a problem).

That's explained at the start of 1.10.3, in an informal way.

Is there anything actionable to resolve this issue?

@anamax22
Copy link
Author

Minimum:
"The caller owns a reference to the object, so the borrowed reference’s lifetime is guaranteed until the function returns. Only when such a borrowed reference must be stored or passed on, it must be turned into an owned reference by calling Py_INCREF()."

to something like.

"A borrowed reference lives only as long as the caller's reference. (That reference can go away if other threads are allowed to run.) In addition, a borrowed reference must be turned into an owned reference by calling Py_INCREF() if it is stored or passed on.)"

I don't know whether it is worth including my last example.

@encukou
Copy link
Member

encukou commented Aug 10, 2022

The borrowed reference that's passed as an argument must not go away until the called function returns, regardless of other threads. Since the caller must hold a reference, other threads cannot delete the last reference.
If the caller somehow allows other threads to invalidate its reference, it's a bug in the caller.

Perhaps the misunderstanding comes from the fact that a reference's “borrowed-ness” depends on the context? An argument is a borrowed reference in the called function, but in the caller it's not (necessarily) a borrowed reference?

@ronaldoussoren
Copy link
Contributor

"borrowed" primarily means that you don't have to decref when you're done with the reference, while you do have to decref for an "owned" reference. For arguments there's the additional convention that there should be a owned reference somewhere up the call stack (which means the callee can assume the reference stays valid until it returns).

@anamax22
Copy link
Author

I'm surprised by the assertion that the caller is responsible for knowing when "callee(_GLOBAL)" is safe.

If callee is written in python, it is safe.

If callee is an extension, it seems reasonable to expect its implementation to provide the same guarantee.

@encukou
Copy link
Member

encukou commented Aug 15, 2022

The caller doesn't need to know anything about callee. It does need to ensure that the reference that it passes to callee remains valid until callee returns.

Maybe misunderstanding about what the function would look like when "written in C"? I assume the function:

def caller():
    called(_GLOBAL)

should be written in C as something like:

PyObject* caller(PyObject* module)
{
    PyObject* global = PyObject_GetAttrString(module, "_GLOBAL");
    if (!global) return NULL;
    PyObject* result = called(global);
    Py_DECREF(global);
    if (!result) return NULL;
    Py_RETURN_NONE;
}

or do you have something else in mind?

@anamax22
Copy link
Author

Please don't rewrite the caller. callee(1.0+35), callee(_GLOBAL), callee(local) should all work.

The caller doesn't need to know anything about callee. It does need to ensure that the reference that it passes to callee remains valid until callee returns.

The caller doesn't need to do anything to keep the reference alive if the callee is written in python, even if the callee lets other threads run.

The same should be true if the callee is written in C, and it is possible to write the callee in C such that that is true.

My point is that the documentation says something about borrowed references that suggests that the callee need not take certain precautions if it lets other threads run.

@encukou
Copy link
Member

encukou commented Aug 18, 2022

Please don't rewrite the caller

So, what is the C version of caller that we should be talking about? So far you only gave Python implementation, so to get a C implementation I need to rewrite it.

@ronaldoussoren
Copy link
Contributor

The Python implementation of the function does not have this problem because the bytecode interpreter uses owned references as needed.

My point is that the documentation says something about borrowed references that suggests that the callee need not take certain precautions if it lets other threads run.

The documentation seems to be clear about this (if a bit informal), but I write this as someone that already knows the rules by heart.

Note that threading is a red herring here, the problem is also there in serial code that can call back into Python, e.g.

PyObject* item = PyList_GetItem(someList, 4)
PyObject_Call(someFunction, ...)
PyObject_Print(item, stdout, 0);

If someFunction can access someList (because its in a variable or attribute) it can do del someList[4] and invalidate item.

That's something you have to be very careful about when writing C extensions. As @encukou wrote earlier you can side-step the issue here by using the abstract object item (PySequene_GetItem) instead of the list-specific API because that returns an owned reference.

@anamax22
Copy link
Author

[1] The caller's implementation is irrelevant because callees should work regardless of the caller. (That's true when the callee is written in python. It should also be true when the callee is an extension. To put it another way, an extension should be just as safe to call as python.)
[2] I switched to a float argument because the issue has nothing to do with items or other subcomponents of the borrowed argument. The caller's reference to the borrowed argument itself can go away in two circumstances. (I mentioned threading but ronaldoussoren points out that callbacks can have the same effect.) The documentation says that it can't.

@ronaldoussoren
Copy link
Contributor

This problem is documented In https://docs.python.org/3/extending/extending.html#thin-ice. This section does not contain an exhaustive list of scenario's where borrowed references can cause problems, but is (IMHO) pretty clear.

To get back to your initial report: In your "bug2" example there is in general not a problem because the convention is that the caller should ensure it owns a reference (or borrows a reference from its caller as long as there is a strong owner somewhere up the call chain). That's something we cannot enforce, so yes it is possible to do something like bug2(PyList_GetItem(list_of_lists, 0)); which could end up invalidating the reference while giving up the GIL. Don't do that...

If you are worried that callers of your C function might pass in a reference they got from calling a function that returns a borrowed reference you have to incref the value before giving up the GIL or calling a C API (and decref again before returning). Note that this is often not a problem, unless you want to be absolutely bullet proof. I've written enough in-house C extensions where I didn't worry about this issue because I knew enough of the rest of the system to know that the borrowed reference wouldn't be invalidated before I was done with it.

Bulletproof manual reference counting is a chore, which is why a lot of people currently advise to use Cython to write extensions, that way you get the write extensions in a Python-like language and don't have to worry about this issue yourself.

And a final note: it is well known that functions returning borrowed references are a (too) sharp edge in the C API, and in hindsight shouldn't have been in the public API (that latter part is my personal opinion, not necessarily the opinion of the project as a whole). See also #86460.

@iritkatriel
Copy link
Member

iritkatriel commented May 12, 2023

The discussion indicates that the issue reported it not clear.

@anamax22 can you show a problem with reference lifetimes in a running program, i.e., a c extension plus a multithreaded python program that clobbers the references to a c function call?

I am marking this as pending and it will be closed after a while if there will be no followup.

@iritkatriel iritkatriel added the pending The issue will be closed if no feedback is provided label May 12, 2023
@iritkatriel iritkatriel closed this as not planned Won't fix, can't repro, duplicate, stale May 31, 2023
@erlend-aasland erlend-aasland removed the pending The issue will be closed if no feedback is provided label Jun 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir topic-C-API
Projects
None yet
Development

No branches or pull requests

5 participants