try_init breaks assumptions about pythonic module imports #29
Thanks for bringing this to my attention! I've been thinking about some of these problems in #26, but you've also brought up some more considerations that I wasn't aware of.
According to the docs, this will get the event loop associated with the current OS thread, so I assume it would work with event loops other than the default (main-thread) event loop. The problem with
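For reference, asyncio's per-thread loop behaviour can be sketched in plain Python. This is independent of this library; the thread and function names below are just for illustration:

```python
import asyncio
import threading

# Each OS thread has its own "current" event loop slot. The main thread
# may get one created implicitly; other threads must set one explicitly.

async def compute():
    await asyncio.sleep(0)
    return 42

def run_in_fresh_thread():
    """Create and install a loop on a non-main thread, then use it."""
    result = {}

    def worker():
        # get_event_loop() would raise here (no loop is set on this
        # thread), so we create and register one explicitly.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            result["value"] = loop.run_until_complete(compute())
        finally:
            loop.close()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["value"]

print(run_in_fresh_thread())  # → 42
```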
I think the only way to account for this single event loop assumption is by calling
Yeah, this is a problem. I think the
My interpretation of those docs was that it would work on any thread, the exception being that it would not attempt to initialize the event loop unless the calling thread was also the main thread. One problem is that we can't rely on it to get the relevant event loop from a Rust thread, so we'll most likely have to add a

So in short, I think we need to make the following changes:
Open questions:
Thank you for leaving a candid response, sorry if my tone was otherwise harsh or rude.
Yes. As this problem is mainly focused on the Python side effects: initialisation limited to a module's "boundaries" during import happens all the time. Unless the module starts actively running work in the background that steals the GIL or CPU cycles while idle (which can be frowned upon), this is entirely fine as far as Python is concerned.
I'd say some caching to preserve performance is in order, but once an edge-case situation is detected (multiple event loops in this case), the library should enter a "bailout" mode, where it regresses to a less performant (but more correct) state.
Maybe not; it'll help locality. It may be better to provide some way to reference the Python event loop via a "context" variable of sorts. Maybe that is not exactly what you want, but for correctness' sake, it could be included. With the way asyncio works, you can technically never be sure that, at the time of calling a function to be converted into a coroutine, you can re-use an old event loop. For the sake of sanity though, I think that as long as any coroutine on the originating event loop's thread shares the same event loop, that event loop can be assumed to not have dropped. So containing a

This is sound, as the top-level call on a thread containing a new event loop will be

I don't know what to do with the Rust side of things once an event loop finishes; some investigation needs to be done into that. I expect UB, but unless this affects how Rust must respond after a coroutine gets dropped in general (so when an event loop gets stopped long before the program exits), I don't see this being a major issue.

I'll be willing to discuss this further on gitter/matrix whenever you have the time though, as I know some details can get lost between the cracks, and I would like to work on making this more correct.
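The "context variable" idea could look something like this on the Python side, using `contextvars`. All names here are hypothetical, not part of any existing API:

```python
import asyncio
import contextvars

# Sketch: each conversion captures the loop it was created under via a
# ContextVar, so coroutines spawned from the same context agree on one
# loop without consulting any global cache.

origin_loop = contextvars.ContextVar("origin_loop")

def capture_loop():
    """Record the currently running loop in the ambient context."""
    origin_loop.set(asyncio.get_running_loop())

async def uses_captured_loop():
    # Any coroutine awaited in the same context sees the same loop.
    return origin_loop.get() is asyncio.get_running_loop()

async def main():
    capture_loop()
    return await uses_captured_loop()

print(asyncio.run(main()))  # → True
```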
No worries, I didn't get that impression. This library kinda spawned out of my use-case, which was a Rust application using some async Python. It turns out the reverse is a bit more complicated, but arguably more important to the community, so I'm always happy to get feedback from this perspective.

As for the implementation, the caching mechanism has the potential to get really complicated, and I'm not entirely sure which parts are going to be the most important to optimize. I think we should focus on correctness first, make some benchmarks for the conversions, then target the most problematic areas for performance. It might also be a good idea to outline some of the use-cases we think are going to be most common so we can be sure our design accounts for them: for example, custom event loops, multiple event loops, multiple native modules, etc.
I'd like to keep most of the discussion of the design / implementation either in this thread or #26. That way, if anyone else is searching for this or has some thoughts on it, they can read up on the rationale / progress and join in on the convo whenever. We can use gitter/matrix/discord to chat about the details whenever though. Whatever works best!
That sounds good; the "correct" way here would be to run
The problems and solutions I raised in this issue address the first two, but I don't think there's a good solution for the latter.

Some context: a while back I tried to do this exact thing, try to get Rust async working from Python async, in some manner similar to

However, that'll also mean that if two native extensions try to do the same thing, by running an event loop aside from Python, they'll each be spawning their own event loop, from their own compiled code, effectively creating duplication at runtime and at compile time. This is unavoidable, I've decided, as you basically cannot guarantee that the code compiled by extension A will be compatible with extension B; minor version or compiler differences can crop up, and I don't think the effort of trying to match that and effectively use another dynamic library's async executor is a good idea. Much UB can arise out of it, and I think it's easier (and safer) to just isolate each extension from the others, both at runtime and at compile time.

This can result in some pressure on making the runtimes smaller; runtimes like smol can address that, although I have no idea if its usage is as simple as dropping it into this library.
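The isolation argument can be mimicked in pure Python: two independent runtimes, each owning a private loop on its own thread, with no shared executor state. This is only an analogy for what two isolated native extensions would each do; `ExtensionRuntime` is invented for illustration:

```python
import asyncio
import threading

class ExtensionRuntime:
    """A self-contained runtime, analogous to one extension's private executor."""

    def __init__(self, name):
        self.name = name
        self.loop = asyncio.new_event_loop()
        # The runtime's loop lives on its own thread, untouched by others.
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        # Submit work to this runtime's private loop from any thread.
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result()

    def shutdown(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()

async def which_loop():
    return asyncio.get_running_loop()

ext_a = ExtensionRuntime("a")
ext_b = ExtensionRuntime("b")
# Duplication at runtime, but no shared executor state between the two:
assert ext_a.run(which_loop()) is not ext_b.run(which_loop())
ext_a.shutdown()
ext_b.shutdown()
```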
Thanks, I'm currently sitting in the PyO3 lobby on the matrix side as
Ah ok, I wondered if this would be an issue back when I was writing up #26. Maybe this'll be possible sometime in the future when async runtimes are more mature and Rust has a relatively stable ABI. In the meantime, this caveat might deserve a place in the docs.
The traits in
No problem, school comes first! I probably won't have much time to work on it this weekend, but I'll try to get started on some of these changes at some point next week. I'll post updates as I go, but there's really no deadline for this, so don't worry about responding or contributing unless you know you have the time.
I think with #30 merged this issue can be closed. We can open a follow-up issue at some point in the near future. I know you and I both probably have some thoughts on how to improve this library further, but I kinda want to take a step back for a few weeks to wait and see if there are any unanticipated problems that users start running into with the
Thanks so much!
It induces side effects; this is a huge no-no for Python: `get_event_loop`, capturing a reference to a global scope. Python analysis tools strongly push to not intermix module imports and sequential code; the wide consensus is that no side effects must take place during import. `try_init` breaks this rule by assuming the application it is in will only have a single event loop, and that it will only have the event loop that is the original as reported by `asyncio` on import.

I recommend changing the code to at least only consider `get_event_loop()` at the time of calling asyncio-specific behaviour; no assumptions must be made in between asyncio-related calls on threads, same and different, that the asyncio event loop remains the same.

Additionally, according to the asyncio docs, `get_event_loop` only has behaviour when imported on the main thread. I can't confirm it, but I have a suspicion it'll break when it is called from another thread (and no event loop exists).
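The difference between capturing a loop at import time and resolving it per call can be sketched as follows. These are hypothetical functions, not the actual `try_init` code, and `new_event_loop()` stands in for whatever loop happened to be current at import:

```python
import asyncio

# Anti-pattern: capture a loop reference when the module is imported.
# Whatever loop is "current" at import time is frozen in forever.
_IMPORT_TIME_LOOP = asyncio.new_event_loop()

def schedule_badly(coro):
    # Wrong loop whenever the caller's loop differs from the import-time one.
    return _IMPORT_TIME_LOOP.create_task(coro)

# Recommended pattern: resolve the loop at the time of the asyncio call.
def schedule_correctly(coro):
    return asyncio.get_running_loop().create_task(coro)

async def noop():
    return None

async def demo():
    task = schedule_correctly(noop())
    await task
    # The task landed on the caller's own running loop.
    return task.get_loop() is asyncio.get_running_loop()

print(asyncio.run(demo()))  # → True
_IMPORT_TIME_LOOP.close()
```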