How to avoid smuggling of data via TLS and scoped-tls and other means #484
Comments
If I understand correctly, the core of the problem is described well by @adamreichold:
Indeed Rust threads are currently tied to OS threads, and TLS makes that visible. What I don't understand is which benefit such an "as-if" thread would have. Regarding the concrete suggestion: threads don't really have any type system interactions. The Rust compiler doesn't even know what a thread is. The issue isn't the type system effects, the issue is the runtime effects, specifically, thread-local state. I think what you are asking is, can we "swap out" the thread ID and thread-local state of a thread temporarily, and then later swap the original state back in. From a purely theoretical perspective this is an entirely reasonable operation and I don't think it would be too hard to implement in Miri. What I don't know is if the way thread-local state works on real OSes makes it possible to have a mechanism like that. I think Rust might be fundamentally restricted here by the APIs that platforms provide. I will note that even without |
Reposting here in a suitably embarrassed state to atone for messing up the list issue:
I think what the PyO3 discussion shows is that when libraries venture into language territory - as PyO3's handling of Python's heap, GIL and associated invariants seems to do - a way to exert more control over the boundaries which make up a thread would be really helpful. For example, something like an "as-if" version of Admittedly, this is also probably not the right issue to discuss this, but I just wanted to highlight that this is not just an unsafe library X versus Y problem, but a more fundamental question of how libraries can put the type system notions currently bound to operating system threads to their own uses. So maybe a rather long-winded +1 to the parent comment...
Note that we are currently leaning towards closing both thread-identity-based soundness loopholes by avoiding the abuse altogether, i.e. by actually starting a new thread, which is admittedly a bit hilarious, cf. PyO3/pyo3#3646 |
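To make the "actually start a new thread" approach concrete, here is a minimal sketch using only the standard library. The helper name `run_isolated` and the `SLOT` thread-local are hypothetical and only serve as illustration; this is not PyO3's actual implementation. It shows that state stashed in thread-local storage by the caller is not visible to a closure that runs on a freshly spawned thread.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Per-thread slot that code could use to pass data out of band.
    static SLOT: Cell<u32> = Cell::new(0);
}

/// Hypothetical helper: runs `f` on a freshly spawned OS thread and waits for it.
/// The `'static` bounds are an assumption that keeps the sketch simple.
fn run_isolated<F, T>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(f).join().expect("isolated closure panicked")
}

/// Stashes a value in the caller's TLS, then reads the slot from an isolated closure.
fn read_slot_isolated() -> u32 {
    SLOT.with(|s| s.set(42));
    // The new thread gets its own, freshly initialized copy of `SLOT`.
    run_isolated(|| SLOT.with(|s| s.get()))
}

fn main() {
    println!("slot seen by isolated closure: {}", read_slot_isolated());
    println!("slot seen by caller: {}", SLOT.with(|s| s.get()));
}
```

The cost is one thread spawn and join per call, which is exactly the overhead an "as-if" spawn would be meant to avoid.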
No worries, I think this was indeed inevitable, the comment box at the bottom is too tempting. I should probably have made this a wiki page or a markdown file but those are more annoying to maintain... |
Our current solution to both soundness loopholes works by actually running the closure passed to So such an "as-if" spawn would allow me to implement that API without the cost of using actual threads because I do not need parallelism, I just want any state tied to the current thread to be unavailable for the duration of the closure. I have not figured out whether this would also help other situations like async runtimes which want to use non-
First, yes, I want to run some closure as if it was executed on a freshly started OS thread without having to pay the cost of actually having to start one. This is what I mean by an "as-if" version of What I meant by the somewhat grandiose "type system interactions" is also basically what you said: TLS and thread ID are the two most obvious ways in which libraries are currently "sabotaged" when they try to create such boundaries because these can be used to "smuggle" data through these boundaries safely. It would be great if there were ways to plug these loopholes at least in such a manner that it would require unsafe code so that libraries can at least fall to the unsafe library X versus Y situation. Not sure if we can get better than that due to interactions with lower level OS facilities, see our old friend
Indeed, we ran into the |
Maybe to add: I don't think just using a different auto trait for each of these boundaries will work (for the current PyO3 design it would definitely not work due to |
Yeah, I get that. Thinking about what Send/Sync mean logically in my formal model of the Rust type system, I think the point is you want some form of "execution context" where Rust ensures that types cannot, in general, be moved between "execution contexts". These contexts could be entirely fictional in nature, not corresponding to anything that happens on the real machine. (In my formal model, we do have exactly one such execution context per physical thread. But there's actually nothing forcing us to do that. We then defined Send/Sync with respect to these execution contexts.) With things like thread_local, it is fairly clear that in Rust, Send and Sync are tied to threads, not this broader idea of an "execution context". That doesn't fundamentally stop us from introducing a concept of abstract execution contexts, but it means we'd need a new set of auto traits, e.g.
This is all theoretically coherent, but the problem is it's a massive breaking change. I think all this gives me a better understanding of the underlying issue, but I think this is only solvable in two ways (even if we had a time machine):
So... it's not looking good. :/ You are asking for a rather big language change, and even with the benefit of hindsight it's unclear if Rust 1.0 could have reasonably done any better. |
I think the FFI angle is actually the biggest problem. Because everything else is already done as part of creating threads, whether in the kernel or in the language runtimes. Whether it is actually so much cheaper as to be worth the hassle is of course an interesting follow-up question. This also somewhat reminds me of the (Also, there were user mode threading libraries on Linux before the whole NPTL thing. Not sure how well that worked without integration with the C library though...) (Concerning the async use case, while always using a new thread might avoid smuggling data, I guess the even better thing would be to "reify" the thread state so that the runtimes could pass it around with the tasks and install it before polling them. This would make thread identity fully programmable. But indeed, a rather hard problem without kernel co-evolution to make it happen at the lowest level.) EDIT: Just to add: Yes, personally, I very much do not want to add |
I am really torn now whether I should feel bad because I am advocating for setjmp/longjmp on the thread level or feel nice because I want the Linux kernel to grow a call/cc. 😺 |
Can you link to a description of why you even care about this kind of data smuggling in the first place? This does strongly feel like an XY problem, ultimately what I think you really want is to reason about the "capability" of having acquired the GIL, or something like that. You are just currently implementing that "capability" in a particular way and then there's problems with that implementation. If Rust had more native support for such "capabilities" then maybe you wouldn't need to worry about data smuggling? |
See also my post here. |
I don’t think that would necessarily be all that bad, if we had a time-machine. |
I think the best write-ups of the status quo for PyO3 are https://pyo3.rs/v0.20.0/types and the documentations of And yes, underlying this is most likely a "lack of capabilities" kind of issue, i.e. we have reified holding the GIL into a zero-sized token type So as @davidhewitt mentioned over at PyO3, if we had language support for "holding the GIL" as a context/capability so that user code would not have to pass the GIL token manually, this problem would likely go away or at least look very different. So in that sense, I agree with the description as an XY problem. But then again, in the long-term, the GIL is apparently on its way out and maybe we should not cater to it too much. And the async runtime use case probably does need more than just context/capability support because it is not just a syntactical problem there. Finally, I think reifying implicit context and exposing it to user code is often useful beyond its immediate applications, so that in summary I think reifying thread identity in Rust is something worth discussing even if PyO3 didn't exist. |
I think this fits well with how @RalfJung is discussing it and how this issue is named. I personally am probably looking for "thread context as a value" or "first class threads", i.e. making the underlying OS service more programmable rather than adding more language to create new kinds of abstractions. (Rust getting even more complex, oh my...) Of course, wanting something does not make it sensible or feasible or actually implement it, so I guess I'll have to take my |
So the idea I posted above has indeed been posted twice before in the last few days; in your post and here. Interesting. :D
I am not sure the lang team would have accepted the cost of two more auto traits. But I guess that's kind of pointless speculation anyway.
Well you are asking for it to get even more complex. ;) You are just starting from the operational semantics side and Steffahn is starting from the type system side. You are talking about the same thing though. All that effect system business comes up because they are trying to retrofit this into an existing language. That only takes care of TLS though, not thread IDs. However, I had a look at the docs, and actually the Rust thread ID is already not the OS thread ID. It's a unique ID generated with a global Rust-private counter. So we could totally say that each "execution context"/"capsule" gets its own thread ID. If we tie that together with the ability to say "this code does not use TLS", then this entire thing could be done without new auto traits -- using the existing Send/Sync traits to bound the closure running in the "as-if" thread, giving that "thread" a new thread ID, and not letting it access TLS at all. (Though not being able to access TLS might be a serious limitation.) We'd have to declare that one may not use OS-level thread identifiers to make conclusions about |
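As a small illustration of the point about Rust thread IDs, the following sketch relies only on the documented behaviour of `std::thread::ThreadId`: it is an opaque value managed by the standard library and is never reused within a process. The helper name `spawned_thread_id` is made up for this example.

```rust
use std::thread::{self, ThreadId};

/// Returns the Rust-level `ThreadId` observed inside a freshly spawned thread.
fn spawned_thread_id() -> ThreadId {
    thread::spawn(|| thread::current().id())
        .join()
        .expect("thread panicked")
}

fn main() {
    let here = thread::current().id();
    let there = spawned_thread_id();
    // `ThreadId` is opaque; it can be compared and printed for debugging,
    // but it does not expose the OS-level thread identifier.
    println!("caller: {here:?}, spawned: {there:?}");
    assert_ne!(here, there);
}
```

Because the ID is already assigned by Rust rather than by the OS, handing a fresh one to an "as-if" thread would not contradict what the standard library currently promises about `ThreadId`.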
Not a coincidence that it was "twice" - the IRLO thread was a comment thread to matklad's post. Even though I've described mostly ideas I've had for myself a few months ago, I've expressed them there now because it seemed like a highly similar / related idea. I had also opened the new |
I am not sure. In my little head, the language does not change at all and I just get a function to replace the current thread ID/TLS block, putting it behind an opaque handle and atomically installing a new thread ID/TLS block on the current OS thread, something like
In that thread-identity-can-be-replaced world, I think one would not need to prohibit TLS as its scope could be controlled, e.g. the TLS handle could be installed by the async runtime to follow a task as it migrates between OS threads. Or PyO3 could always install a fresh one to close any unwanted side channels. By the way, the |
This looks somewhat related as well: rust-lang/rust#117873 |
(For the async runtime use case, I think this could even become an optional effort/optimization: |
Ah, I see. I think Steffahn was working under the assumption that that's not possible with reasonable performance. This is waay outside my area of expertise so I can't really say whether it is realistic or not. Remember this needs to be supported on all platforms Rust compiles to. |
Having had time to read that now, I think what I am trying to propose is also hinted at in question 3 of @matklad's https://matklad.github.io/2023/12/10/nsfw.html#Four-Questions |
In the Time Machine category, I think there’s a third self-consistent way:
That’s drastic, but, given that thread locals are a niche feature (there are like entire languages without thread locals), and that “fixing” Send removes the need to track sendness of futures and thus the need to have return type notation and such, I would be ready to vehemently argue for this setup in 2014! Will tackle that right after the Hawking party, if I ever come by a Time Machine! |
For Rust to be a viable alternative to C or C++, it absolutely must support TLS. Even rustc itself uses it. So that is not an option I think.
|
A bare bones |
That would not have helped, you can implement "lazy TLS" on top of barebones TLS. You need something slightly more tricky anyway to avoid references outliving the current thread.
|
Ah you need destructors yes. But you do need those I think, after all TLS gets deallocated. Regular statics avoid that issue by virtue of never being deallocated.
|
I’d say that’s an argument in favor of bare-bones unsafe, but zero cost thread locals ^^ Rust thread_local macro is safe, but it comes with runtime overhead. Up until very recently (when we added const-initialization to thread locals) it was impossible to implement a performant global allocator in Rust, because, for an allocator, you need fast TLS: https://matklad.github.io/2020/10/03/fast-thread-locals-in-rust.html |
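For readers unfamiliar with the const-initialization mentioned above, here is a minimal sketch of the two forms of `thread_local!`. The names `LAZY_COUNTER`, `CONST_COUNTER`, `initial_value` and `bump` are made up for this example, and no performance claim is made beyond what the comment above says.

```rust
use std::cell::Cell;

// Stand-in for a non-const initializer.
fn initial_value() -> u64 {
    0
}

thread_local! {
    // Lazily initialized: the initializer runs on first access in each thread.
    static LAZY_COUNTER: Cell<u64> = Cell::new(initial_value());
    // Const-initialized: the initial value is known at compile time,
    // so no lazy-initialization step is needed on access.
    static CONST_COUNTER: Cell<u64> = const { Cell::new(0) };
}

/// Increments the const-initialized per-thread counter and returns the new value.
fn bump() -> u64 {
    CONST_COUNTER.with(|c| {
        let next = c.get() + 1;
        c.set(next);
        next
    })
}

fn main() {
    println!("lazy: {}", LAZY_COUNTER.with(|c| c.get()));
    println!("const after bump: {}", bump());
}
```

Both forms are safe to use; the `const { ... }` form only restricts the initializer to a constant expression.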
Even if they are unsafe they need a reasonable safety contract. So I don't think that would have helped, we would not have included |
A slight alternative to the Adding these traits would only break crates that rely on thread local dynamic types since all existing concrete (non-dynamic) types would already implement Another hacky idea that doesn't rely on language changes would be to lean into the idea that non I also thought of more use cases for
#[repr(transparent)]
pub struct ScopeCell<T: CtxSend>(UnsafeCell<T>, PhantomNotCtxSync);

impl<T: CtxSend> ScopeCell<T> {
    pub fn new(t: T) -> Self {
        ScopeCell(UnsafeCell::new(t), PhantomNotCtxSync)
    }

    pub fn with_mut<U>(&self, f: impl FnOnce(&mut T) -> U + CtxSend) -> U {
        // Safety:
        // the closure cannot capture a copy of `self` since it implements `CtxSend`, but `&Self` doesn't
        // `self` cannot contain a self reference since `T` implements `CtxSend` but `&Self` doesn't
        // a copy of `self` cannot be taken from thread local memory since `&Self` doesn't implement `CtxSend`
        //
        // Since it is impossible for the call to `f` to access a copy of `self`,
        // it can't call `with_mut` again to get an aliasing mutable reference
        unsafe { f(&mut *self.0.get()) }
    }
}
Are there any other reasons this would be unsound? |
It seems like there are several cases where APIs would be sound if Rust didn't have thread-local state. scoped-tls expands on that by allowing non-'static types to be stored in TLS.
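A minimal sketch of the kind of smuggling meant here, using only the standard library's `thread_local!` (which is limited to `'static` types). The function `call_with_send_bound` is a hypothetical stand-in for an API that tries to keep non-`Send` data away from a closure by requiring `Send`, yet still runs the closure on the calling thread.

```rust
use std::cell::RefCell;
use std::rc::Rc;

thread_local! {
    // Side channel: a per-thread stash that can hold a non-`Send` value.
    static STASH: RefCell<Option<Rc<String>>> = RefCell::new(None);
}

/// Hypothetical API: requires `F: Send` in the hope that `f` cannot reach
/// non-`Send` data, but runs `f` on the current thread.
fn call_with_send_bound<F, T>(f: F) -> T
where
    F: FnOnce() -> T + Send,
    T: Send,
{
    f()
}

/// Passes an `Rc` (which is not `Send`) to the closure via thread-local storage.
fn smuggle() -> usize {
    let token = Rc::new(String::from("not Send"));
    STASH.with(|s| *s.borrow_mut() = Some(Rc::clone(&token)));

    // The closure captures nothing, so it is trivially `Send`,
    // yet it can still reach the `Rc` through the thread-local.
    let len = call_with_send_bound(|| {
        STASH.with(|s| s.borrow().as_ref().map_or(0, |rc| rc.len()))
    });

    // Clear the stash again so the `Rc` does not linger in TLS.
    STASH.with(|s| s.borrow_mut().take());
    len
}

fn main() {
    println!("length read through the side channel: {}", smuggle());
}
```

Nothing here is undefined behaviour on its own; the point is only that a `Send` bound does not stop safe code from reaching thread-bound data when the closure runs on the same thread.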