Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe initialization of pinned structs #290

Closed
nbdd0121 opened this issue May 25, 2021 · 19 comments
Closed

Safe initialization of pinned structs #290

nbdd0121 opened this issue May 25, 2021 · 19 comments
Labels
• lib Related to the `rust/` library.

Comments

@nbdd0121
Copy link
Member

The kernel has a lot of data structures that cannot be freely moved, e.g. Mutex and CondVar. Currently what we do is to first unsafely create instance of them, pin them, and then unsafely initialize them after having them pinned (currently the code does not require unsafe to initialize Mutex, which I believe is a bug, as it could be initialized while locked, which is definitely unsound).

For example, this snippet (related #286) from miscdev needs 3 unsafes (5 if we count both inits):

struct SharedState {
state_changed: CondVar,
inner: Mutex<SharedStateInner>,
}
impl SharedState {
fn try_new() -> Result<Pin<Arc<Self>>> {
// SAFETY: `state` is pinning `Arc`, which implements `Unpin`.
let state = unsafe {
Pin::new_unchecked(Arc::try_new(Self {
// SAFETY: `condvar_init!` is called below.
state_changed: CondVar::new(),
// SAFETY: `mutex_init!` is called below.
inner: Mutex::new(SharedStateInner { token_count: 0 }),
})?)
};
// SAFETY: `state_changed` is pinned behind `Arc`.
let state_changed = unsafe { Pin::new_unchecked(&state.state_changed) };
kernel::condvar_init!(state_changed, "SharedState::state_changed");
// SAFETY: `inner` is pinned behind `Arc`.
let inner = unsafe { Pin::new_unchecked(&state.inner) };
kernel::mutex_init!(inner, "SharedState::inner");
Ok(state)
}
}

And this is a very common pattern. In the current state, essentially any use of CondVar, Mutex, Spinlock or any pinned types require a lot of unsafe to initialize. Of course we can have these types Box-ed internally but that's many unnecessary allocations.

So I spent the past few days designing and implementing a safe way to pin-initialized struct, pin-init (doc, repo). This design allows safe initialization and use of pthread mutex without any boxing: https://github.com/nbdd0121/pin-init/blob/trunk/examples/pthread_mutex.rs.

Kernel CondVar & Mutex could be implemented in a similar way. With pin-init, the above snippet could be written like this:

#[pin_init]
struct SharedState {
    #[pin]
    state_changed: CondVar,
    #[pin]
    inner: Mutex<SharedStateInner>,
}

impl SharedState {
    fn try_new() -> Result<Pin<Arc<Self>>> {
        try_new_arc(init_pin!(Self {
            state_changed: |s| kernel::condvar_new!(s, "SharedState::state_changed"),
            inner: |s| kernel::mutex_new!(s, "SharedState::inner", SharedStateInner { token_count: 0 }),
        }))
    }
}

Approaches like this make code more readable and safer; but it requires procedural macro (and parsing capability of syn) so it might not be easy to integrate at current stage.

@ojeda ojeda added prio: normal • lib Related to the `rust/` library. labels May 25, 2021
@ojeda
Copy link
Member

ojeda commented May 25, 2021

Related rust-lang/rust#85579 by @alex.

@nbdd0121
Copy link
Member Author

Actually not quite so related :) pin-init ensures that an object is pinned even before is constructed -- with Arc::try_pin, you only get rid of one unsafe, but the other 4 remains.

@ojeda
Copy link
Member

ojeda commented May 25, 2021

Related != equivalent :) Both improve/simplify the Result<Pin<Arc<Self>>> initialization. I was not implying we need to go with one or the other.

@TheSven73
Copy link
Collaborator

TheSven73 commented May 25, 2021

Currently what we do is to first unsafely create instance of them, pin them, and then unsafely initialize them after having them pinned (currently the code does not require unsafe to initialize Mutex, which I believe is a bug, as it could be initialized while locked, which is definitely unsound).

I've got to admit that, coming to this project from scratch, I was scratching my head a little bit when I saw these abstractions. I'm sure there's a reason why they are the way they are, I'm hoping to learn why.

Maybe #289 indicates that even we couldn't use this safely inside a simple driver? If we could not, what hope do driver developers new to Rust have?

@nbdd0121 this looks like a creative solution. Possibly only a partial solution though? Because people could still make structs with #[pin] annotations, then actually forget to wrap them inside a pinned pointer. And then the mutex or condvar data could still move out, and the code could still mushroom cloud?

I remember having an interesting conversation with @ojeda where he mentioned that in Rust, a little bit of inefficiency is acceptable if it allows us to improve safety. I wonder if we could make a Mutex or CondVar constructor that's entirely safe, but slightly less efficient? Something like this?

fn new_mutex(data: T) -> Result<Pin<Box<Mutex<T>>>>

Then driver authors wouldn't need any unsafe blocks. They also wouldn't have to pin their own structures, nor think deeply about pinning. They could also not move the Mutex out of the Pin<Box> accidentally, because Mutex is !Unpin.

Penalty is one extra allocation per mutex-protected structure. May be worth it in many circumstances?

Could be you folks have been here before, and the new guy is just going around in circles :)

@nbdd0121
Copy link
Member Author

I've got to admit that, coming to this project from scratch, I was scratching my head a little bit when I saw these abstractions. I'm sure there's a reason why they are the way they are, I'm hoping to learn why.

It's indeed weird! The fundamentally problem is that currently in Rust you don't have a way to create a type pinned -- you always need to create it and then pin it. E.g. Box::pin or Arc::pin requires you to have T first and gives you a Pin<Box<T>>, but for types like CondVar and Mutex, they must be pinned as soon as they are initialized, which means it'll be unsafe to create CondVar or Mutex.

Maybe #289 indicates that even we couldn't use this safely inside a simple driver? If we could not, what hope do driver developers new to Rust have?

That's a major problem. unsafe code can be hard to get right, and that's why I hope to have a safe abstraction.

@nbdd0121 this looks like a creative solution. Possibly only a partial solution though? Because people could still make structs with #[pin] annotations, then actually forget to wrap them inside a pinned pointer. And then the mutex or condvar data could still move out, and the code could still mushroom cloud?

No! The main merit of pin-init is that it is completely safe without user writing extra unsafe stuff. If you look at the signature of new_arc:

pub fn new_arc<T, E, F>(f: F) -> Result<Pin<Arc<T>>, E> 
where
    F: for<'a> FnOnce(PinInit<'a, T>) -> PinInitResult<'a, T, E>,

you can see that the return type is already pinned. In fact, the whole design about PinInit and PinInitResult is to ensure that the user can never obtain T or &mut T. If the user cannot obtain CondVar, then they have no way to construct SharedState safely themselves, and can only use init_pin! to safely create a Pin<SomePointer<T>>.

I remember having an interesting conversation with @ojeda where he mentioned that in Rust, a little bit of inefficiency is acceptable if it allows us to improve safety. I wonder if we could make a Mutex or CondVar constructor that's entirely safe, but slightly less efficient? Something like this?

fn new_mutex(data: T) -> Result<Pin<Box<Mutex<T>>>>

Then driver authors wouldn't need any unsafe blocks. They also wouldn't have to pin their own structures, nor think deeply about pinning. They could also not move the Mutex out of the Pin<Box> accidentally, because Mutex is !Unpin.

Penalty is one extra allocation per mutex-protected structure. May be worth it in many circumstances?

I don't think kernel people will like it; an allocation required for creating each !Unpin type is one drawback, but the major issue is the cache penalty due to the indirection it brings each time such types are used.

With pin-init you can see use box if you don't want to pin the outer struct:

impl<T> Mutex<T> {
    fn new(this: PinInit<'a, Self>, data: T) -> PinInitResult<'a, Self, kernel::Error> { /* ... */ }
}

try_new_box(|s| Mutex::new(s, data)) // Result<Pin<Box<Mutex<T>>>>
try_new_arc(|s| Mutex::new(s, data)) // Result<Pin<Arc<Mutex<T>>>>

@ojeda
Copy link
Member

ojeda commented May 26, 2021

Yeah, extra allocations are quite bad, and more than just a little bit of inefficiency. :(

@TheSven73
Copy link
Collaborator

Yeah I agree with you guys on this one. Gary's solution looks like the way forward. I'm still learning when it comes to pinning...

@alex
Copy link
Member

alex commented May 26, 2021

@TheSven73 if you haven't seen it yet, https://fasterthanli.me/articles/pin-and-suffering is the best resource I've read

@TheSven73
Copy link
Collaborator

Thank you @alex! I really need to understand pinning properly - the next stage of the hwrng Rust driver will depend on it.

@TheSven73
Copy link
Collaborator

@nbdd0121 How does your pin-init solution relate to pin-project ?

@nbdd0121
Copy link
Member Author

pin-project addresses how you can idiomatically and safely use pinned structs, and pin-init addresses how you can idiomatically and safely create them. I've designed pin-init so that it can be used together with pin-project.

@TheSven73
Copy link
Collaborator

Thanks! I think I'm agreeing with you that our current Mutex and CondVar abstractions are problematic - both from a "sound initialization" and "need for unsafe pin projections inside driver code" point of view. But there doesn't seem to be any group consensus right now to change anything. Perhaps that will change after the next round of LKML feedback.

I believe Wedson is adding a work-around for Mutex/CondVar init in #377. It only works if the Mutex/CondVars are wrapped inside a Ref<T>, though.

@wedsonaf
Copy link

What do you all think of wedsonaf@a369dcd? (It's a WIP, so cleanups are name changes are still needed.) Anyway, using it the example above becomes:

impl SharedState {
    fn try_new() -> Result<Pin<Ref<Self>>> {
        Ok(Ref::pinned(kernel::new_ref!(SharedState {
            [condvar] state_changed: (),
            [mutex] inner: SharedStateInner { token_count: 0 },
        })?))
    }
}

One problem that isn't solved by this yet is drop: implementers could move a condvar/mutex/spinlock and break safety. (They can already do that, but it currently requires unsafe to do structural pinning, so it is a requirement of that unsafe block; with this macro there is no explicit unsafe that we could use to remind the user not do it.)

@TheSven73
Copy link
Collaborator

This is definitely going in the right direction IMHO. But I'm not sure if this is broad enough: it would only solve our Pinning problems for Ref<T>. And so driver writers still need to use unsafe blocks, reason about pinning, etc. This seems like a case where a partial solution is really no solution at all, but I could be mistaken.

This goes for my PR #376 as well, by the way: it's not a good solution, because it's not broad enough. I've had to change my opinion there.

@nbdd0121 I still don't quite understand how your pin-init differs from a crate such as pin-project-lite? Could our Pinning problem be solved with pin-project-lite as well? If so, could we benefit from a community maintained crate which we could sync from time to time, to get the latest bugfixes? Just thinking out loud here.

@nbdd0121
Copy link
Member Author

What do you all think of wedsonaf@a369dcd? (It's a WIP, so cleanups are name changes are still needed.) Anyway, using it the example above becomes:

impl SharedState {
    fn try_new() -> Result<Pin<Ref<Self>>> {
        Ok(Ref::pinned(kernel::new_ref!(SharedState {
            [condvar] state_changed: (),
            [mutex] inner: SharedStateInner { token_count: 0 },
        })?))
    }
}

Looks great but as @TheSven73 said this is a little bit limited. Two major limitations that I can current think of:

  • data structure cannot be recursively initialized this way
  • this only works for Ref<Self>

One problem that isn't solved by this yet is drop: implementers could move a condvar/mutex/spinlock and break safety.

This is actually a significant issue. The macro assumes everything is structurally pinned, but structural pinning need safe guarding to prevent arbitrary implementation Unpin and Drop. pin-project provides such safe guard via pinned_drop attribute macro: https://docs.rs/pin-project/1.0.7/pin_project/attr.pinned_drop.html.


@nbdd0121 I still don't quite understand how your pin-init differs from a crate such as pin-project-lite? Could our Pinning problem be solved with pin-project-lite as well? If so, could we benefit from a community maintained crate which we could sync from time to time, to get the latest bugfixes? Just thinking out loud here.

As I said earlier, they solve totally different problems and are designed to be used together. pin-init allows you to safely create pinned structs, and pin-project (or pin-project-lite) allows you to safely use them. If we have pin-project in tree today, we still cannot safely create a SharedState.


@wedsonaf just in case you haven't seen my comment in #376 (proof of concept commit at nbdd0121@549b8e3), here's how you create SharedState with pin-init:

impl SharedState {
    fn try_new() -> Result<Pin<Arc<Self>>> {
        Ok(Arc::try_pin_with(init_pin!(Self {
            state_changed: kernel::condvar_new!("SharedState::state_changed"),
            inner: kernel::mutex_new!("SharedState::inner", SharedStateInner { token_count: 0 }),
        }))?)
    }
}

The POC is written before Ref so it uses Arc.

@TheSven73
Copy link
Collaborator

As I said earlier, they solve totally different problems and are designed to be used together. pin-init allows you to safely create pinned structs, and pin-project (or pin-project-lite) allows you to safely use them. If we have pin-project in tree today, we still cannot safely create a SharedState.

Thanks for explaining. pin-project[-lite] is mainly used together with Rust asynchronous code, right? How do the existing users of pin-project solve the creation issue? Or does the problem not manifest itself there?

@bjorn3
Copy link
Member

bjorn3 commented Jun 28, 2021

Most users don't actually create pinned types. Most just write async fn which internally pins the values you .await. Pretty much only runtimes (like tokio, async-std, ...) and future combinators (like in the futures crate) really need to interact with pinning themself.

As for the drop problem, pin-project prevents you from implementing Drop yourself by implementing inside the macro. Instead you have to do something like the following (taken from the docs):

#[pin_project(PinnedDrop)]
struct PrintOnDrop {
    #[pin]
    field: u8,
}

#[pinned_drop]
impl PinnedDrop for PrintOnDrop {
    fn drop(self: Pin<&mut Self>) {
        println!("Dropping: {}", self.field);
    }
}

@nbdd0121
Copy link
Member Author

Pinning is introduced to Rust to solve self-references in async functions.

async fn foo() {
	let x = 1;
	let y = &x; // <- This `y` is used across `.await` so it needs to be stored in the future, but it references `x` so it's a self reference.
	bar().await;
	let z = y;
}

Async functions, however, will not do anything unless polled for the first time. So when a Future is created, nothing is executed yet, so there are not yet any self-references. It can therefore be moved freely. When it's polled for the first time, Pin<&mut Self> is given, so it can thus create self-references.

Our use case, however, is different from async fn; we require the struct to be pinned at creation rather than pinned after creation.

@y86-dev
Copy link
Member

y86-dev commented Jul 25, 2024

Stale issue: we have the pinned-init solution in-tree.

@y86-dev y86-dev closed this as completed Jul 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
• lib Related to the `rust/` library.
Development

No branches or pull requests

7 participants