-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better allocation signatures? #26
Comments
Thanks for looking into this. I would love to find ways to remove refcell or improve the code. I don't see a way to get rid of the |
Hmm, Perhaps one can use HRTBs to get rid of rebind? we tag ContextWrapper with a 'b which changes on gc. we only allow objects to be accessed given a ContextWrapper of the same tag. struct TokenInit<'b> {
marker: PhantomData<fn(&'b ()) -> &'b ()>,
}
struct BrandedObject<'b, T> {
x: T,
marker: PhantomData<fn(&'b ()) -> &'b ()>,
}
struct ContextWrapper<'ob, 'rt, 'b> {
cx: &'ob mut Context<'rt>,
marker: PhantomData<fn(&'b ()) -> &'b ()>,
}
fn alloc<'b, T>(cx: &ContextWrapper<'_, '_, 'b>) -> BrandedObject<'b, T> {
todo!()
}
fn borrow_mut<'a, 'b, T>(
obj: &'a mut BrandedObject<'b, T>,
cx: &'a ContextWrapper<'_, '_, 'b>,
) -> RefMut<'a, T> {
todo!()
}
fn garbage_collect<'ob, 'rt, 'b, 'c>(
cx: ContextWrapper<'ob, 'rt, 'b>,
init: TokenInit<'c>,
) -> ContextWrapper<'ob, 'rt, 'c> {
todo!()
}
fn with_token<T, F: for<'b> FnOnce(TokenInit<'b>) -> T>(f: F) -> T {
f(TokenInit {
marker: PhantomData,
})
}
fn function_that_gcs<'ob, 'rt, 'b, 'c>(
obj: BrandedObject<'b, i32>,
init: TokenInit<'c>,
cx: ContextWrapper<'ob, 'rt, 'b>,
) -> (BrandedObject<'c, i32>, ContextWrapper<'ob, 'rt, 'c>) {
todo!()
}
trait Branded<'b> {
type Type;
}
fn loop_with_token<'rt, 'ob_outer, 'b_outer, 'c_outer, B, F>(
cx: ContextWrapper<'ob_outer, 'rt, 'b_outer>,
init_token: TokenInit<'b_outer>,
init: <B as Branded<'b_outer>>::Type,
f: F,
) -> (
ContextWrapper<'ob_outer, 'rt, 'c_outer>,
<B as Branded<'c_outer>>::Type,
)
where
B: for<'b> Branded<'b>,
F: for<'b, 'c, 'ob> Fn(
TokenInit<'c>,
ContextWrapper<'ob, 'rt, 'b>,
<B as Branded<'b>>::Type,
) -> (ContextWrapper<'ob, 'rt, 'c>, <B as Branded<'c>>::Type),
{
todo!()
} This has the bonus that we can pass non-rooted objects as arguments. Rather verbose though. Edit: must use invariant marker |
You will have to help me break this down. I have a few questions. When you say reduce pointer indirection, where are you talking about specifically? Objects themselves are tagged pointers that don't have any indirection. Rooted objects do have a level of indirection, but that is because it allows us to implement a moving collector. We can (and I plan to) change the As far as HRTB's are concerned. I am try to see where those would be helpful. One of the key features of the current approach is that we can statically prove that all roots are accounted for. This is because all objects have their lifetimes tied to But you do present a really interesting alternative to let x = rebind!(function_that_gcs(cx), cx) into this: let (cx, x) = function_that_gcs(cx) I honestly had not thought of that approach before. I wonder if there are corner cases that would make it harder to work with. Using your approach would both get rid a macro and some gnarly unsafe code. |
Note: above markers fixed to be invariant HRTB is used to enforce safety. fn function_that_gcs<'ob, 'rt, 'b, 'c>(
obj: &BrandedObject<'b, i32>,
init: TokenInit<'c>,
cx: ContextWrapper<'ob, 'rt, 'b>,
) -> (BrandedObject<'c, i32>, ContextWrapper<'ob, 'rt, 'c>) {
todo!()
}
fn function_that_gcs2<'ob, 'rt, 'b, 'c>(
obj: BrandedObject<'b, i32>,
init: TokenInit<'c>,
cx: ContextWrapper<'ob, 'rt, 'b>,
) -> (BrandedObject<'c, i32>, ContextWrapper<'ob, 'rt, 'c>) {
with_token(|init2| {
let mut obj = obj;
borrow_mut(&mut obj, &cx); //Ok
let (mut obj2, cx) = function_that_gcs(&obj, init2, cx);
borrow_mut(&mut obj2, &cx); //Ok
// borrow_mut(&mut obj, &cx); //error[E0521]: borrowed data escapes outside of closure
function_that_gcs(&obj2, init, cx)
})
} so you would need a &context to actually access the fields of obj.
The way this works is that we force the new cx to be tied to the TokenInit which we only allow to generate through a HRTB ( By pointer indirection, I meant allocation performance, i.e. one hopefully don't need to walk 3 pointers to get the Block. Sorry this wasn't clear |
Thanks. I was able to test your example, and it works like you said. I still don't understand what this approach is trying to accomplish. Is the sole purpose to reduce pointer indirection to the |
Sorry if this wasn't clear, with that I wasn't trying to accomplish anything particular; rather was an experiment to try to not have
I don't think the advantages outweigh having to make a closure for every call, but it might be useful if we find a way to achieve the same without closures. |
also, maybe unrelated question: why cant Context own RootSet? |
Ah, that makes more sense.
Because stack roots need a reference to the Lines 113 to 116 in db0ac8a
If the
Unfortunately transmute does not work here because we don't want to change the lifetime itself, we want to disassociate two lifetimes so that the mutable borrow is can be dropped. Your original strategy of passing the
That would be big if we could find some way to accomplish that. However I wouldn't want to sacrifice a lot of code readability or idomaticness to do so. |
I see, I think this is sensible now.
Its possible to transmute through 'static like so: pub(crate) unsafe trait Rebindable<'b> {
// SAFETY:
// Type must not contain any field of type &'b mut T
type Type;
}
unsafe impl<'b, T: Rebindable<'b>> Rebindable<'b> for Gc<T> {
type Type = Gc<<T as Rebindable<'b>>::Type>;
}
pub(crate) unsafe fn unbind_for_rebind<'b, T>(
val: <T as Rebindable<'b>>::Type,
) -> <T as Rebindable<'static>>::Type
where
T: for<'c> Rebindable<'c>,
{
mem::transmute(val)
}
pub(crate) unsafe fn rebind_by_ref<'c, T, R>(
_cx: &'c T,
val: <R as Rebindable<'static>>::Type,
) -> <R as Rebindable<'c>>::Type
where
R: for<'b> Rebindable<'b>,
{
mem::transmute(val)
}
#[macro_export]
macro_rules! rebind_new {
($type:ty, $cx:ident, $value:expr) => {{
let value = $value;
unsafe {
let value_as_static = $crate::core::gc::unbind_for_rebind::<$type>(value);
$crate::core::gc::rebind_by_ref::<_, $type>($cx, value_as_static)
}
}};
}
// example:
// let obj = rebind_new!(Gc<RebindObj>, cx, self.implicit_progn(iter, cx)?); I believe one can make this even safer by making a closure in the macro which makes unbind_for_rebind only transmute the lifetime that comes from the input
Unfortunately I don't think this accomplishes anything on its own. If we return something that contains &mut Context, we cannot also return any objects that borrows from it. The HRTB only works due to using types instead of borrowing to enforce safety.
Agree. |
I think I have a way to (cleanly) allow non-rooted objects as arguments now. We can try to pass around contexts that contains objects: struct ContextWrapper<'ob, 'rt, T> {
// or maybe just an owned Context?
cx: &'ob mut Context<'rt>,
// Non-rooted obj that can be thought of as immutably borrowed from cx
obj: T,
} we provide these apis: // deref
fn(&ContextWrapper<T>) -> &T;
// derefmut
fn(&mut ContextWrapper<T>) -> &mut T;
fn borrow_ctx(&ContextWrapper) -> &Context;
// Need a (safe) trait for T<'a> and R<'a> to be a thing
fn(ContextWrapper<T>, for<'a> FnOnce(&'a Context, T<'a>) -> R<'a>) -> ContextWrapper<R>;
// return local root as non-root
fn(ContextWrapper<_>, Rt<R>) -> ContextWrapper<R>; and functions will look like fn function_that_may_gc(ContextWrapper<T>) -> ContextWrapper<R> GC by converting ContextWrapper into &mut Context by throwing away obj. Not sure how well Rust can optimize ContextWrapper; there might be a case for a separate zero-sized token for permission |
Actually, that was unsound since it cannot prevent the user from storing an &Context inside the T of ContextWrapper. Using Token doesn't have this problem. We have instead //unique for each context
struct State<T> {
obj: T,
}
// 'a is mandatory to prevent user from moving an object out of the closure
fn(State<T>, for<'a> FnOnce(Token<'a>, T<'a>) -> R<'a>) -> State<R>;
// you can get a bool or int out or just do mutations
fn(&State<T>, for<'a> FnOnce(Token<'a>, &T<'a>) -> R) -> R;
fn(&mut State<T>, for<'a> FnOnce(Token<'a>, &mut T<'a>) -> R) -> R;
// Get out a single Object which can then be rooted
fn(&b State<T>, for<'a> FnOnce(Token<'a>, &mut T<'a>) -> Object<'a>) -> Object<'b>;
// Token is Copy
fn alloc(Token<'a>, &mut Context) -> Object<'a>;
fn from_root(Token<'a>, &mut Rt<T>) -> T<'a>;
// clear out state
fn garbage_collect(State<_>, &mut Context) -> State<()>
fn function_that_may_gc(State<T>, &mut Context) -> State<R> The user is free to keep a Token in T if they want, it wont accomplish anything and wont be unsound |
You can transmute through a static, but it still doesn't dissociate the lifetimes. Take this sample code use std::mem;
#[derive(Debug)]
struct Gc<'a>(&'a usize);
struct Context(usize);
pub(crate) unsafe fn unbind_for_rebind<'b>(val: Gc<'b>) -> Gc<'static> {
mem::transmute(val)
}
pub(crate) unsafe fn rebind_by_ref<'c>(_cx: &'c Context, val: Gc<'static>) -> Gc<'c> {
mem::transmute(val)
}
fn test_fn(cx: &mut Context) -> Gc<'_> {
Gc(&cx.0)
}
fn main() {
let cx = &mut Context(7);
let obj = test_fn(cx);
let value_as_static = unsafe { unbind_for_rebind(obj) };
let value = unsafe { rebind_by_ref(cx, value_as_static) };
test_fn(cx);
println!("{:?}", value);
} You get this error:
I am still trying to grok the token solution you presented. I will have to look at it more. You seem to understand the type system better and how it can be used. |
This is as intended, right? We rebind! to get a non-rooted value, which is not supposed to live past a gc.
I have actually only been learning more about the type system quite recently! It helps a lot to look at some self-referential crates; There seem to be a lot of them, with varying degrees of soundness... The key idea is really here: fn garbage_collect(State<_>, &mut Context) -> State<()>
fn function_that_may_gc(State<T>, &mut Context) -> State<R> Since we don't expose any State constructor, anything we keep in a State could not have been gc'ed, thus any non-rooted objects in State must still be valid.
One way to do this is to rebind Object into State using a macro. This is fine, but it requires writing some unsafe code for any type T for State. |
rebinding and rooting are actually separate concerns. Rebind is only to work around this issue with the type system. Essentially we need to rebind whenever a function that takes a
So it sounds to me like So is this similar to cell-gc or gc-arena where they use a closure to create "heap mutation sessions" where no gc can occur and therefore it is safe to operate on variables? The closure keeps variables from escaping, which sounds like what you are trying to do with closures as well. That is my understanding of the |
But this can't be right? what prevents an object returned by rebind! from being garbage collected?
|
My understanding is that rebind is supposed to change a mutable borrow to an immutable one, which prevents a gc but allows allocation |
oh snap you’re right! That’s not good. I think the issue is |
Okay I fixed the unsoundess bug you found. thank you! The example you provided now correctly fails to compile.
because it is (now) tied to the immutable borrow of Like the lifetime misconception guide says "The re-borrowed shared ref has all the cons of a mut ref and all the cons of a shared ref and has the pros of neither." We use |
Nice! the transmute I gave above does the same thing, but you can also make them work with, say, vectors.
The Token approach is I think only similar in terms of the safety bound.
State is the opposite of root. In a garbage_collect, State is cleared: fn garbage_collect(State<_>, &mut Context) -> State<()> Due to this, you can stick anything in State: A RefCell of a closure that borrows from a refcell of a closure. whatever. The Token is there to prevent you from copying something out of state, do gc, and put that back into state. |
Can you make this sample code compile using transmutes? This will work with the raw conversion method. But I couldn't make it work with transmute.
Hmm, interesting. So if I wanted to pass multiple arguments to a function I would create |
This isn't supposed to compile, right? If it does, you would garbage collect obj (its not rooted) and then print the dangling pointer.
When I wrote that API I imagined functions to look like this: fn func_that_gc(state, cx) -> State<obj>
{
let state: State<(obj, obj)> = function_that_gc_1(state, cx);
let state: State<Vec<obj>> = function_that_gc_2(state.map(|(x, y), token| (x.car(), cons!(x.cdr(), y), rooted.bind(token))), cx);
let state: State<obj> = state.map(|v, token| vector_to_list_nogc(v, token, cx));
state
} Where cons! wraps a alloc_cons(cx, token). Note that you can still pass roots as arguments ofc. I think you are right that "rebinding" is more suitable here: // obj1 and obj2 borrow from cx or empty_state (bad, this must use macros) or another global token that also inhibits gc
let (empty_state, (obj1, obj2)) = function_that_gc_1(state, cx).split_state(cx)
function_that_gc_2(to_state(empty_state, x.car(), cons!(x.cdr(), y), rooted.bind(cx)), cx) Kind of a seperate idea is I think ideally we use |
Touche. Turns out you were right on both accounts. That code shouldn't compile and transmute does work for
Hmm that is really interesting. It has the advantage that you can pass arguments without rooting them and you don't need to rebind the return value. I wonder if rust would optimize a struct of a tuple for the argument as well as it optimizes individual arguments. Seems like the tradeoff is increased verbosity though.
Are you thinking of if it makes a difference for performance? Allocation will probably be on the hot path so I think it would. But another way we could solve this is use |
See discussion with @Alan-Chen99 in #26
UnsafeCell and &mut should make a difference in theory. In practice it doesn't (for now), due to things like rust-lang/rust#71354. It will probably make a difference in future versions of rust/llvm |
Yea, the verbosity is no good. If we do adopt this we need a good abstraction over this using macros. I will need to think more on that. |
It's unfortunate that allocation must go through refcell.Perhaps we can do better by using some sort of ghosttoken?
This way we don't need to rebind! as the objects will borrow from the token instead.
a root will take a &Object and produce something not borrowed from token.
when we garbage_collect, we take a
&mut token
ensuring we rooted any pointers to the gc heap.Hmm ig this doesn't actually work. didn't really thought this through
The text was updated successfully, but these errors were encountered: