RFC: Allow for static, freeable, String for fast handoff to Ruby #6

Closed
wants to merge 2 commits into from


ianks
Contributor

@ianks ianks commented May 27, 2022

So I've been thinking about devising a mechanism to hand off a Rust String to Ruby in a way that:

  1. Avoids any memcpy on the Ruby side
  2. Freeable by the Ruby GC (so we don't leak memory)

This would allow for extremely fast, zero-copy handoffs of strings to Ruby.

I have a proof of concept in this PR, and I think it's close. The issue I'm running into is that, since RString implements Copy (and wisely so, I think), it is possible to access an already-freed pointer through a leftover copy of the RString.

Unfortunately, I do not really know of a good way around it... Was curious to hear if you had any ideas or thoughts 😄

@matsadler
Owner

It would be so cool if it was possible to get this working.

The following is me just writing things out as I'm thinking through it, sorry it's long and rambling.

For some reason I was under the impression that Ruby strings (despite storing their length) had to be NULL terminated, but now I can't find any reference to that.
There are a bunch of references to Ruby itself creating NULL terminated strings (example), but nothing saying they must be, and other docs seem to suggest there's no guarantee.

If that's not an obstacle, then it's freeing the string that's the tricky part.

It's not safe for Ruby to use its allocator to free memory Rust has allocated, as there's no guarantee they are the same allocator.

As soon as the RString is created (yep, before it's even returned, thanks to ObjectSpace) it's basically impossible to know who has a reference to it, so there's no way to take it back from Ruby.

I think the safe way would be to use a finaliser; this is basically how Ruby lets you handle freeing C/Rust data 'wrapped' in a Ruby object. Finalisers are called some time after the object has been GC'd, so you know Ruby doesn't have any references, and neither should any (correctly written) C/Rust extensions. Unfortunately the finaliser API for plain Ruby objects isn't great, and I've not implemented the API for it yet.

The signature is

VALUE rb_define_finalizer(VALUE obj, VALUE block);

so you first have to construct a block, which has the signature

VALUE rb_proc_new(rb_block_call_func_t func, VALUE callback_arg);

where rb_block_call_func_t is a function with the signature

VALUE rb_block_call_func(VALUE yielded_arg, VALUE callback_arg, int argc, const VALUE *argv, VALUE blockarg)

So the finaliser block is called without any arguments, so yielded_arg and blockarg will be nil, and with argc 0 and argv NULL, so we can ignore everything apart from callback_arg.

So we get to provide a function pointer func and a callback_arg that will be passed to func when it's time to finalise the object.

If we just had some data on the heap in a Box this'd be great, we'd turn the Box into a pointer, pass that as callback_arg, then func would be a function that turns the pointer back into a Box and then drops it.
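
A minimal sketch of that Box round trip, using only the standard library. `Value` here is a hypothetical pointer-sized stand-in for Ruby's `VALUE`, and `Payload`, `into_callback_arg` and `finalise` are illustrative names, not part of Magnus or the Ruby C API:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for Ruby's pointer-sized `VALUE` type.
type Value = usize;

// Counts drops so the effect of the finaliser can be observed.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Some heap data we want freed only once Ruby is done with it.
struct Payload(Vec<u8>);

impl Drop for Payload {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Give up ownership of the box and pass the raw pointer along as
// `callback_arg`. Rust will not free the data from here on.
fn into_callback_arg(data: Box<Payload>) -> Value {
    Box::into_raw(data) as Value
}

// Plays the role of `func`: rebuild the box from `callback_arg` and drop it.
//
// Safety: `callback_arg` must come from `into_callback_arg` and must be
// passed to this function exactly once.
unsafe fn finalise(callback_arg: Value) {
    unsafe { drop(Box::from_raw(callback_arg as *mut Payload)) };
}
```

The box is freed by the same Rust allocator that created it, which is the property the finaliser approach is meant to preserve.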

The problem with String is it's not just a pointer, it's a pointer, a length, and a capacity, and we can't fit that into callback_arg.

Ruby has a few APIs like this, and the way to work around it is to pass a pointer to a closure as callback_arg, then have func invoke the closure. Rust closures are basically structs containing all the variables they have captured. The size is known at compile time, so a pointer to one can be a thin pointer, and the closure can carry along variables of whatever size it wants.

This is easy in, say, rb_protect, where the function/closure will be invoked immediately: you can just give a pointer to the closure on the stack. With the finaliser, however, the function/closure will be invoked at some point in the future, so we can't use a stack pointer. This means we have to Box the closure.
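
The boxed-closure pattern can be sketched with the standard library alone. `boxed_callback` and `trampoline` are hypothetical names, and a plain `*mut c_void` stands in for `callback_arg`; this is an illustration of the idea, not how Magnus implements it:

```rust
use std::os::raw::c_void;

// A C-compatible function pointer type taking one opaque argument.
type Callback = unsafe extern "C" fn(*mut c_void);

// Plays the role of `func`. It is generic over the closure type, so each
// closure gets its own monomorphised trampoline that knows the concrete
// type hiding behind the thin pointer.
//
// Safety: `arg` must come from `boxed_callback::<F>` and be used only once.
unsafe extern "C" fn trampoline<F: FnOnce()>(arg: *mut c_void) {
    // Take ownership back; the box (and everything the closure captured)
    // is freed when the call finishes.
    let closure: Box<F> = unsafe { Box::from_raw(arg as *mut F) };
    closure();
}

// Move the closure to the heap so it outlives the current stack frame, and
// return the function pointer plus the thin pointer to hand over.
fn boxed_callback<F: FnOnce() + 'static>(f: F) -> (Callback, *mut c_void) {
    let func: Callback = trampoline::<F>;
    (func, Box::into_raw(Box::new(f)) as *mut c_void)
}
```

A closure that captures a whole `String` (pointer, length and capacity) travels through the single pointer-sized argument this way, at the cost of one extra heap allocation.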

Aside: I have just realised I screwed up Value::block_call and Proc::new, and they should Box their closure, and then have a finaliser to drop it. I need to fix that.

Once we're heap allocating a closure, and relying on Ruby's finaliser API, which isn't really designed to handle more than a small number of finalisers, and slows down the GC, it might be that we've lost the benefits of not copying the string.

But if you'd still like to give it a try, I think the function for setting a finaliser should have the signature (in the gc mod)

pub fn define_finalizer<T, F>(value: T, func: F) -> Result<(), Error>
where
    T: Deref<Target = Value>,
    F: FnOnce(),
{
    todo!();
}

It could probably just use Proc::new to convert func to a proc, assuming Proc::new is fixed.

and then you could do something like:

// I think this name and signature make it clear it's taking ownership of `s`
// to convert it to a `Self`
pub fn from_string(s: String) -> Self {
    let ptr = s.as_ptr();
    let len = s.len();
    let r_string = unsafe { Self::new_lit(ptr as _, len as _) };

    // as far as Rust knows the ownership of `s` is moved into the
    // finaliser and it'll leave the memory `s` points to alone until
    // the finaliser is run. The finaliser func doesn't do anything with
    // `s`, it'll just drop it once it's run.
    // The finaliser is run some time after the string is GC'd, so nothing
    // will still be using it
    crate::gc::define_finalizer(r_string, move || {
        // I think this is how you'd force something to move into a closure
        // but I don't remember and haven't tested this
        let _s = s;
    }).unwrap(); // define_finalizer can fail, we don't expect it to here

    r_string
}

Some other things to check would be how this interacts with cloning/duping the string in Ruby, and methods that return copy-on-write references to the original string (I think #slice does this).

@matsadler
Owner

I figured out another way to tie the lifetime of some Rust data to a Ruby object.

Using rb_data_typed_object_wrap you can wrap some data (behind a pointer, so you do need to allocate) in a Ruby object. If you pass 0 as the class the object won't have a class, so can't be seen by ObjectSpace. You can then set that object as an instance variable of the object you want to associate its lifetime with. If you use a name for the instance variable without the @ prefix it's invisible to Ruby.

There's an example in these changes: 91ee513#diff-676d4870958cd433b65507e695efedb29f7a535ba9551cf276e21f1b8e2f25eaR149

@ianks
Contributor Author

ianks commented May 31, 2022

> For some reason I was under the impression that Ruby strings (despite storing their length) had to be NULL terminated, but now I can't find any reference to that.

I'm pretty sure the strings do not have to be null terminated, since it is actually fully possible to store null bytes in a Ruby string.

> It's not safe for Ruby to use its allocator to free memory Rust has allocated, as there's no guarantee they are the same allocator.

Can you give an example here?

@matsadler
Owner

Just a note that I do plan to get back to you on this, but my personal life (newborn baby) is currently occupying all my time.

@ianks
Contributor Author

ianks commented Jun 6, 2022

Congrats on the newborn! Take your time for the most important things ❤️

@matsadler
Owner

> > It's not safe for Ruby to use its allocator to free memory Rust has allocated, as there's no guarantee they are the same allocator.
>
> Can you give an example here?

So I'm far from an expert, I don't even really have any hands on experience, I'm just piecing things together from what I've read.


Let's make sure we're on the same page to start.

My understanding of the code here is that we're using rb_utf8_str_new_static (via the undocumented RString::new_lit) which creates a new Ruby String object header that points at an existing memory buffer for the actual contents of the string.

rb_utf8_str_new_static assumes the pointer it's given is to a string literal that is stored statically in a different memory segment, so will also mark the String object with the STR_NOFREE flag so when the String object is GC'd it will skip freeing the buffer containing the string data.

The changes also provide a method to clear the STR_NOFREE flag, so that the buffer will be freed on GC.

So we start with something like this:

                |
---Rust Stack---+---Rust Heap-----------------
                |
   +--------+   |
   | String |   |
   +--------+   |
   | ptr ---+---+---> "Hello, world!"
   | len    |   |
   | capa   |   |
   +--------+   |
                |

Then we call rb_utf8_str_new_static(s.as_ptr(), s.len()) and end up with this:

                |
---Rust Stack---+---Rust Heap-----------------
                |
   +--------+   |
   | String |   |
   +--------+   |
   | ptr ---+---+---> "Hello, world!" <---+
   | len    |   |                         |
   | capa   |   |                         |
   +--------+   |                         |
                |                         |
---Ruby Stack---+---Ruby Heap-------------+---
                |                         |
   +-------+    |     +--------+          |
   | Value +----+---> | Object |          |
   +-------+    |     +--------+          |
                |     | flags  |          |
                |     | klass  |          |
                |     +--------+          |
                |     | len    |          |
                |     | ptr ---+----------+
                |     | capa   |
                |     | shared |
                |     +--------+
                |

Then we forget the Rust string and get:

                |
---Rust Stack---+---Rust Heap-----------------
                |
                |     "Hello, world!" <---+
                |                         |
---Ruby Stack---+---Ruby Heap-------------+---
                |                         |
   +-------+    |     +--------+          |
   | Value +----+---> | Object |          |
   +-------+    |     +--------+          |
                |     | flags  |          |
                |     | klass  |          |
                |     +--------+          |
                |     | len    |          |
                |     | ptr ---+----------+
                |     | capa   |
                |     | shared |
                |     +--------+
                |

The boundary between the "Rust Heap" and "Ruby Heap" is kind of weird, as everything is running in the same process, so it's the same heap, but there is potential for the memory to be managed by different allocators.

Ruby's configure script has an option to compile with jemalloc, and before that was added it was a common optimisation to patch Ruby with a faster allocator (usually the same jemalloc).

Rust used to exclusively use jemalloc, but as of 1.28 the GlobalAlloc trait was stabilised and now it's pretty trivial to swap to another allocator. As of 1.32 the default allocator changed to the system allocator.
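
As an illustration of how little code swapping the allocator takes, here is a sketch of a global allocator that forwards to the system allocator and counts allocations. `Counting` and `ALLOCS` are hypothetical names chosen for this example:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Number of allocations made through the global allocator so far.
static ALLOCS: AtomicUsize = AtomicUsize::new(0);

// A thin wrapper that counts calls and defers to the system allocator.
struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Memory is handed back to the same allocator that produced it.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every `String`, `Vec`, `Box`, ... in the program now goes through `Counting`.
#[global_allocator]
static GLOBAL: Counting = Counting;
```

Any crate in the final binary can do this, so an extension author cannot assume which allocator backs a given `String`.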

These two together mean that while it's pretty common for both Ruby and Rust to be using the system allocator, it's also quite possible they will be using different allocators.

The really hard part is knowing when both are using the same allocator. It's not really possible when developing a publicly available extension gem - users can do what they want. I'm not sure documenting it is even enough - I don't think there's many developers at my job who realise the Ruby we're using in production has been compiled with jemalloc.

There's lots of bits of documentation that suggest memory allocated with a particular allocator should only ever be freed with the same allocator's free function. For example, Ruby's ruby_xmalloc, ruby_xfree, and Rust's GlobalAlloc::dealloc all make claims to that effect.

A particularly good example is Rust's (currently unstable) String::into_raw_parts which says:

> After calling this function, the caller is responsible for the memory previously managed by the String. The only way to do this is to convert the raw pointer, length, and capacity back into a String with the from_raw_parts function, allowing the destructor to perform the cleanup.

This is particularly relevant, as what we're trying to do here is the kind of thing String::into_raw_parts looks like it's designed for.
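
A sketch of that round trip without relying on `into_raw_parts`, using `ManuallyDrop` and `String::from_raw_parts` from the standard library. `into_parts` and `reclaim` are hypothetical helper names:

```rust
use std::mem::ManuallyDrop;

// Stop Rust from dropping the string and hand back the three fields that
// describe its buffer: pointer, length and capacity.
fn into_parts(s: String) -> (*mut u8, usize, usize) {
    // Going through the underlying Vec<u8> keeps the same heap buffer.
    let mut bytes = ManuallyDrop::new(s.into_bytes());
    (bytes.as_mut_ptr(), bytes.len(), bytes.capacity())
}

// Rebuild the String so that Rust's own allocator frees the buffer when the
// returned value is dropped.
//
// Safety: the three values must come from a single `into_parts` call, the
// bytes must still be valid UTF-8, and the parts must be reclaimed only once.
unsafe fn reclaim(ptr: *mut u8, len: usize, capacity: usize) -> String {
    unsafe { String::from_raw_parts(ptr, len, capacity) }
}
```

Note that all three values are needed to free the buffer correctly, which is why a single pointer handed to Ruby is not enough on its own.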

My rough understanding of why one allocator can't free memory allocated by another is that the two allocators could have different internal bookkeeping, and freeing memory an allocator is unaware of could corrupt that bookkeeping.

For example, say allocator A just directly makes syscalls to request memory for each allocation, and allocator B pre-allocates a block of memory, then hands out chunks of that for each allocation. Allocating with A and then freeing with B will corrupt B's record keeping.
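
To make the bookkeeping point concrete, here is a toy version of allocator B. `Arena` is a hypothetical type written for this illustration; unlike a real allocator, its `free` checks its records and reports a foreign pointer instead of silently corrupting state:

```rust
// Toy allocator B: hands out chunks from one pre-allocated block and keeps
// a record of which offsets are currently in use.
struct Arena {
    block: Vec<u8>,
    next: usize,
    live: Vec<usize>, // offsets of chunks currently handed out
}

impl Arena {
    fn new(size: usize) -> Self {
        Arena { block: vec![0; size], next: 0, live: Vec::new() }
    }

    // Carve the next `len` bytes out of the block, if there is room.
    fn alloc(&mut self, len: usize) -> Option<*mut u8> {
        if self.next + len > self.block.len() {
            return None;
        }
        let offset = self.next;
        self.next += len;
        self.live.push(offset);
        // The offset is within the block, so the pointer stays in bounds.
        Some(unsafe { self.block.as_mut_ptr().add(offset) })
    }

    // Release a chunk. A pointer this arena never handed out has no entry
    // in `live`, so there is nothing sensible to update.
    fn free(&mut self, ptr: *mut u8) -> Result<(), &'static str> {
        let base = self.block.as_ptr() as usize;
        let offset = (ptr as usize).wrapping_sub(base);
        match self.live.iter().position(|&o| o == offset) {
            Some(index) => {
                self.live.swap_remove(index);
                Ok(())
            }
            None => Err("pointer was not allocated by this arena"),
        }
    }
}
```

A production allocator typically skips this lookup for speed and trusts the caller, which is where a mismatched free turns into corrupted bookkeeping.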


So to me this all adds up to, there's no guarantee the allocator Rust uses to allocate the string and the allocator Ruby uses to free the string are the same, so there's no way to safely do what these changes are currently doing.

The safe way would be to - when Ruby is done with the string - reconstitute it into a Rust String, then drop it. The obvious API for this is Ruby's finalisers, but after a quick benchmark the finaliser API is devastatingly slow, so of no use in this case.

The other workaround I can think of - wrapping a Rust struct holding the string in a Ruby object and assigning that to an ivar of the String that'll be GC'd at the same time as the String - is also too much of a slowdown to be useful.

Part of the problem seems to be, there isn't actually that big of a speedup to be gained - the conversion from Rust String to Ruby String is already pretty quick. So there's basically no room for a workaround for freeing the string.


I'd still love to find a way to get this working[1], I just can't see a path to it myself.

[1] The reverse of this, RString::as_str, is one of my favourite bits of Magnus, so it'd be awesome to go both ways.

@ianks
Contributor Author

ianks commented Jun 11, 2022

This has got me thinking…

In rb-allocator we decided to simply report memory usage, because ruby_xmalloc can trigger GC runs (which will cause bad things to happen with Rust Rc's and such).

We even thought that reporting the memory usage might be unsafe, but it turned out GC is not triggered if the realloc flag is set.

I wonder whether, if we used ruby_xrealloc in place of ruby_xmalloc (by passing a null ptr), we could avoid GC being triggered during allocation.

Although dubious, if that’s the case we may be able to use the same allocator in Rust + Ruby…

@matsadler
Owner

There's a currently unstable Rust feature called allocator_api that adds an Allocator trait (as opposed to GlobalAlloc) and new_in functions for a number of collections for creating that type with a specific allocator. Here's the PR for String: rust-lang/rust#79500.

The especially useful part is the allocator becomes a type parameter (with a default of Global), so we could add a method like:

pub fn from_string(s: String<RbAllocator>) -> RString {
    ...
}

and our requirement that the string has been allocated with the same allocator as Ruby is enforced by the type system.
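
The type-system idea can be approximated on stable Rust with a marker type, purely as an illustration. Everything here (`TaggedString`, `RustGlobal`, and the way `RbAllocator` is used) is hypothetical; the tag is only a compile-time label and does not change which allocator actually owns the buffer, which is what the real `allocator_api` parameter would do:

```rust
use std::marker::PhantomData;

// Hypothetical marker types naming which allocator owns a buffer.
struct RustGlobal;
struct RbAllocator;

// Hypothetical stand-in for `String<A>`: a string carrying a compile-time
// tag for its allocator. The tag has no runtime representation.
struct TaggedString<A> {
    text: String,
    _allocator: PhantomData<A>,
}

impl<A> TaggedString<A> {
    fn new(text: &str) -> Self {
        TaggedString { text: text.to_owned(), _allocator: PhantomData }
    }
}

// Only strings tagged as Ruby-allocated are accepted. Passing a
// `TaggedString<RustGlobal>` is rejected at compile time. The returned
// `String` stands in for the `RString` a real implementation would build.
fn from_string(s: TaggedString<RbAllocator>) -> String {
    s.text
}
```

With this shape, handing over a string from the wrong allocator becomes a type error rather than a runtime memory bug.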

The allocator_api feature also solves the problem rb-allocator was having, as you can pick and choose when you use that particular allocator, rather than having it forced on everything globally, so you can just not use it in the places where it'd cause problems.

But I guess we'd be waiting on the Rust PR to add allocator support to String before it's even possible to prototype this on nightly.

@ianks ianks closed this Jul 5, 2022
@ianks
Contributor Author

ianks commented Sep 28, 2022

So I stumbled upon another interesting way to do this type of thing. Fiddle::MemoryView#to_s creates a static Ruby string and sets an ivar on the RString to reference the memory view. This pattern should allow for zero copy string views.


2 participants