Add __rust_unsized_deallocate for C libraries to hook into. #31976

michaelwu · 2016-02-29T20:13:19Z

__rust_deallocate requires a size to be passed in, especially when jemalloc is used. Most C/C++ libraries don't use sized deallocation, so this allows non-rust code to hook into Rust's allocator without changing every free() call.

Haven't tested this beyond building yet, but I'm putting this PR out first to see if the idea sounds ok. I'm adding sized allocation support to Spidermonkey https://bugzilla.mozilla.org/show_bug.cgi?id=1250998 but not everything can easily be converted to sized deallocation. The _rust* functions are being used to hook into rust's allocator.

__rust_deallocate requires a size to be passed in, especially when jemalloc is used. Most C/C++ libraries don't use sized deallocation, so this allows non-rust code to hook into Rust's allocator without changing every free() call.

rust-highfive · 2016-02-29T20:13:33Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

sfackler · 2016-02-29T20:25:26Z

Can regions allocated with the sized API be deallocated with the unsizzed API? I wouldn't expect that they could - isn't the point of sized allocation that the allocator can omit the size header?

michaelwu · 2016-02-29T20:34:41Z

All allocations are sized, and I don't see a way to request that jemalloc not store metadata about a particular allocation. As I understand it, the sized deallocation API is an optimization to avoid looking up the metadata, rather than not storing metadata at all.

AIUI modern allocators do not use size headers either.

alexcrichton · 2016-02-29T23:39:37Z

I think that this may be the wrong use of these libraries, the double-underscore in front of these symbols really is intended to mean "you should not call these functions", and it's true that you should not call these functions or rely on their existence.

In terms of functionality, this also seems like it may not be the intended use case of these functions either? Could you elaborate a bit on why external languages are being fused to the Rust allocator? The current intention is that it'd go the other way around with this support existing to fuse Rust to some other allocator.

michaelwu · 2016-03-01T00:03:49Z

Using double underscore functions is a bit weird, but it's required even if you wanted to plug in an allocator into rust rather than the other way around - https://doc.rust-lang.org/book/custom-allocators.html#writing-a-custom-allocator . It seems like we already rely on the existence of these symbols.

As to attaching external language to Rust's allocator - I think it's not so unusual when you look at C/C++ libraries. A large number of libraries - zlib, libpng, libjpeg, and of course, spidermonkey, provide ways to attach them to custom allocators. They don't really have their own allocator, just some default stubs that glue into libc if nothing else is specified. If we don't have an appropriate API here to plug into, then everyone who wants to make sure their rust code is sharing the same allocator as C/C++ code has to provide their own allocator.

huonw · 2016-03-01T01:41:09Z

As @sfackler says, allowing "standard" use of non-sized deallocation seems unfortunate: without that possibility, Rust could theoretically use an allocator that basically stores no information about allocated pointers internally (jemalloc has internal side tables, instead of headers), relying on the fact that we always pass back in a valid pointer along with its size.

It seems like this might be possible via a custom allocator that exposes this sort of function itself?

alexcrichton · 2016-03-01T17:32:33Z

It seems like we already rely on the existence of these symbols.

I suspect that any stable form of this API would hide this implementation detail. It's not intended to be a public thing that's typed.

If we don't have an appropriate API here to plug into, then everyone who wants to make sure their rust code is sharing the same allocator as C/C++ code has to provide their own allocator.

I agree with @huonw here that this sounds like a case for shims rather than a new symbol that can be hard-wired to.

michaelwu · 2016-03-01T18:27:00Z

So what's the alternative? It would be best to have all code use the same allocator, and there isn't much of a reason for most people to use anything other than what rust comes with. It seems broken to require a custom allocator in order to get code to share allocators.

I can't think of how a reasonable allocator that stores no metadata could exist. Doing better than a size header is possible, and is already done, but at some point you need to know what's allocated.

nagisa · 2016-03-01T18:59:28Z

I can't think of how a reasonable allocator that stores no metadata could exist. Doing better than a size header is possible, and is already done, but at some point you need to know what's allocated.

You can expect most Rust code to reasonably remember the sizes of allocations by itself. E.g. Box<[T]> allocates slice_len * size_of::<T>() and Box<T> allocates size_of::<T>.

I share my surprise these symbols exist in the first place and I share the sentiment about unsized allocation/deallocation functions not being useful for the Rust itself.

michaelwu · 2016-03-01T19:19:47Z

You can expect most Rust code to reasonably remember the sizes of allocations by itself. E.g. Box<[T]> allocates slice_len * size_of::() and Box allocates size_of::.

Sure, but when it comes time to make a new allocation, the allocator need to know what memory is and isn't being used, which requires the allocator to have some idea of how big any particular allocation is.

michaelwu · 2016-03-01T19:25:46Z

I share my surprise these symbols exist in the first place and I share the sentiment about unsized allocation/deallocation functions not being useful for the Rust itself.

Yeah Rust doesn't need this. This is for C/C++ libraries, so I didn't add anything to liballoc.

nagisa · 2016-03-01T19:37:17Z

Yeah Rust doesn't need this. This is for C/C++ libraries, so I didn't add anything to liballoc.

It seems to me this is for liballoc::heap’s exclusive use, actually (hence the __rust prefix).

alexcrichton · 2016-03-02T18:06:52Z

@michaelwu the alternative is what @huonw mentioned which is to build that allocation on top of what the underlying API already gives you.

michaelwu · 2016-03-02T18:55:44Z

And store the size for every pointer? That seems like a very poor solution. The alternative is to look up the size of the pointer, but no API exists for that either.

nagisa · 2016-03-02T19:19:31Z

The alternative is to look up the size of the pointer, but no API exists for that either.

Box deallocation code uses pretty much an equivalent of https://gist.github.com/89b0e226909482bd935a to “lookup” the size.

michaelwu · 2016-03-02T19:52:56Z

Yeah but the free API we want to implement takes untyped void* pointers. That particular strategy doesn't work here. In SM, I've already converted frees of typed pointers into sized frees where it works reasonably (which is most of it!). That covers nearly everything, but then there are weird classes that are manually allocated and things that are in the public API which are less trivial to convert to sized allocation.

alexcrichton · 2016-03-03T00:50:11Z

@michaelwu basically, yeah. This is the API for all allocators in Rust, and it also seems unfortunate two force support for two modes of allocating?

michaelwu · 2016-03-03T23:21:26Z

There isn't two modes of allocating though. A reasonable allocator needs to store enough data to find the rough size of any given valid pointer or it won't be able to quickly reuse freed space. Sized deallocation is strictly a performance optimization for any reasonable allocator you would use with this interface. (Edit - I take this back, I can think of one reasonable way of doing this, but all current allocators that I know of also know the sizes of allocations) FWIW - finding the rough size of a pointer is another missing thing in this api, so HeapSize needs to make assumptions about the allocator to hook into the right functions https://doc.servo.org/src/heapsize/lib.rs.html#42-63 . OSX, glibc, bionic, musl, jemalloc, tcmalloc, dlmalloc, and windows all provide functions for determining the usable size of a pointer.

nagisa · 2016-03-04T11:18:39Z

OSX, glibc, bionic, musl, jemalloc, tcmalloc, dlmalloc, and windows all provide functions for determining the usable size of a pointer.

And some toy allocator somebody implements at some point in the future won’t. Instead it would rely on Rust always passing in correct sizes. With this change in place allocators like that won’t be possible anymore.

nox · 2016-03-07T15:09:46Z

We (Servo) do need to know the actual allocated size of pointers and boxes and things like this. Cf the heapsize crate and #32075.

You can expect most Rust code to reasonably remember the sizes of allocations by itself. E.g. Box<[T]> allocates slice_len * size_of::<T>() and Box<T> allocates size_of::<T>.

And then you underestimate all memory usage, because that doesn't account for jemalloc rounding up, or the Windows Heap API shenanigans with alignment.

nagisa · 2016-03-07T17:21:48Z

We (Servo) do need to know the actual allocated size of pointers and boxes and things like this.

I could get behind this need/want. We’ve also wanted something similar in the standard library for Vec capacity at some point.

And then you underestimate all memory usage, because that doesn't account for jemalloc rounding up, or the Windows Heap API shenanigans with alignment.

Isn’t that something the allocation library should be abstracting itself? I’d be seriously surprised if jemalloc_sdallocx(jemalloc_mallocx(size, align), size, align); (or the equivalent with any other allocation API) leaked memory in any circumstances.

nox · 2016-03-07T17:24:31Z

Isn’t that something the allocation library should be abstracting itself? I’d be seriously surprised if jemalloc_sdallocx(jemalloc_mallocx(size, align), size, align); (or the equivalent with any other allocation API) leaked memory in any circumstances.

I am not sure what you mean by leaking memory. What I was referring to is that both jemalloc and Windows' Heap API require to overallocate in some cases, size_of::<T> doesn't know about that.

nagisa · 2016-03-07T17:34:24Z

What I was referring to is that both jemalloc and Windows' Heap API require to overallocate in some cases

Yes, they do. However, Rust won’t request more than strictly necessary (i.e. it would call mallocx(size_of::<T>())) and whatever over-allocation the allocation libraries do are strictly the responsibility of allocation libraries themselves¹, right? I either do not see how sdallocx(size_of::<T>(), min_align_of::<T>()) is not sufficient as an implementation of “unsized deallocation” or I do not see the point you’re trying to make.

¹: They might provide capability to discover the exact size of memory allocated, which in my eyes, is perhaps a necessary hole in abstraction.

alexcrichton · 2016-03-07T19:18:10Z

The purpose of this API is to basically outline the fundamentals of allocation for all Rust crates, and just because current implementations support a piece of functionality doesn't necessarily mean that's future proof in terms of being easily implementable in all possible allocators.

This sort of interface could certainly be added to the implementations which support it (eg alloc_jemalloc and alloc_system), but I don't think there's any need for this to be part of the fundamental underpinnings for all allocators in Rust.

I'd be fine just adding more Rust APIs to the alloc_jemalloc crate which Servo can use manually to funnel allocations through.

bluss · 2016-03-07T23:04:22Z

A function that reports the actual allocated size is much less of a commitment since it can be documented to return zero when the information is not available (of course using the best possible information when possible though, to help servo).

alexcrichton · 2016-03-08T22:05:47Z

Perhaps, but adding bindings to alloc_{jemalloc,system} is even less of a commitment as they're just exporting what the implementation is already doing.

alexcrichton · 2016-05-01T18:06:31Z

Closing due to inactivity, but feel free to resubmit with a rebase!

rust-highfive assigned nikomatsakis Feb 29, 2016

michaelwu mentioned this pull request Feb 29, 2016

Hook spidermonkey into rust's allocator servo/mozjs#72

Closed

alexcrichton assigned alexcrichton and unassigned nikomatsakis Mar 3, 2016

alexcrichton closed this May 1, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add __rust_unsized_deallocate for C libraries to hook into. #31976

Add __rust_unsized_deallocate for C libraries to hook into. #31976

michaelwu commented Feb 29, 2016

rust-highfive commented Feb 29, 2016

sfackler commented Feb 29, 2016

michaelwu commented Feb 29, 2016

alexcrichton commented Feb 29, 2016

michaelwu commented Mar 1, 2016

huonw commented Mar 1, 2016

alexcrichton commented Mar 1, 2016

michaelwu commented Mar 1, 2016

nagisa commented Mar 1, 2016

michaelwu commented Mar 1, 2016

michaelwu commented Mar 1, 2016

nagisa commented Mar 1, 2016

alexcrichton commented Mar 2, 2016

michaelwu commented Mar 2, 2016

nagisa commented Mar 2, 2016

michaelwu commented Mar 2, 2016

alexcrichton commented Mar 3, 2016

michaelwu commented Mar 3, 2016

nagisa commented Mar 4, 2016

nox commented Mar 7, 2016

nagisa commented Mar 7, 2016

nox commented Mar 7, 2016

nagisa commented Mar 7, 2016

alexcrichton commented Mar 7, 2016

bluss commented Mar 7, 2016

alexcrichton commented Mar 8, 2016

alexcrichton commented May 1, 2016

Add __rust_unsized_deallocate for C libraries to hook into. #31976

Add __rust_unsized_deallocate for C libraries to hook into. #31976

Conversation

michaelwu commented Feb 29, 2016

rust-highfive commented Feb 29, 2016

sfackler commented Feb 29, 2016

michaelwu commented Feb 29, 2016

alexcrichton commented Feb 29, 2016

michaelwu commented Mar 1, 2016

huonw commented Mar 1, 2016

alexcrichton commented Mar 1, 2016

michaelwu commented Mar 1, 2016

nagisa commented Mar 1, 2016

michaelwu commented Mar 1, 2016

michaelwu commented Mar 1, 2016

nagisa commented Mar 1, 2016

alexcrichton commented Mar 2, 2016

michaelwu commented Mar 2, 2016

nagisa commented Mar 2, 2016

michaelwu commented Mar 2, 2016

alexcrichton commented Mar 3, 2016

michaelwu commented Mar 3, 2016

nagisa commented Mar 4, 2016

nox commented Mar 7, 2016

nagisa commented Mar 7, 2016

nox commented Mar 7, 2016

nagisa commented Mar 7, 2016

alexcrichton commented Mar 7, 2016

bluss commented Mar 7, 2016

alexcrichton commented Mar 8, 2016

alexcrichton commented May 1, 2016