-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add __rust_unsized_deallocate for C libraries to hook into. #31976
Conversation
__rust_deallocate requires a size to be passed in, especially when jemalloc is used. Most C/C++ libraries don't use sized deallocation, so this allows non-rust code to hook into Rust's allocator without changing every free() call.
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Can regions allocated with the sized API be deallocated with the unsizzed API? I wouldn't expect that they could - isn't the point of sized allocation that the allocator can omit the size header? |
All allocations are sized, and I don't see a way to request that jemalloc not store metadata about a particular allocation. As I understand it, the sized deallocation API is an optimization to avoid looking up the metadata, rather than not storing metadata at all. AIUI modern allocators do not use size headers either. |
I think that this may be the wrong use of these libraries, the double-underscore in front of these symbols really is intended to mean "you should not call these functions", and it's true that you should not call these functions or rely on their existence. In terms of functionality, this also seems like it may not be the intended use case of these functions either? Could you elaborate a bit on why external languages are being fused to the Rust allocator? The current intention is that it'd go the other way around with this support existing to fuse Rust to some other allocator. |
Using double underscore functions is a bit weird, but it's required even if you wanted to plug in an allocator into rust rather than the other way around - https://doc.rust-lang.org/book/custom-allocators.html#writing-a-custom-allocator . It seems like we already rely on the existence of these symbols. As to attaching external language to Rust's allocator - I think it's not so unusual when you look at C/C++ libraries. A large number of libraries - zlib, libpng, libjpeg, and of course, spidermonkey, provide ways to attach them to custom allocators. They don't really have their own allocator, just some default stubs that glue into libc if nothing else is specified. If we don't have an appropriate API here to plug into, then everyone who wants to make sure their rust code is sharing the same allocator as C/C++ code has to provide their own allocator. |
As @sfackler says, allowing "standard" use of non-sized deallocation seems unfortunate: without that possibility, Rust could theoretically use an allocator that basically stores no information about allocated pointers internally (jemalloc has internal side tables, instead of headers), relying on the fact that we always pass back in a valid pointer along with its size. It seems like this might be possible via a custom allocator that exposes this sort of function itself? |
I suspect that any stable form of this API would hide this implementation detail. It's not intended to be a public thing that's typed.
I agree with @huonw here that this sounds like a case for shims rather than a new symbol that can be hard-wired to. |
So what's the alternative? It would be best to have all code use the same allocator, and there isn't much of a reason for most people to use anything other than what rust comes with. It seems broken to require a custom allocator in order to get code to share allocators. I can't think of how a reasonable allocator that stores no metadata could exist. Doing better than a size header is possible, and is already done, but at some point you need to know what's allocated. |
You can expect most Rust code to reasonably remember the sizes of allocations by itself. E.g. I share my surprise these symbols exist in the first place and I share the sentiment about unsized allocation/deallocation functions not being useful for the Rust itself. |
Sure, but when it comes time to make a new allocation, the allocator need to know what memory is and isn't being used, which requires the allocator to have some idea of how big any particular allocation is. |
Yeah Rust doesn't need this. This is for C/C++ libraries, so I didn't add anything to liballoc. |
It seems to me this is for |
@michaelwu the alternative is what @huonw mentioned which is to build that allocation on top of what the underlying API already gives you. |
And store the size for every pointer? That seems like a very poor solution. The alternative is to look up the size of the pointer, but no API exists for that either. |
Box deallocation code uses pretty much an equivalent of https://gist.github.com/89b0e226909482bd935a to “lookup” the size. |
Yeah but the free API we want to implement takes untyped void* pointers. That particular strategy doesn't work here. In SM, I've already converted frees of typed pointers into sized frees where it works reasonably (which is most of it!). That covers nearly everything, but then there are weird classes that are manually allocated and things that are in the public API which are less trivial to convert to sized allocation. |
@michaelwu basically, yeah. This is the API for all allocators in Rust, and it also seems unfortunate two force support for two modes of allocating? |
There isn't two modes of allocating though. |
And some toy allocator somebody implements at some point in the future won’t. Instead it would rely on Rust always passing in correct sizes. With this change in place allocators like that won’t be possible anymore. |
We (Servo) do need to know the actual allocated size of pointers and boxes and things like this. Cf the heapsize crate and #32075.
And then you underestimate all memory usage, because that doesn't account for jemalloc rounding up, or the Windows Heap API shenanigans with alignment. |
I could get behind this need/want. We’ve also wanted something similar in the standard library for
Isn’t that something the allocation library should be abstracting itself? I’d be seriously surprised if |
I am not sure what you mean by leaking memory. What I was referring to is that both jemalloc and Windows' Heap API require to overallocate in some cases, |
Yes, they do. However, Rust won’t request more than strictly necessary (i.e. it would call ¹: They might provide capability to discover the exact size of memory allocated, which in my eyes, is perhaps a necessary hole in abstraction. |
The purpose of this API is to basically outline the fundamentals of allocation for all Rust crates, and just because current implementations support a piece of functionality doesn't necessarily mean that's future proof in terms of being easily implementable in all possible allocators. This sort of interface could certainly be added to the implementations which support it (eg I'd be fine just adding more Rust APIs to the |
A function that reports the actual allocated size is much less of a commitment since it can be documented to return zero when the information is not available (of course using the best possible information when possible though, to help servo). |
Perhaps, but adding bindings to alloc_{jemalloc,system} is even less of a commitment as they're just exporting what the implementation is already doing. |
Closing due to inactivity, but feel free to resubmit with a rebase! |
__rust_deallocate requires a size to be passed in, especially when jemalloc is used. Most C/C++ libraries don't use sized deallocation, so this allows non-rust code to hook into Rust's allocator without changing every free() call.
Haven't tested this beyond building yet, but I'm putting this PR out first to see if the idea sounds ok. I'm adding sized allocation support to Spidermonkey https://bugzilla.mozilla.org/show_bug.cgi?id=1250998 but not everything can easily be converted to sized deallocation. The _rust* functions are being used to hook into rust's allocator.