Prepare global allocators for stabilization #1974
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,248 @@ | ||
- Feature Name: `allocator` | ||
- Start Date: 2017-02-04 | ||
- RFC PR: | ||
- Rust Issue: | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Overhaul the global allocator APIs to put them on a path to stabilization, and | ||
switch the default allocator to the system allocator when the feature | ||
stabilizes. | ||
|
||
This RFC is a refinement of the previous [RFC 1183][]. | ||
|
||
[RFC 1183]: https://github.com/rust-lang/rfcs/blob/master/text/1183-swap-out-jemalloc.md | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
## API | ||
|
||
The unstable `allocator` feature allows developers to select the global | ||
allocator which will be used in a program. A crate identifies itself as an | ||
allocator with the `#![allocator]` annotation, and declares a number of | ||
allocation functions with specific `#[no_mangle]` names and a C ABI. To | ||
override the default global allocator, a crate simply pulls an allocator in | ||
via an `extern crate`. | ||
|
||
There are a couple of issues with the current approach: | ||
|
||
A C-style ABI is error prone - nothing ensures that the signatures are correct, | ||
and if a function is omitted, that error will be caught by the linker rather than | ||
the compiler. The Macros 1.1 API is similar in that certain special functions | ||
must be identified to the compiler, and in that case a special attribute | ||
(`#[proc_macro_derive]`) is used rather than a magic symbol name. | ||
|
||
Since an allocator is automatically selected when it is pulled into the crate | ||
graph, it is painful to compose allocators. For example, one may want to create | ||
an allocator which records statistics about active allocations, or adds padding | ||
around allocations to attempt to detect buffer overflows in unsafe code. To do | ||
this currently, the underlying allocator would need to be split into two | ||
crates, one which contains all of the functionality and another which is tagged | ||
as an `#![allocator]`. | ||
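
To make the composition problem concrete, here is a minimal sketch of a statistics-recording wrapper. It does not use the `#![allocator]` mechanism described in this RFC; it assumes the `GlobalAlloc` trait and `#[global_allocator]` attribute that Rust later stabilized in `std::alloc`. The `Counting` type and its `live_bytes` method are hypothetical names chosen for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to `inner` and tracks how
/// many bytes are currently live.
pub struct Counting<A> {
    inner: A,
    live_bytes: AtomicUsize,
}

impl<A> Counting<A> {
    pub const fn new(inner: A) -> Self {
        Counting { inner, live_bytes: AtomicUsize::new(0) }
    }

    /// Bytes currently allocated through this wrapper.
    pub fn live_bytes(&self) -> usize {
        self.live_bytes.load(Ordering::Relaxed)
    }
}

unsafe impl<A: GlobalAlloc> GlobalAlloc for Counting<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { self.inner.alloc(layout) };
        if !ptr.is_null() {
            // Only count allocations that actually succeeded.
            self.live_bytes.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.inner.dealloc(ptr, layout) };
        self.live_bytes.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Install the wrapper around the system allocator for the whole program.
#[global_allocator]
static GLOBAL: Counting<System> = Counting::new(System);
```

Because the wrapper is generic over the inner allocator, it can sit on top of any type implementing `GlobalAlloc`, and neither crate needs to be split in two.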
|
||
## jemalloc | ||
|
||
Rust's default allocator has historically been jemalloc. While jemalloc does | ||
provide significant speedups over certain system allocators for some allocation | ||
heavy workflows, it has been a source of problems. For example, it has | ||
deadlock issues on Windows, does not work with Valgrind, adds ~300KB to | ||
binaries, and has caused crashes on macOS 10.12. See [this comment][] for more | ||
details. As a result, it is already disabled on many targets, including all of | ||
Windows. While there are certainly contexts in which jemalloc is a good choice, | ||
developers should be making that decision, not the compiler. The system | ||
allocator is a more reasonable and unsurprising default choice. | ||
|
||
A third-party crate allowing users to opt into jemalloc would also open the door | ||
to provide access to some of the library's other features such as tracing, arena | ||
pinning, and diagnostic output dumps for code that depends on jemalloc directly. | ||
|
||
[this comment]: https://github.com/rust-lang/rust/issues/36963#issuecomment-252029017 | ||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
## Defining an allocator | ||
|
||
An allocator crate identifies itself as such by applying the `#![allocator]` | ||
annotation at the crate root. It then defines a specific set of functions which | ||
are tagged with attributes: | ||
|
||
```rust | ||
#![allocator] | ||
|
||
/// Returns a pointer to `size` bytes of memory aligned to `align`. | ||
/// | ||
/// On failure, returns a null pointer. | ||
/// | ||
/// Behavior is undefined if the requested size is 0 or the alignment is not a | ||
/// power of 2. The alignment must be no larger than the largest supported page | ||
/// size on the platform. | ||
#[allocator(allocate)] | ||
pub fn allocate(size: usize, align: usize) -> *mut u8 { | ||
... | ||
} | ||
|
||
/// Returns a pointer to `size` bytes of memory aligned to `align`, and | ||
/// initialized with zeroes. | ||
/// | ||
/// On failure, returns a null pointer. | ||
/// | ||
/// Behavior is undefined if the requested size is 0 or the alignment is not a | ||
/// power of 2. The alignment must be no larger than the largest supported page | ||
/// size on the platform. | ||
#[allocator(allocate_zeroed)] | ||
pub fn allocate_zeroed(size: usize, align: usize) -> *mut u8 { | ||
... | ||
} | ||
|
||
/// Deallocates the memory referenced by `ptr`. | ||
/// | ||
/// The `ptr` parameter must not be null. | ||
/// | ||
/// The `old_size` and `align` parameters are the parameters that were used to | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ruuda made a good point in the discussion of the allocator traits: It can be sensible to allocate over-aligned data, but this information is not necessarily carried along until deallocation, so there's a good reason This requirement was supposed to allow optimizations in the allocator, but AFAIK nobody could name a single existing allocator design that can use alignment information for deallocation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wrote an allocator for an OS kernel once that would have benefited greatly from alignment info. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That would be very relevant to both this RFC and the allocators design, so could you write up some details? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmmm... It seems that I was very mistaken... I have to appologize 🤕 Actually, when I went back and looked at the code, I found the exact opposite. The allocator interface actually does pass the alignment to The code is here. It's a bit old and not very well-written since I was learning rust when I wrote it. Here is a simple description of what it does: Assumptions
ObjectiveUse as little metadata as possible. Blocks
allocAllocating memory just grabs the first free block with required size and alignment, removes it from the free list, splits it if needed, and returns a pointer to its beginning. The size of the block allocated is a function of the alignment and size. freeFreeing memory requires very little effort, it turns out. Since we assume that the parameters In fact, the alignment passed into free is ignored here because the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm actually quite relieved to hear that 😄 Yes, allocation and reallocation should have alignment arguments, it's just deallocation that shouldn't use alignment information. It's not quite true that " |
||
/// create the allocation referenced by `ptr`. | ||
#[allocator(deallocate)] | ||
pub fn deallocate(ptr: *mut u8, old_size: usize, align: usize) { | ||
... | ||
} | ||
|
||
/// Resizes the allocation referenced by `ptr` to `size` bytes. | ||
/// | ||
/// On failure, returns a null pointer and leaves the original allocation | ||
/// intact. | ||
/// | ||
/// If the allocation was relocated, the memory at the passed-in pointer is | ||
/// undefined after the call. | ||
/// | ||
/// Behavior is undefined if the requested size is 0 or the alignment is not a | ||
/// power of 2. The alignment must be no larger than the largest supported page | ||
/// size on the platform. | ||
/// | ||
/// The `old_size` and `align` parameters are the parameters that were used to | ||
/// create the allocation referenced by `ptr`. | ||
#[allocator(reallocate)] | ||
pub fn reallocate(ptr: *mut u8, old_size: usize, size: usize, align: usize) -> *mut u8 { | ||
... | ||
} | ||
|
||
/// Resizes the allocation referenced by `ptr` to `size` bytes without moving | ||
/// it. | ||
/// | ||
/// The new size of the allocation is returned. This must be at least | ||
/// `old_size`. The allocation must always remain valid. | ||
/// | ||
/// Behavior is undefined if the requested size is 0 or the alignment is not a | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we go for "Behavior is undefined if the requested size is less than There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I just copied these docs out of |
||
/// power of 2. The alignment must be no larger than the largest supported page | ||
/// size on the platform. | ||
/// | ||
/// The `old_size` and `align` parameters are the parameters that were used to | ||
/// create the allocation referenced by `ptr`. | ||
/// | ||
/// This function is optional. The default implementation simply returns | ||
/// `old_size`. | ||
#[allocator(reallocate_inplace)] | ||
pub fn reallocate_inplace(ptr: *mut u8, old_size: usize, size: usize, align: usize) -> usize { | ||
... | ||
} | ||
``` | ||
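
As a rough illustration of these contracts, the sketch below implements the four required functions on top of the system allocator. It is written against today's stable `std::alloc` API rather than the proposed `#[allocator(...)]` attributes, which are omitted, and the `layout_for` helper is hypothetical. It departs from the signatures above in two deliberate ways: `deallocate` and `reallocate` are marked `unsafe`, and a zero size or invalid alignment yields a null pointer instead of undefined behavior.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::ptr;

/// Hypothetical helper: rejects zero sizes and non-power-of-two alignments.
fn layout_for(size: usize, align: usize) -> Option<Layout> {
    if size == 0 {
        return None;
    }
    Layout::from_size_align(size, align).ok()
}

/// Returns `size` bytes aligned to `align`, or null on failure.
pub fn allocate(size: usize, align: usize) -> *mut u8 {
    match layout_for(size, align) {
        Some(layout) => unsafe { System.alloc(layout) },
        None => ptr::null_mut(),
    }
}

/// Like `allocate`, but the returned memory is zero-initialized.
pub fn allocate_zeroed(size: usize, align: usize) -> *mut u8 {
    match layout_for(size, align) {
        Some(layout) => unsafe { System.alloc_zeroed(layout) },
        None => ptr::null_mut(),
    }
}

/// Frees `ptr`. The caller must pass the size and alignment that were used
/// to create the allocation, and `ptr` must not be null.
pub unsafe fn deallocate(ptr: *mut u8, old_size: usize, align: usize) {
    let layout = unsafe { Layout::from_size_align_unchecked(old_size, align) };
    unsafe { System.dealloc(ptr, layout) };
}

/// Resizes the allocation to `size` bytes, possibly moving it. On failure,
/// returns null and leaves the original allocation intact.
pub unsafe fn reallocate(ptr: *mut u8, old_size: usize, size: usize, align: usize) -> *mut u8 {
    if layout_for(size, align).is_none() {
        return ptr::null_mut();
    }
    let old = unsafe { Layout::from_size_align_unchecked(old_size, align) };
    unsafe { System.realloc(ptr, old, size) }
}
```

Passing `old_size` and `align` back on deallocation is what lets the sketch rebuild the `Layout` without the allocator storing any per-allocation metadata of its own.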
|
||
Note that `usable_size` has been removed, as it is not used anywhere in the | ||
standard library. | ||
|
||
The allocator functions must be publicly accessible, but can have any name and | ||
be defined in any module. However, it is recommended to use the names above in | ||
the crate root to minimize confusion. | ||
|
||
An allocator must provide all functions with the exception of | ||
`reallocate_inplace`. New functions can be added to the API in the future in a | ||
similar way to `reallocate_inplace`. | ||
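
One way a caller might use the optional function is to try resizing in place first and fall back to a moving reallocation only when that fails. The sketch below assumes this calling pattern; it is not specified by this RFC. `reallocate_inplace` is reimplemented as a plain function with the default behavior, `grow` is a hypothetical helper, and the fallback uses `std::alloc::System` from today's stable API.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Stand-in for the default `reallocate_inplace`: it never resizes and
/// simply reports the old size.
pub fn reallocate_inplace(_ptr: *mut u8, old_size: usize, _size: usize, _align: usize) -> usize {
    old_size
}

/// Hypothetical caller-side helper: grows an allocation to `size` bytes,
/// preferring an in-place resize and falling back to a moving one.
/// Returns null if the fallback reallocation fails.
pub unsafe fn grow(ptr: *mut u8, old_size: usize, size: usize, align: usize) -> *mut u8 {
    // The returned value is the usable size after the in-place attempt.
    if reallocate_inplace(ptr, old_size, size, align) >= size {
        return ptr;
    }
    let old = unsafe { Layout::from_size_align_unchecked(old_size, align) };
    unsafe { System.realloc(ptr, old, size) }
}
```

With the default implementation the in-place attempt never satisfies a larger request, so `grow` always takes the fallback path. An allocator that can extend blocks in place would change only `reallocate_inplace`.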
|
||
## Using an allocator | ||
|
||
The functions that an allocator crate defines can be called directly, but most | ||
usage will happen through the *global allocator* interface located in | ||
`std::heap`. This module exposes a set of functions identical to those described | ||
above, but that call into the global allocator. To select the global allocator, | ||
a crate declares it via an `extern crate` annotated with `#[allocator]`: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Clarification request: Can all crates do this? As mentioned in another comment, I would conservatively expect this choice to be left to the root crate, as with panic runtimes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As written, any crate can do this, yeah. I would be fine restricting allocator selection to the root crate if it simplifies the implementation - I can't think of any strong reasons for needing to select an allocator in a non-root crate. |
||
|
||
```rust | ||
#[allocator] | ||
extern crate jemalloc; | ||
``` | ||
|
||
As its name would suggest, the global allocator is a global resource - all | ||
crates in a dependency tree must agree on the selected global allocator. If two | ||
or more distinct allocator crates are selected, compilation will fail. Note that | ||
multiple crates can select a global allocator as long as that allocator is the | ||
same across all of them. In addition, a crate can depend on an allocator crate | ||
without declaring it to be the global allocator by omitting the `#[allocator]` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it make sense to restrict this choice to "root crates" (executables, staticlibs, cdylibs) analogously to how the panic strategy is chosen? [1] I can't think of a good reason for a library to require a particular allocator, and it seems like it could cause a ton of pain (and fragmentation) to mix multiple allocators within one application. [1]: It's true that the codegen option There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I share this concern. Allowing libraries to require a particular global allocator could create rifts in the crate ecosystem, where different sets of libraries cannot be used together because they require different global allocators. Allocators share the same interface, and so the optimal allocator will depend on the workload of the binary. It seems like the crate root author will be in the best position to make this choice, since they'll have insight into the workload type, as well as be able to run holistic benchmarks. Thus is seems like a good idea to restrict global allocator selection to the crate root author. |
||
annotation. | ||
|
||
## Standard library | ||
|
||
The standard library will gain a new stable crate - `alloc_system`. This is the | ||
default allocator crate and corresponds to the "system" allocator (i.e. `malloc` | ||
etc on Unix and `HeapAlloc` etc on Windows). | ||
|
||
The `alloc::heap` module will be reexported in `std` and stabilized. It will | ||
simply contain functions matching directly to those defined by the allocator | ||
API. The `alloc` crate itself may also be stabilized at a later date, but this | ||
RFC does not propose that. | ||
|
||
The existing `alloc_jemalloc` may continue to exist as an implementation detail | ||
of the Rust compiler, but it will never be stabilized. Applications wishing to | ||
use jemalloc can use a third-party crate from crates.io. | ||
|
||
# How We Teach This | ||
[how-we-teach-this]: #how-we-teach-this | ||
|
||
The term "allocator" is the canonical one for this concept. It is unfortunately | ||
shared with a similar but distinct concept described in [RFC 1398][], which | ||
defined an `Allocator` trait over which collections can be parameterized. This API | ||
is disambiguated by referring specifically to the "global" or "default" | ||
allocator. | ||
|
||
Global allocator selection would be a somewhat advanced topic - the system | ||
allocator is sufficient for most use cases. It is a new tool that developers can | ||
use to optimize for their program's specific workload when necessary. | ||
|
||
It should be emphasized that in most cases, the "terminal" crate (i.e. the bin, | ||
cdylib or staticlib crate) should be the only thing selecting the global | ||
allocator. Libraries should be agnostic over the global allocator unless they | ||
are specifically designed to augment functionality of a specific allocator. | ||
|
||
Defining an allocator is an even more advanced topic that should probably live | ||
in the _Nomicon_. | ||
|
||
[RFC 1398]: https://github.com/rust-lang/rfcs/pull/1398 | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Dropping the default of jemalloc will regress performance of some programs until | ||
they manually opt back into that allocator, which may produce confusion in the | ||
community as to why things suddenly became slower. | ||
|
||
The allocator APIs are to some extent designed after what jemalloc supports, | ||
which is quite a bit more than the system allocator is able to provide. The Rust | ||
wrappers for those simpler allocators have to jump through hoops to ensure that | ||
all of the requirements are met. | ||
|
||
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
We could require that at most one crate selects a global allocator in the crate | ||
graph, which may simplify the implementation. | ||
|
||
The allocator APIs could be simplified to a more "traditional" | ||
malloc/calloc/free API at the cost of an efficiency loss when using allocators | ||
with more powerful APIs. | ||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
It is currently forbidden to pass a null pointer to `deallocate`, though this is | ||
guaranteed to be a no-op with libc's `free` at least. Some kinds of patterns in C | ||
are cleaner when null pointers can be `free`d - is the same true for Rust? |
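
For comparison, accepting null would cost a single branch in a wrapper. This is a minimal sketch that assumes today's `std::alloc::System` as the underlying allocator; `deallocate_nullable` is a hypothetical name and not part of this proposal.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Hypothetical null-tolerant variant of `deallocate`: a null `ptr` is a
/// no-op, mirroring the behavior of libc's `free`.
pub unsafe fn deallocate_nullable(ptr: *mut u8, old_size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    // The caller must still pass the size and alignment used at allocation.
    let layout = unsafe { Layout::from_size_align_unchecked(old_size, align) };
    unsafe { System.dealloc(ptr, layout) };
}
```

Whether the extra branch belongs in the allocator or in the calling code is the trade-off the question above leaves open.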
Review comments:
- This is now out of date, as we are just identifying the static?
- Yeah, will fix
- Oh, no, this section is a description of the API as it exists today, not as it will exist in the future.
- @sfackler I was also confused by this when I first read the section. Would it be possible to relabel the section as "What we have now" or something?