Allocator integration #42313

Merged
merged 10 commits into from
Jun 20, 2017

pnkfelix
Member

@pnkfelix pnkfelix commented May 30, 2017

Let's start getting some feedback on trait Alloc.

Here is:

  • the trait Alloc itself,
  • the struct Layout and enum AllocErr that its API relies on
  • a struct HeapAlloc that exposes the system allocator as an instance of Alloc
  • an integration of Alloc with RawVec
  • an integration of Alloc with Vec

TODO

  • split fn realloc_in_place into grow and shrink variants
  • add # Unsafety and # Errors sections to documentation for all relevant methods
  • remove Vec integration with Allocator
  • add allocate_zeroed impl to HeapAllocator
  • remove typedefs e.g. type Size = usize;
  • impl trait Error for all error types in PR
  • make Layout::from_size_align public
  • clarify docs of fn padding_needed_for.
  • revise Layout constructors to ensure that size+align combination is valid
  • resolve mismatch re requirements of align on dealloc. See comment.

@rust-highfive
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@pnkfelix pnkfelix mentioned this pull request May 30, 2017
12 tasks
@pnkfelix pnkfelix force-pushed the allocator-integration branch 2 times, most recently from 971fd63 to 9f69470 Compare May 31, 2017 14:29
@aidanhs
Member

aidanhs commented Jun 1, 2017

@pnkfelix while this is waiting for review, looks like there are some failures with debuginfo tests which need updating e.g.

[00:45:43] error: line not found in debugger output: $1 = Vec<i32>(len: [...], cap: [...])[...]
[...]
[00:45:43] $1 = Vec<i32, alloc::heap::HeapAllocator>(len: 140737488348824, cap: 140737488348832) = {1431658224, 21845, -8392704, 32767, 8388608, 0, 1433759768, 0, -6976, 32767, -6160, 32767, 0, 0, 0, 0, -157155296, 32767, 1, 0, 1, 0, 0, 0, 0, 0, -140028416, 32767, -6448, 32767, -136386733, 32767, 0, 0, 0, 0, -157155296, 32767, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 134217732, 0, 0, 0, 0, 0, 0, 0, -6668, 32767, -138951296, 32767, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 116, 0, 116, 0, 0, 0, -6572, 32767, -138951296, 32767, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, -6152, 32767, 0, 0, 0, 0, 1431658512, 21845, -136415008, 32767, 0, 0, 1431654144, 21845, -6160, 32767, 0, 0, 0, 0, -6384, 32767, 1431658387, 21845, 1431658400, 21845, 1431654144, 16799061, -6152, 32767, 1, 0, 1431658400, 21845, -146085840, 32767, 0, 0, -6152, 32767, 0, 1, 1431658336, 21845, 0, 0, -1848772228, 91821683, 1431654144, 21845, -6160, 32767, 0, 0, 0, 0, -523372164, 1345078054, -513934980, 1345081932, 0, 0, 0, 0, 0, 0, 1, 0, 1431658336, 21845, 1431658512, 21845, 0, 0, 0, 0, 1431654144, 21845, -6160, 32767, 0, 0, 1431654185, 21845...}
[00:45:43] A debugging session is active.

@aidanhs aidanhs added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 1, 2017
Contributor

@gereeter gereeter left a comment

I'm really happy to see this, and it looks like the integration works quite smoothly, which is great. I certainly don't see any fundamental issues.

I'm a teensy bit disappointed (but not surprised) to not see my grow/shrink suggestion (comment); should I submit an RFC for that change?

ptr: Address,
layout: Layout,
new_layout: Layout) -> Result<(), CannotReallocInPlace> {
let (_, _, _) = (ptr, layout, new_layout);
Contributor

Can this check usable_size and allow the reallocation if the new layout fits within the usable size of the old layout? See also my comment here, which seemed fairly uncontroversial.

let s = new_layout.size();
// All Layout alignments are powers of two, so a comparison
// suffices here (rather than resorting to a `%` operation).
if min <= s && s <= max && new_layout.align() <= layout.align() {
Contributor

Can this instead just call realloc_in_place and rely on that to do the usable_size check? See also my comment here, which seemed fairly uncontroversial.

let size = size as isize;
let p = self.alloc(layout);
if let Ok(p) = p {
for i in 0..size {
Contributor

Can this instead use core::ptr::write_bytes? That should be more efficient.

Member Author

Ah yes I briefly skimmed through the core::ptr source but didn't look at the reexports, so I missed this. Will fix.
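
For readers following along, a minimal sketch of the suggested change. It is written against today's `std::alloc` free functions rather than the `Alloc` trait in this PR, and the helper names `alloc_zeroed_via_write_bytes` and `zeroed_block_is_all_zero` are made up for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

/// Hypothetical helper: allocate `layout` and zero it with a single
/// `ptr::write_bytes` call instead of a byte-by-byte loop.
///
/// Safety: `layout` must have non-zero size.
unsafe fn alloc_zeroed_via_write_bytes(layout: Layout) -> Option<*mut u8> {
    // SAFETY: the caller guarantees a non-zero size.
    let p = unsafe { alloc(layout) };
    if p.is_null() {
        return None;
    }
    // SAFETY: `p` points to `layout.size()` writable bytes.
    unsafe { ptr::write_bytes(p, 0, layout.size()) };
    Some(p)
}

/// Allocates 64 bytes, checks that every byte is zero, then frees the block.
fn zeroed_block_is_all_zero() -> bool {
    let layout = Layout::from_size_align(64, 8).unwrap();
    unsafe {
        let p = alloc_zeroed_via_write_bytes(layout).expect("allocation failed");
        let all_zero = std::slice::from_raw_parts(p, 64).iter().all(|&b| b == 0);
        dealloc(p, layout);
        all_zero
    }
}
```

`write_bytes` takes a count of elements, which for `*mut u8` is a byte count, so no `isize` loop index is involved.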

/// not a strict requirement. (Specifically: it is *legal* to use
/// this trait to wrap an underlying native allocation library
/// that aborts on memory exhaustion.)
unsafe fn alloc(&mut self, layout: Layout) -> Result<Address, AllocErr>;
Contributor

I didn't see this mentioned in the RFC discussion, but it occurs to me... why should this be unsafe? The main reason I can see is that allocators can assume that the layouts they are passed are non-zero, but that ought to be specified in the documentation for alloc.

Contributor

Also, that seems more like something to consider for alloc_unchecked. Should alloc return Unsupported for zero-sized allocations?

Member Author

Hmm. I personally don't think it is a burden to make fn alloc an unsafe method, since most clients will either need to immediately use unsafe code to initialize the allocated state, or will be passing along an uninitialized block of memory which means that they themselves will probably be unsafe fn's as well...

But I agree that it's important to spell out all of the criteria for what safe input arguments are, e.g. requiring layouts have non-zero size...

Member Author

@pnkfelix pnkfelix Jun 1, 2017

Actually, now that I've gone and read the man page for malloc and jemalloc more carefully, I no longer think that passing a size of zero is undefined behavior...? It says it's implementation-defined whether you get back a null or a valid non-null address (that you're nonetheless not allowed to dereference), but I don't think it allows for arbitrary UB, does it?

Member Author

(still, even if it's not UB, it doesn't actually solve our own problem, since the alloc::heap implementation of fn check_size_and_alignment includes a debug assert that size != 0, so we would need to either sidestep that or include an appropriate check up here...)

/// Creates a layout describing the record for `n` instances of
/// `self`, with a suitable amount of padding between each.
///
/// Requires non-zero `n` and no arithmetic overflow from inputs.
Contributor

The requirement for non-zero n is not checked anywhere, even in the (checked) repeat function. It doesn't really seem necessary (and is counterintuitive for me) that 0 is not allowed.

Member Author

@pnkfelix pnkfelix Jun 1, 2017

We have to make a choice about where we are going to attempt to enforce that people do not make zero-sized memory requests to the allocator.

Given how many iterations the API for Layout and Allocator went through, at different times I had different models in my head about whose responsibility it was to ensure that the underlying system allocator did not receive requests for zero-sized memory...

anyway, I'll try to resolve this. not 100% sure which way I'll go (I don't think I have a clear "right direction" from prior discussions on the RFC thread...)

Member Author

(At this point I'm leaning towards "Layout can have zero size; it is the responsibility of the client to ensure either that the Layout it passes to an allocator has positive size, or that it is using an allocator that can handle zero-sized requests.")
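
A rough sketch of what "client's responsibility" could look like in practice, assuming the client sits on top of today's `std::alloc` free functions (which require non-zero sizes). The names `alloc_or_dangling` and `release` are hypothetical and not part of this PR.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical client-side wrapper: a zero-sized request is never
/// forwarded to the underlying allocator.
fn alloc_or_dangling(layout: Layout) -> Option<NonNull<u8>> {
    if layout.size() == 0 {
        // A non-null, suitably aligned address that must never be
        // dereferenced or handed to `dealloc`.
        return NonNull::new(layout.align() as *mut u8);
    }
    // SAFETY: the size was just checked to be non-zero.
    NonNull::new(unsafe { alloc(layout) })
}

/// Counterpart to `alloc_or_dangling`.
///
/// Safety: `ptr` must come from `alloc_or_dangling(layout)` and must not
/// have been released already.
unsafe fn release(ptr: NonNull<u8>, layout: Layout) {
    if layout.size() != 0 {
        // SAFETY: non-zero-sized blocks were obtained from `alloc(layout)`.
        unsafe { dealloc(ptr.as_ptr(), layout) };
    }
}
```

The zero-size check lives in exactly one place here, so the allocator underneath can keep its `size != 0` debug assertion.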

let old_size = layout.size();
let result = self.alloc(new_layout);
if let Ok(new_ptr) = result {
ptr::copy(ptr as *const u8, new_ptr, cmp::min(old_size, new_size));
Contributor

I think this can be copy_nonoverlapping. It would be an invalid allocator implementation to have two outstanding allocations that overlap.
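
A sketch of the allocate-copy-free fallback with that substitution, again using the `std::alloc` free functions as a stand-in for the trait methods. `realloc_by_copy` and `grow_preserves_prefix` are illustrative names, not code from the PR.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::{cmp, ptr};

/// Hypothetical fallback path: allocate a new block, copy, free the old one.
///
/// Safety: `old_ptr` must have been allocated with `old_layout`, and
/// `new_layout` must have non-zero size.
unsafe fn realloc_by_copy(
    old_ptr: *mut u8,
    old_layout: Layout,
    new_layout: Layout,
) -> Option<*mut u8> {
    // SAFETY: the caller guarantees a non-zero size.
    let new_ptr = unsafe { alloc(new_layout) };
    if new_ptr.is_null() {
        return None; // the old block is left untouched on failure
    }
    let n = cmp::min(old_layout.size(), new_layout.size());
    // SAFETY: two live allocations never overlap, so the non-overlapping
    // copy is valid; both blocks are at least `n` bytes long.
    unsafe {
        ptr::copy_nonoverlapping(old_ptr, new_ptr, n);
        dealloc(old_ptr, old_layout);
    }
    Some(new_ptr)
}

/// Grows a 4-byte block to 8 bytes and returns the first 4 bytes afterwards.
fn grow_preserves_prefix() -> Vec<u8> {
    let old = Layout::from_size_align(4, 1).unwrap();
    let new = Layout::from_size_align(8, 1).unwrap();
    unsafe {
        let p = alloc(old);
        assert!(!p.is_null());
        for i in 0..4 {
            p.add(i).write(i as u8 + 1);
        }
        let q = realloc_by_copy(p, old, new).expect("allocation failed");
        let prefix = std::slice::from_raw_parts(q, 4).to_vec();
        dealloc(q, new);
        prefix
    }
}
```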

/// are set to zero before being returned.
unsafe fn alloc_zeroed(&mut self, layout: Layout) -> Result<Address, AllocErr> {
let size = layout.size();
if size > ::core::isize::MAX as usize {
Contributor

Why is this check here?

Member Author

because one can currently construct a Layout that overflows isize?

I suppose I could instead just say that the state of the contents is unspecified if the layout size overflows isize. Does that seem preferable?

Member Author

ah it doesn't matter with the switch to ptr::write_bytes, yay!

let new_align = cmp::max(self.align, next.align);
let realigned = Layout { align: new_align, ..*self };
let pad = realigned.padding_needed_for(new_align);
let offset = self.size() + pad;
Contributor

This should probably use checked_add (and below).

pub fn padding_needed_for(&self, align: Alignment) -> usize {
debug_assert!(align <= self.align());
let len = self.size();
let len_rounded_up = (len + align - 1) & !(align - 1);
Contributor

Arguably, this should actually be wrapping arithmetic? That will produce the right answer in all cases, even when len + align overflows.
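
To make the wrapping-arithmetic claim concrete, here is a standalone sketch of the same rounding formula as a free function (the PR's version is a method on `Layout`; taking `len` and `align` as plain integers is an assumption made so the example stands alone).

```rust
/// Padding needed after `len` bytes to reach the next multiple of `align`.
/// `align` must be a power of two.
fn padding_needed_for(len: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    // Same formula as `(len + align - 1) & !(align - 1)`, but every step
    // wraps instead of panicking in debug builds.
    let len_rounded_up = len.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1);
    // If the rounded-up length wrapped to a small value, this subtraction
    // wraps back and still yields the true padding.
    len_rounded_up.wrapping_sub(len)
}
```

For `len = usize::MAX` and `align = 8` the rounded-up length wraps to 0, and `0 - usize::MAX` wraps to 1, which is the correct padding. The padded size itself still does not fit in a `usize`, so a caller that adds `len` and the padding together would need its own overflow check.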

ptr: Address,
layout: Layout,
new_layout: Layout) -> Result<Excess, AllocErr> {
let usable_size = self.usable_size(&new_layout);
Contributor

One other possible default implementation would be to copy the realloc default implementation, but use alloc_excess instead of alloc. This would work better if the default realloc algorithm were used (and alloc_excess was more precise than usable_size). However, I don't think it's worth it, since people are probably more likely to just override realloc and forget about realloc_excess.

Member Author

yeah I think this would only win if you were hooking in an allocator that both

  1. overrides fn alloc_excess, and
  2. does not override fn realloc_excess.

Probably better to just advise allocator implementors to understand what the default implementations do when deciding which methods to override.

if usable >= new_layout.size() { Ok(()) } else { Err(CannotReallocInPlace) }
}

unsafe fn alloc_unchecked(&mut self, layout: Layout) -> Option<*mut u8> {
Contributor

Just curious, why not Option<NonZero<*mut u8>> ?

Member Author

the early RFC drafts used NonZero, but there was concern about exposing that as part of the API for this type. See e.g. aturon's comment here

@pnkfelix
Member Author

pnkfelix commented Jun 1, 2017

@gereeter wrote:

I'm a teensy bit disappointed (but not surprised) to not see my grow/shrink suggestion (comment); should I submit an RFC for that change?

No, I just forgot to incorporate it into this draft. But it is on the checklist on the description for #32838 and I'll try to put it in while I address the other great feedback you have posted here.

@pnkfelix pnkfelix force-pushed the allocator-integration branch from f21b714 to d5baa4e Compare June 1, 2017 15:15
@aturon
Member

aturon commented Jun 1, 2017

cc @rust-lang/libs

@sfackler @alexcrichton @gereeter any of you want to be official reviewer here?

@pnkfelix pnkfelix added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 2, 2017
@alexcrichton alexcrichton assigned alexcrichton and unassigned aturon Jun 2, 2017
@alexcrichton
Member

Sure, I don't mind reviewing!

I read over this and the high-level comments I had were:

  • Doc-wise I think this may benefit from the standard # Unsafety and # Errors section. Basically anything that's unsafe should have # Unsafety explaining why and anything that returns a Result should have an # Errors section explaining what errors can be returned.
  • Could you remind me why &mut self is needed for allocators? For global allocators we want &self and I noticed recently that an Arena type on crates.io also uses &self. I'm forgetting the precise details but just wanted to refresh my memory.
  • I wonder if we may wish to hold off on changing Vec just yet? I think it's great to change RawVec while we're here but we may just wish to take it slowly in pushing this through to other collections while we make sure everything else looks ok. The debugger stuff breaking here is just one possible example of what I'm thinking.
  • In the meantime we added allocate_zeroed to the global alloc::heap API, perhaps it could be implemented on HeapAllocator?

Some API nits I also had are below. I'm ambivalent about whether we must resolve these questions before landing.

  • I'm not personally a huge fan of typedefs like type Size = usize as it mostly just confuses me each time to go back and read the original location. These are publicly exposed as well from the API. I'd personally just throw usize and *mut u8 everywhere, but curious what others think!
  • I was thinking that the realloc function was trying a little too hard by default, perhaps it could just immediately fall back to alloc/dealloc? Or is there a use case for implementing the other functions and not reimplementing realloc?
  • I found CannotReallocInPlace a little odd, perhaps a normal AllocErr could just be returned?
  • Could the Error trait be implemented for all error types?
  • Could Layout::from_size_align get exposed? That's what's proposed being used in the global allocator RFC as well.
  • Does padding_needed_for need to take an argument? I originally thought, when reading the documentation and function name, that you wouldn't pass an alignment but would instead just learn about the current padding in the Layout.
  • Does usable_size need to return a minimum? I sort of thought it would return a Layout originally. Is there a use case for asking for some memory but then actually using less?

@gereeter
Contributor

gereeter commented Jun 3, 2017

Could you remind me why &mut self is needed for allocators? For global allocators we want &self and I noticed recently that an Arena type on crates.io also uses &self. I'm forgetting the precise details but just wanted to refresh my memory.

To quote the RFC text:

The justification for &mut self is this:

  • It does not restrict allocator implementors from making sharable allocators: to do so, just do impl<'a> Allocator for &'a MySharedAlloc, as illustrated in the DumbBumpPool example.

  • &mut self is better than &self for simple allocators that are not sharable. &mut self ensures that the allocation methods have exclusive access to the underlying allocator state, without resorting to a lock. (Another way of looking at it: It moves the onus of using a lock outward, to the allocator clients.)

  • One might think that the points made above apply equally well to self (i.e., if you want to implement an allocator that wants to take itself via a &mut-reference when the methods take self, then do impl<'a> Allocator for &'a mut MyUniqueAlloc).

    However, the problem with self is that if you want to use an allocator for more than one allocation, you will need to call clone() (or make the allocator parameter implement Copy). This means in practice all allocators will need to support Clone (and thus support sharing in general, as discussed in the Allocators and lifetimes section).

    (Remember, I'm thinking about allocator-parametric code like Vec<T, A:Allocator>, which does not know if the A is a &mut-reference. In that context, therefore one cannot assume that reborrowing machinery is available to the client code.)

    Put more simply, requiring that allocators implement Clone means that it will not be practical to do impl<'a> Allocator for &'a mut MyUniqueAlloc.

    By using &mut self for the allocation methods, we can encode the expected use case of an unshared allocator that is used repeatedly in a linear fashion (e.g. vector that needs to reallocate its backing storage).
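
The argument quoted above can be sketched with a deliberately simplified trait. Everything here is hypothetical: `OffsetAlloc` hands out offsets into a fixed pool instead of raw pointers, and `UniqueBump`, `SharedBump` and `alloc_twice` are illustrative names, not the RFC's `DumbBumpPool`.

```rust
use std::cell::Cell;

/// Simplified stand-in for the allocator trait: returns the start offset
/// of the new block, or `None` when the pool is exhausted.
trait OffsetAlloc {
    fn alloc(&mut self, size: usize) -> Option<usize>;
}

/// Unshared allocator: `&mut self` gives exclusive access, so plain
/// fields suffice and no lock or `Cell` is needed.
struct UniqueBump {
    next: usize,
    cap: usize,
}

impl OffsetAlloc for UniqueBump {
    fn alloc(&mut self, size: usize) -> Option<usize> {
        let end = self.next.checked_add(size)?;
        if end > self.cap {
            return None;
        }
        let start = self.next;
        self.next = end;
        Some(start)
    }
}

/// Sharable allocator: uses interior mutability, and the trait is
/// implemented for a shared reference to it.
struct SharedBump {
    next: Cell<usize>,
    cap: usize,
}

impl<'a> OffsetAlloc for &'a SharedBump {
    fn alloc(&mut self, size: usize) -> Option<usize> {
        let start = self.next.get();
        let end = start.checked_add(size)?;
        if end > self.cap {
            return None;
        }
        self.next.set(end);
        Some(start)
    }
}

/// Allocator-parametric client code: owns its `A` and allocates
/// repeatedly without needing `Clone`.
fn alloc_twice<A: OffsetAlloc>(mut a: A) -> (Option<usize>, Option<usize>) {
    (a.alloc(8), a.alloc(8))
}
```

Passing `&pool` to `alloc_twice` twice keeps bumping the same `SharedBump`, while a `UniqueBump` is simply moved into the client, which is the linear, unshared use case the RFC text describes.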

I was thinking that the realloc function was trying a little too hard by default, perhaps it could just immediately fall back to alloc/dealloc? Or is there a use case for implementing the other functions and not reimplementing realloc?

  • Why wouldn't you want the default implementation to be as efficient as possible? There should be no overhead for implementations that don't implement usable_size, grow_in_place, or shrink_in_place, since the comparisons can all be optimized out.

  • There absolutely is a reason for implementing other functions and not reimplementing realloc. In fact, I can't see why any allocator would ever override realloc unless it was calling mmap and wanted to use mremap for reallocation, which can shuffle pages around. Consider:

    • An arena or other form of bump allocator can do nothing special on realloc. However, if they have some minimum alignment, they might very well have a nontrivial implementation of usable_size, which currently leads to good savings on realloc by default.
    • Any allocator based on size classes (basically all general purpose allocators, fixed size freelist allocators ("refrigerators"), buddy allocators) have nontrivial usable_size, but can rarely do anything interesting in realloc besides trying to reallocate in place. Even general purpose allocators that rely on mmap probably can't do anything, since mmap only works with a page granularity.
    • Buddy allocators support reallocating in place quite well, but if that isn't an option, alloc and dealloc is the way to go.
    • At least jemalloc doesn't like using mremap anyway, since it increases fragmentation.

    Essentially, I see the default implementation here as what you want in basically every case.

I found CannotReallocInPlace a little odd, perhaps a normal AllocErr could just be returned?

Returning Exhausted doesn't make sense, as you generally have more memory, just not in that location. Returning Unsupported doesn't make sense, as what is unsupported for in-place allocation may be totally different from what is unsupported for the allocator as a whole. Additionally, failure in this case doesn't really seem like an "error", per se. We expect failure on a regular basis, and that doesn't have any implication on future calls, unlike Exhausted and Unsupported.

More minorly, I'm a little worried about efficiency - I see the in-place methods as quick checks that are done first, mostly, and so those don't want a complex error reporting system.
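
As an illustration of how light such a dedicated type can be, here is a sketch of a unit-struct error that still implements `Error`. The `Display` message and the toy `grow_in_place` signature (sizes only, no pointer or `Layout`) are assumptions for the example, not the PR's definitions.

```rust
use std::error::Error;
use std::fmt;

/// Zero-sized "no" answer from an in-place attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CannotReallocInPlace;

impl fmt::Display for CannotReallocInPlace {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("cannot reallocate in place")
    }
}

impl Error for CannotReallocInPlace {}

/// Toy in-place grow check: succeeds when the requested size already
/// fits in the block's usable size.
fn grow_in_place(usable: usize, new_size: usize) -> Result<(), CannotReallocInPlace> {
    if new_size <= usable {
        Ok(())
    } else {
        Err(CannotReallocInPlace)
    }
}
```

The error carries no payload, so reporting failure costs no more than returning a flag, while callers still get a nameable type rather than `()`.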

Does padding_needed_for need to take an argument? I originally thought, when reading the documentation and function name, that you wouldn't pass an alignment but would instead just learn about the current padding in the Layout.

I may be misunderstanding what you mean here, but the amount of padding you need to hit a given alignment will increase with the alignment of the parameter - e.g., an i8 needs 1 byte of padding to be followed by an i16, but it needs 3 bytes of padding to be followed by an i32. The method can't just return the tail padding we'd usually include in structs to pad a Layout to its own alignment, since padding_needed_for is used in extend, which follows with an arbitrary Layout.
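
The i8 example can be checked with the `Layout::extend` method as it exists in today's standard library (its signature in this PR may differ). `padding_between` is a made-up helper name.

```rust
use std::alloc::Layout;

/// Number of padding bytes inserted between a value laid out as `first`
/// and a following value laid out as `next`.
fn padding_between(first: Layout, next: Layout) -> usize {
    // `extend` returns the combined layout and the offset of `next` in it.
    let (_combined, offset) = first.extend(next).expect("layout overflow");
    offset - first.size()
}
```

With `first = Layout::new::<i8>()`, the result depends on the alignment of what follows, which is why the method needs the extra argument.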

Does usable_size need to return a minimum? I sort of thought it would return a Layout originally. Is there a use case for asking for some memory but then actually using less?

So, this is interesting because reading your comment I realized that my intuition was wrong. As far as I understand it, the minimum returned from usable_size is not the minimum size that you can use (because obviously you can do whatever you want with your memory, including not using all of it). Rather, it returns the minimum size that you can still validly pass into dealloc for the same allocation. Put another way, it returns the next smaller size class of the allocator (+1 byte? the fact that this bound is inclusive is annoying, but since you want to return 0 sometimes, exclusive would be worse).

The reasonable case I can think of is in something like Vec::shrink_in_place, where you don't really want to copy over the whole array if it won't actually save memory. If I want to make sure I don't have random excess memory usage (or, more likely in the relevant small cases, I just don't want to store a capacity), then it makes sense to call this unconditionally after building my Vec. However, if the Vec's growth pattern matches the allocator's growth pattern (reasonably likely), then a copy is useless, since the newly allocated vector will take up the same amount of memory.

See also the motivation comment for including the minimum.
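
A toy model of that reasoning, assuming an allocator with power-of-two size classes (an assumption for illustration only; real allocators use other class schemes). Both function names are hypothetical, and `usable_size` here takes a bare size rather than a `Layout`.

```rust
/// Inclusive range of sizes that land in the same block as `size`
/// under power-of-two size classes.
fn usable_size(size: usize) -> (usize, usize) {
    let max = size.next_power_of_two();
    // Anything at or below the previous class would be served from a
    // smaller block, so the lower bound is one byte above it.
    let min = if max <= 1 { 0 } else { max / 2 + 1 };
    (min, max)
}

/// A shrink to `new_size` saves no memory, and so needs no copy,
/// when the new size still falls in the old block's class.
fn can_shrink_without_moving(old_size: usize, new_size: usize) -> bool {
    let (min, max) = usable_size(old_size);
    min <= new_size && new_size <= max
}
```

A 100-byte request occupies a 128-byte block in this model, so shrinking it to 70 bytes can stay in place, while shrinking to 64 bytes would move to a smaller class.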

if let Ok(()) = self.grow_in_place(ptr, layout.clone(), new_layout.clone()) {
return Ok(ptr);
}
} else if new_size < old_size {
Contributor

It seems unfortunate that if new_size == old_size then this always allocates (when allocating is never necessary). Presumably either one of these two cases should cover that (my vote) or there should be a final else branch.

Member Author

Oh yeah, good point. I know I was also non-plussed by the handling of same size inputs, but cannot remember why I didn't just resolve it in the manner you suggest. (Apart from just being unused to the new grow/shrink methods and possibly forgetting if they required the new size to be strictly greater/less than the old one...)
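
One way the suggested fix could look, sketched on a toy trait that tracks only sizes. `ToyAlloc`, `Outcome` and `SizeClassed` are hypothetical, and the in-place hooks refusing by default is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    InPlace,
    Moved,
}

trait ToyAlloc {
    /// Both hooks refuse unless an implementation overrides them.
    fn grow_in_place(&mut self, _old: usize, _new: usize) -> bool {
        false
    }
    fn shrink_in_place(&mut self, _old: usize, _new: usize) -> bool {
        false
    }

    fn realloc(&mut self, old: usize, new: usize) -> Outcome {
        // `>=` rather than `>`: a same-size request takes the grow path
        // instead of skipping both fast paths and allocating afresh.
        let in_place = if new >= old {
            self.grow_in_place(old, new)
        } else {
            self.shrink_in_place(old, new)
        };
        if in_place {
            Outcome::InPlace
        } else {
            Outcome::Moved // stands in for alloc + copy + dealloc
        }
    }
}

/// Pretends every block has 64 usable bytes and only supports growing.
struct SizeClassed;

impl ToyAlloc for SizeClassed {
    fn grow_in_place(&mut self, _old: usize, new: usize) -> bool {
        new <= 64
    }
}
```

This only works if `grow_in_place` is specified to accept `new == old`, which is the ambiguity mentioned in the reply above.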

}
}

/// The `CannotReallocInPlace` error is used when `fn realloc_in_place`
Contributor

Regardless of style, realloc_in_place has now been replaced by grow_in_place and shrink_in_place.

}
}

unsafe fn grow_in_place(&mut self,
Contributor

I think the heap allocator should have an identical implementation of shrink_in_place?

Member Author

Hmm I had thought that there was some reason that the spec for reallocate_inplace made it hard to detect error. But now I don't know exactly why I had reached that conclusion. Will review.

pub struct HeapAllocator;

unsafe impl Allocator for HeapAllocator {
unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
Contributor

Also, is there a reason that this says *mut u8 instead of Address?

new_layout: Layout) -> Result<(), CannotReallocInPlace> {
let _ = ptr; // this default implementation doesn't care about the actual address.
debug_assert!(new_layout.size <= layout.size);
if new_layout.align != layout.align {
Contributor

Can we just require that new_layout.align == layout.align for defined behavior? I can't see that this would be hard for users to follow, and it makes sense to me (obviously you won't change the alignment if you don't change the pointer).

This would require adding the check in the default implementation of realloc, but that doesn't seem to be a problem.

We could just only require that new_layout.align <= layout.align, but I don't see that being useful for users, and it might be annoying for allocator implementations, since different alignments might result in different size classes.

Member Author

I don't have a problem with requiring the old and new alignments to be the same.

// makes it hard to detect failure if it does not hold.
debug_assert!(new_layout.size() >= layout.size());

if layout.align() != new_layout.align() { // reallocate_inplace requires this.
Contributor

At the very least, the comment here seems incorrect. reallocate_inplace doesn't even see new_layout.align().

Member Author

... but that's exactly the issue. reallocate_inplace doesn't take enough inputs to deal with distinct old and new alignments. It just says that it's required to take the old alignment as its align parameter, and that it will produce a block that is aligned according to that.

So if you give this method distinct alignments for old and new, then reallocate_inplace is not going to be able to satisfy the request.

@pnkfelix
Member Author

pnkfelix commented Jun 5, 2017

@alexcrichton I'll add todo's corresponding to your high-level comments to the PR description.

  • # Unsafety and # Errors ==> will add a todo

  • Regarding &mut self, we hashed this out for quite a while on the RFC.

    • Also, alex and I discussed this on IRC; I hypothesize alex was concluding from the PR that the code here does not demonstrate any Allocator implementation that requires &mut self rather than &self.
    • (However, this would be an incorrect conclusion; the BoundedAlloc used in raw_vec::tests::allocator_param does require &mut self as it is currently written, and would have to be changed to use Cell or something if we moved to &self.)
    • In any case, it might be a good idea to come up with example Allocator implementations that more strongly motivate &mut self. But I'm not sure if that needs to be part of this PR.
  • Regarding the changes to Vec, I'm happy to remove Vec for now. Less for me to rebase. I just figured that RawVec alone might not serve as a sufficient example of allocator integration (i.e. sanity checking that adding the allocator parameter doesn't cause the world to break.) Anyway, the proof-of-concept has been put up here for the world to see, so that means I can remove it. :) ==> will add a todo

  • regarding allocate_zeroed ==> will add a todo.

  • regarding typedefs: I'll stop fighting the tide here and remove the typedefs. (They paid for themselves more effectively when they were e.g. type Size = NonZero<usize>;) ==> will add a todo.

  • regarding realloc, originally it was just a simple wrapper around alloc/dealloc. But @gereeter convinced me to add the early fast paths to grow and shrink.

@pnkfelix
Member Author

pnkfelix commented Jun 5, 2017

(continuing previous comment responding to @alexcrichton 's points...)

  • If you don't like CannotReallocInPlace, then I could be talked into returning Result<(), ()>. But I do not see a reason to return Result<(), AllocErr>. So what do you prefer: Err(CannotReallocInPlace) or Err(())? (Given parallel discussions in the lang team about automatically promoting () to Ok(()), I am inclined to continue with Result<(), CannotReallocInPlace>.)

  • Re impl Error trait for all error types ==> will add todo

  • Could Layout::from_size_align get exposed ==> will add todo

  • Does padding_needed_for need to take an alignment: I think @gereeter has done a good job of addressing why there is a separate argument here. (Reading the discussion between alex and @gereeter leads me to conclude that the documentation for padding_needed_for does indeed need work.)

  • usable_size definitely needs to return a minimum. You cannot demote blocks of memory to arbitrarily smaller size classes, as noted by @gereeter above, which was something that I overlooked for a long time when designing this API.

@pnkfelix pnkfelix added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 5, 2017
@pnkfelix
Member Author

pnkfelix commented Jun 5, 2017

@gereeter wrote:

Rather, it returns the minimum size that you can still validly pass into dealloc for the same allocation. Put another way, it returns the next smaller size class of the allocator (+1 byte? the fact that this bound is inclusive is annoying, but since you want to return 0 sometimes, exclusive would be worse).

Hmm. While I personally think an inclusive bound is easy for clients, an exclusive bound would probably be fine for clients too, and would probably be less annoying for implementors... The point about returning 0 is perhaps the only one in favor of keeping the bound inclusive, but so much of the API already says that you're not supposed to pass in layouts of zero-size...

In any case, if we're going to revise this we should decide soon. It's the kind of corner case that will be impossible to revise after the fact...
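To make the size-class reading of usable_size concrete, here is a minimal sketch. The size classes and the free function `usable_size` below are hypothetical and are not taken from this PR. The lower bound is inclusive, so a request that lands in the smallest class reports a lower bound of 0.

```rust
/// Hypothetical size classes for a toy allocator, in ascending order.
const SIZE_CLASSES: [usize; 4] = [16, 32, 64, 128];

/// Returns inclusive `(lower, upper)` bounds for a request of `size` bytes,
/// or `None` if the request exceeds the largest class.
fn usable_size(size: usize) -> Option<(usize, usize)> {
    // Index of the smallest class that can hold `size`.
    let idx = SIZE_CLASSES.iter().position(|&class| class >= size)?;
    let upper = SIZE_CLASSES[idx];
    // One byte past the next smaller class; 0 when there is no smaller class.
    let lower = if idx == 0 { 0 } else { SIZE_CLASSES[idx - 1] + 1 };
    Some((lower, upper))
}
```

With an exclusive lower bound the `+ 1` would disappear, but the smallest class could no longer express that 0 is allowed, which is the trade-off described above.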

@pnkfelix
Member Author

pnkfelix commented Jun 5, 2017

Reviewing the API and documentation, I am reminded that the current Allocator trait defines the notion of whether a block "fits" a layout; however, when defining what this means with respect to alignment, it just says: "The block's starting address must be aligned to layout.align()"

This was an intuitive definition, and in practice it may be fine, but it may or may not be "the right thing" when it comes to the requirements of arbitrary allocators.

In particular, it does not match the requirements of alloc::heap::deallocate, which says:

The old_size and align parameters are the parameters that were used to
create the allocation referenced by ptr. The old_size parameter may be
any value in range_inclusive(requested_size, usable_size).

I.e. this does not allow the align to vary from the one used at the time that the block of memory was requested. (The Allocator trait, as written, does allow this, since its definition of "fits" is based on the dynamic value of the returned address, which might by chance happen to conform to alignments that were not part of the original request.)
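The gap between the two rules can be sketched as follows. Both predicates are hypothetical illustrations, not code from the PR: `fits_loose` models the trait's address-based definition, and `fits_strict` models the alloc::heap::deallocate requirement that align match the original request.

```rust
/// True if `addr` happens to satisfy `align` (assumed to be a power of two).
fn is_aligned(addr: usize, align: usize) -> bool {
    addr & (align - 1) == 0
}

/// Loose rule: only the block's actual starting address is inspected.
fn fits_loose(addr: usize, layout_align: usize) -> bool {
    is_aligned(addr, layout_align)
}

/// Strict rule: `layout_align` must equal the alignment used at allocation time.
fn fits_strict(requested_align: usize, layout_align: usize) -> bool {
    requested_align == layout_align
}
```

A block requested with alignment 8 that happens to land at address 0x1000 passes the loose check for alignment 64 but fails the strict one.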

@ruuda has been pointing out for quite a while (here and elsewhere) that this is a potential problem with both the Allocator trait and the lower level custom global allocator.

Anyway, I point this out mostly to say that the current Allocator trait specification is probably too loose when compared to the global allocator atop which it is layered.

@alexcrichton
Member

@gereeter

To quote the RFC text:

Thanks for the link! Gives me some extra fodder for global allocator traits.

There should be no overhead for implementations that don't implement usable_size, grow_in_place, or shrink_in_place, since the comparisons can all be optimized out.

That's a good point! My worry here, though, was that you'd do a lot of work to get a "no" on either the grow/shrink paths and then do more of the same work when you do the full reallocation. You later mentioned that it's pretty reasonable to not implement realloc at all which makes me continue to worry about this. There's presumably shared work between allocate and something like grow_in_place, right? If the grow_in_place function returns an error then that'd be duplicated by default on realloc?

I'd be ok with this default implementation if we'd assume that allocators which support grow/shrink in place also reimplement realloc, but it sounds like you're not expecting that? Do you think the "shared work" here is small enough to not matter?

Additionally, failure in this case doesn't really seem like an "error", per se. We expect failure on a regular basis, and that doesn't have any implication on future calls, unlike Exhausted and Unsupported.

Ok, your reasoning makes sense to me! If it's normal to fail I wonder if we could consider just returning a bool? That seems to me like it may actually be a good fit for this API. The footgun here is that you forget to check the return value (which using Result will warn you about), but I'm not sure how often this is called in practice to warrant that?

In any case this'll likely be an unstable API for a while and it's otherwise just "one more error type" so it's not that bad either way, I was just curious to get some more rationale written out.
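For reference, the fallback pattern under discussion can be sketched with a toy bump arena. Everything here is an assumption made for illustration: the `Bump` type, its offset-based interface, and the simplified signatures do not come from the PR. The sketch shows a `realloc` that first tries `grow_in_place` and only then allocates and copies, with the in-place attempt reporting failure through a `Result` so that an ignored return value triggers a compiler warning.

```rust
/// Hypothetical stand-in for the error type discussed above.
#[derive(Debug, PartialEq)]
struct CannotReallocInPlace;

/// Toy bump arena that hands out offsets into `buf` rather than raw pointers.
struct Bump {
    buf: Vec<u8>,
    top: usize, // first free offset
}

impl Bump {
    fn new(capacity: usize) -> Self {
        Bump { buf: vec![0; capacity], top: 0 }
    }

    /// Allocates `size` bytes, returning the block's starting offset.
    fn alloc(&mut self, size: usize) -> Option<usize> {
        let end = self.top.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        let start = self.top;
        self.top = end;
        Some(start)
    }

    /// Succeeds only when the block is the most recent allocation and room remains.
    fn grow_in_place(
        &mut self,
        start: usize,
        old: usize,
        new: usize,
    ) -> Result<(), CannotReallocInPlace> {
        let is_last = start + old == self.top;
        if is_last && new >= old && start + new <= self.buf.len() {
            self.top = start + new;
            Ok(())
        } else {
            Err(CannotReallocInPlace)
        }
    }

    /// Default-style `realloc`: try in place first, otherwise allocate and copy.
    fn realloc(&mut self, start: usize, old: usize, new: usize) -> Option<usize> {
        if self.grow_in_place(start, old, new).is_ok() {
            return Some(start);
        }
        let fresh = self.alloc(new)?;
        let keep = old.min(new);
        self.buf.copy_within(start..start + keep, fresh);
        Some(fresh)
    }
}
```

In this toy the in-place check is a couple of comparisons, so the duplicated work is negligible; whether that holds for a real allocator is exactly the open question raised above.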

So, this is interesting because reading your comment I realized that my intuition was wrong. ...

Ok all this about returning a minimum makes sense to me, thanks!


The last API question I had was about Layout::padding_needed_for. My main point was that the current implementation doesn't actually use self.align at all, so it seemed like it was odd that you passed in align versus using the stored alignment. I sort of thought you'd do something like layout.aligned(foo).padding_needed() or something like that.

The actual thing the function does makes total sense to me, I was basically just trying to rationalize why align was passed in and not used as self.align. Does that make sense?
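As a point of reference, the computation can be sketched as a free function over a size and an alignment. This is a hedged reconstruction of the usual round-up-to-multiple formula, not a copy of the PR's code; writing it as a free function makes visible that only the size and the passed-in alignment take part.

```rust
/// Bytes of padding to add after `size` bytes so that the next address is a
/// multiple of `align`. Only meaningful when `align` is a power of two.
fn padding_needed_for(size: usize, align: usize) -> usize {
    // Round `size` up to the next multiple of `align` using a bit mask.
    let rounded_up = size.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1);
    rounded_up.wrapping_sub(size)
}
```

For a size of 9 and an alignment of 4 this yields 3, since 9 + 3 = 12 is the next multiple of 4.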

/// padding required to get a 4-aligned address (assuming that the
/// corresponding memory block starts at a 4-aligned address).
///
/// Behavior undefined if `align` is not a power-of-two.
Member

This may be a bit strong for this method, specifically "behavior undefined" in the sense that this method isn't unsafe so it's technically not allowed to do undefined things for any input. Perhaps rewording this to something like:

The return value of this function has no meaning if align is not a power-of-two.

or something like that?

Contributor

Any reason not to just assert!(align > 0)?

Member Author

The return value of this function has no meaning if align is not a power-of-two.

I agree this is a more appropriate way to phrase this.

Contributor

From the internal comment:

align is guaranteed to be > 0, so align - 1 is always valid.

So maybe instead have the sentence be:

The return value of this function has no meaning if align is not a positive power of two

(also, stylistically, I think it should be "power of two" rather than "power-of-two" (the latter would be appropriate if it were used as an adjective, e.g. "this type has a power-of-two alignment"))

Any reason not to just assert!(align > 0)?

Please ignore me; I really don't know what I was thinking when I wrote that 😄
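A caller that cannot guarantee the precondition could wrap the computation in a checked helper. The function below is a hypothetical sketch, not part of the PR; it relies on `usize::is_power_of_two`, which returns false for 0, so it covers both the "positive" and the "power of two" halves of the proposed wording.

```rust
/// Returns the padding needed to round `size` up to a multiple of `align`,
/// or `None` if `align` is not a positive power of two or the sum overflows.
fn checked_padding(size: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    let rounded_up = size.checked_add(mask)? & !mask;
    Some(rounded_up - size)
}
```

Without the check, the mask-based formula applied to a size of 8 and an alignment of 3 produces 0 even though 8 is not a multiple of 3, which is the sense in which the return value "has no meaning".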

@alexcrichton
Member

Also FWIW I'm sort of trying to head off differences between this and the global allocators RFC which is currently leaning on a custom trait that's relatively different from this one. I'd personally prefer to stick to one Allocator trait given the reasoning here, though!

@joshlf
Contributor

joshlf commented Jun 6, 2017

@pnkfelix

In any case, it might be a good idea to come up with example Allocator implementations that more strongly motivate &mut self.

I've been working on a crate for a while now that implements a Slab Allocator (among others). It is a single-threaded algorithm, and thus all of my methods are &mut self. Note that this is technically an object allocator (it's SlabAllocator<T>, and allocates only T objects), but:

  • Object allocators and general allocators are very similar
  • If we eventually add an ObjectAllocator<T> trait, I think it'd be odd if it weren't basically symmetrical with the Allocator trait.

(side note: I'm not ready to publish the crate yet, so it's still in a private repo, but I'd be happy to add anybody who was curious to take a look)
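To illustrate why a single-threaded object allocator naturally takes &mut self, here is a toy slab. It is a hypothetical sketch and is not the crate mentioned above: slots live in a `Vec`, vacant indices are kept on a free list, and both operations mutate that bookkeeping without any synchronization.

```rust
/// Toy single-threaded slab of `T` slots, identified by index.
struct Slab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // indices of vacant slots
}

impl<T> Slab<T> {
    fn new() -> Self {
        Slab { slots: Vec::new(), free: Vec::new() }
    }

    /// Stores `value`, reusing a vacant slot when one exists.
    fn alloc(&mut self, value: T) -> usize {
        match self.free.pop() {
            Some(idx) => {
                self.slots[idx] = Some(value);
                idx
            }
            None => {
                self.slots.push(Some(value));
                self.slots.len() - 1
            }
        }
    }

    /// Vacates slot `idx`, returning its value if the slot was occupied.
    fn dealloc(&mut self, idx: usize) -> Option<T> {
        let value = self.slots.get_mut(idx)?.take();
        if value.is_some() {
            self.free.push(idx);
        }
        value
    }
}
```

Taking &self instead would force interior mutability (for example a `RefCell` or a lock) onto an algorithm that does not otherwise need it.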

@bors
Contributor

bors commented Jun 19, 2017

📌 Commit 55a629d has been approved by alexcrichton

@bors
Contributor

bors commented Jun 19, 2017

⌛ Testing commit 55a629d with merge 2615e54d274d11da3522640bbcce19b0479122f1...

@bors
Contributor

bors commented Jun 20, 2017

💔 Test failed - status-travis

@Mark-Simulacrum
Member

@bors retry

timeout on os x builders

@bors
Contributor

bors commented Jun 20, 2017

⌛ Testing commit 55a629d with merge 1143eb2...

bors added a commit that referenced this pull request Jun 20, 2017
Allocator integration

Let's start getting some feedback on `trait Alloc`.

Here is:
 * the `trait Alloc` itself,
 * the `struct Layout` and `enum AllocErr` that its API relies on
 * a `struct HeapAlloc` that exposes the system allocator as an instance of `Alloc`
 * an integration of `Alloc` with `RawVec`
 * ~~an integration of `Alloc` with `Vec`~~

 TODO
 * [x] split `fn realloc_in_place` into `grow` and `shrink` variants
 * [x] add `# Unsafety` and `# Errors` sections to documentation for all relevant methods
 * [x] remove `Vec` integration with `Allocator`
 * [x] add `allocate_zeroed` impl to `HeapAllocator`
 * [x] remove typedefs e.g. `type Size = usize;`
 * [x] impl `trait Error` for all error types in PR
 * [x] make `Layout::from_size_align` public
 * [x] clarify docs of `fn padding_needed_for`.
 * [x] revise `Layout` constructors to ensure that [size+align combination is valid](#42313 (comment))
 * [x] resolve mismatch re requirements of align on dealloc. See [comment](#42313 (comment)).
@bors
Contributor

bors commented Jun 20, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 1143eb2 to master...

@bors bors merged commit 55a629d into rust-lang:master Jun 20, 2017
csssuf added a commit to csssuf/rust-uefi that referenced this pull request Jun 21, 2017
The Alloc trait for allocators was added in Rust nightly 2017/06/21
(from rust-lang/rust#42313). With the addition of allowing a global
allocator to be specified (pending in rust-lang/rust#42727) this will
obviate the need for the alloc_uefi crate.
csssuf added a commit to csssuf/rust-uefi that referenced this pull request Jul 6, 2017
The Alloc trait for allocators was added in Rust nightly 2017/06/21
(from rust-lang/rust#42313). With the addition of allowing a global
allocator to be specified (pending in rust-lang/rust#42727) this will
obviate the need for the alloc_uefi crate.
@joshlf
Contributor

joshlf commented Jul 6, 2017

Is there a reason that Layout isn't Copy? I'm planning on opening an issue to change that, but I wanted to make sure there wasn't a good reason for the omission first.

@pnkfelix
Member Author

@joshlf Well, a reason Layout is not Copy is that I have wanted to allow for the freedom to track other info (such as the tree structure formed by composing layouts together) within a Layout on non-standard builds. Representing a tree structure would almost certainly preclude having Layout be Copy-able.

More succinctly: I didn't think it was a good idea to build in that sort of constraint, especially for a type that I suspect is going to be somewhat niche.

@joshlf
Contributor

joshlf commented Jul 13, 2017

@pnkfelix Fair enough; that makes sense.

@joshlf
Contributor

joshlf commented Jul 13, 2017

@pnkfelix Actually, a follow-up on this. What I've been doing in my code is just cloning Layouts when I need to pass them to alloc, dealloc, etc. However, if a Layout could be a tree-like structure, then that would make cloning somewhat expensive, at least when compared to the speed we expect out of a call to alloc for a performant allocator. Thus, maybe we want to have Alloc's methods take their Layout arguments by reference rather than by value?
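The cost concern can be sketched with a hypothetical tree-shaped layout. `TreeLayout` below is invented for illustration and is not the real Layout type: its derived `Clone` copies every node, whereas a function that borrows it reads the request without copying anything.

```rust
/// Hypothetical layout that records how it was composed from other layouts.
#[derive(Clone, Debug, PartialEq)]
struct TreeLayout {
    size: usize,
    align: usize,
    children: Vec<TreeLayout>,
}

impl TreeLayout {
    fn leaf(size: usize, align: usize) -> Self {
        TreeLayout { size, align, children: Vec::new() }
    }

    /// Number of nodes a deep clone has to copy.
    fn node_count(&self) -> usize {
        1 + self.children.iter().map(TreeLayout::node_count).sum::<usize>()
    }
}

/// Borrowing lets an allocator read the request without cloning the tree.
fn bytes_requested(layout: &TreeLayout) -> usize {
    layout.size
}
```

A by-value signature would make each call either consume the caller's layout or pay for a clone proportional to `node_count`.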

bors added a commit that referenced this pull request Aug 13, 2017
Optimize allocation paths in RawVec

Since the `Alloc` trait was introduced (#42313) and it was integrated everywhere (#42727) there have been some slowdowns and regressions that have slipped through. The intention of this PR is to try to tackle at least some of them, but they've been very difficult to quantify up to this point so it probably doesn't solve everything.

This PR primarily targets the `RawVec` type, specifically the `double` function. The codegen for this function is now much closer to what it was before #42313 landed as many runtime checks have been elided.
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.