
Mimalloc Allocator #643

Merged
merged 735 commits into mmtk:master on Dec 5, 2022
Conversation

@paigereeves (Contributor) commented Aug 17, 2022

This pull request introduces an allocator based on Microsoft's mimalloc (https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf) used in a MarkSweep GC. This closes #348.

This PR has a few leftover issues: #688

General changes in this PR (if requested, I can separate these changes to different PRs):

  • destroy_mutator():
    • now takes &mut Mutator instead of Box<Mutator>. Reclaiming the Mutator memory is up to the binding. This allows the binding to copy the Mutator value into its thread-local data structure, in which case Rust cannot reclaim the memory as if it were a Box.
    • now calls each allocator for its allocator-specific on_destroy() behavior.
  • Extract Chunk and ChunkMap from Immix, and make them general for other policies to use.
  • Changes in VMBinding constants:
    • LOG_MIN_ALIGNMENT is removed. We still provide MIN_ALIGNMENT. This avoids the confusion of a binding overriding one constant but not the other.
    • MAX_ALIGNMENT_SHIFT is removed. We provide MAX_ALIGNMENT for the same reason as above.
    • Add USE_ALLOCATION_OFFSET: bool. If a binding never uses an allocation offset (i.e. offset == 0), it can set this to false, and the MMTk core can use this for some optimizations.
  • Changes in ObjectModel:
    • Add UNIFIED_OBJECT_REFERENCE_ADDRESS: bool to allow a binding to tell us whether the object reference uses the same address as its object start and its to_address.
    • Add OBJECT_REF_OFFSET_LOWER_BOUND to allow a binding to tell us roughly where the object reference points to (with respect to the allocation address).
    • Change related methods to cope with this change.
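The destroy_mutator() change above can be sketched as follows. This is a minimal illustration of the ownership change only: the types here are stand-ins, not the real mmtk-core definitions.

```rust
// Simplified sketch of the destroy_mutator() signature change. The Mutator
// struct and on_destroy() body are illustrative stand-ins.
struct Mutator {
    destroyed: bool, // stands in for allocator state to be torn down
}

impl Mutator {
    // Stand-in for the per-allocator on_destroy() hook.
    fn on_destroy(&mut self) {
        self.destroyed = true;
    }
}

// Old signature: fn destroy_mutator(mutator: Box<Mutator>), where dropping
// the Box freed the memory. New signature: the binding owns the memory and
// the core only tears down allocator state.
fn destroy_mutator(mutator: &mut Mutator) {
    mutator.on_destroy();
    // Deliberately no deallocation here: if the binding copied the Mutator
    // value into its thread-local storage, only the binding can reclaim it.
}

fn main() {
    let mut m = Mutator { destroyed: false };
    destroy_mutator(&mut m);
    assert!(m.destroyed);
}
```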

Mark sweep changes in this PR:

  • Add two features for mark sweep:
    • eager_sweeping: sweep unused memory eagerly in each GC. Without this feature, unused memory is swept when we attempt to allocate from it.
    • malloc_mark_sweep: the same as our previous malloc-based mark sweep. When this is used, the mark sweep plan uses MallocSpace and MallocAllocator.
  • Move the current policy::mallocspace to policy::marksweepspace::malloc_ms.
  • Add policy::marksweepspace::native_ms for the mark sweep space with mimalloc.
  • Add util::alloc::free_list_allocator.

@wks mentioned this pull request on Nov 11, 2022
Comment on lines +344 to +366
impl<VM: VMBinding> GCWork<VM> for SweepChunk<VM> {
    #[inline]
    fn do_work(&mut self, _worker: &mut GCWorker<VM>, _mmtk: &'static MMTK<VM>) {
        debug_assert!(self.space.chunk_map.get(self.chunk) == ChunkState::Allocated);
        // Number of allocated blocks.
        let mut allocated_blocks = 0;
        // Iterate over all allocated blocks in this chunk.
        for block in self
            .chunk
            .iter_region::<Block>()
            .filter(|block| block.get_state() != BlockState::Unallocated)
        {
            if !block.attempt_release(self.space) {
                // Block is live. Increment the allocated block count.
                allocated_blocks += 1;
            }
        }
        // Set this chunk as free if there are no live blocks.
        if allocated_blocks == 0 {
            self.space.chunk_map.set(self.chunk, ChunkState::Free)
        }
    }
}
Collaborator

I have some radical thoughts about this. Maybe this SweepChunk work packet is completely unnecessary.

If each BlockList is owned by exactly one mutator (or the global MarkSweepSpace), and each Block is owned by exactly one BlockList, then we can traverse all existing blocks simply by traversing all BlockList instances, i.e. visiting the BlockLists in every mutator and the lists in MarkSweepSpace::abandoned_lists. Doing so lets blocks be swept while the current thread has exclusive access to its BlockList, removing the need for the lock() in BlockList entirely.

But I prefer doing this under the protection of Rust's ownership model. From the current code, it looks like there is no way for Block instances to be created anywhere else (other than MarkSweepSpace::acquire_block), held anywhere else, or discarded before they are swept. Rust's ownership model can enforce that, but currently our Block type implements Copy, has pointer semantics, and bypasses the ownership model. I elaborated my thoughts here: #696.
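The ownership idea above can be sketched in a few lines. This is an illustrative reduction, not mmtk-core code: if each Block is owned by exactly one BlockList (no Copy, no shared pointers), sweeping with &mut access needs no lock because the borrow checker rules out concurrent access.

```rust
// Sketch of lock-free sweeping under exclusive ownership. Block and
// BlockList are simplified stand-ins for the mmtk-core types.
struct Block {
    live: bool,
}

struct BlockList {
    // Exclusive ownership: each Block lives in exactly one list.
    blocks: Vec<Block>,
}

impl BlockList {
    // Sweeping takes &mut self, so no other thread (or work packet) can
    // touch these blocks concurrently; no lock() is needed.
    fn sweep(&mut self) {
        self.blocks.retain(|b| b.live);
    }
}

fn main() {
    let mut list = BlockList {
        blocks: vec![Block { live: true }, Block { live: false }],
    };
    list.sweep();
    assert_eq!(list.blocks.len(), 1); // the dead block was released
}
```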

qinsoon added a commit that referenced this pull request Nov 25, 2022
This PR removes most of our uses of `ObjectReference::to_address()`, replacing them with `ObjectModel::ref_to_address()`. With this change, the address we get from an object reference is always guaranteed to be within the allocation. Thus, the changes in #555 are reverted in this PR.

This PR also addresses the comments raised in #643 (comment).

Changes:
* Replace our use of `ObjectReference::to_address()` with `ObjectModel::ref_to_address()`
* Changes to `ObjectReference`:
  * Rename `ObjectReference::to_address()` to `ObjectReference::to_raw_address()`, and make it clear that we should avoid using it.
  * Remove `Address::to_object_reference()`, and add `ObjectReference::from_raw_address()`. Make it clear that we should avoid using it.
* Changes to `ObjectModel`:
  * add `address_to_ref`, which does the opposite of `ref_to_address`
  * add `ref_to_header`
  * rename `object_start_ref` to `ref_to_object_start`, to make it consistent with other methods.
* Change `Treadmill` to store `ObjectReference` instead of `Address`. We previously stored the object start address in `Treadmill` and assumed the alloc bit is set on the object start address. With the changes in this PR, the alloc bit is no longer set on the object start address; I made changes accordingly.
* Remove the 'object ref guard' code, which dealt with the case where an object reference (address) may not be in the same chunk as the allocation. That should no longer happen.
* `alloc_bit::is_alloced_object(address)` now returns an `Option<ObjectReference>`. We may consider returning `Option<ObjectReference>` from our API `is_mmtk_object()` as well, but I did not include that change in this PR.
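The renamed conversion methods can be illustrated for a hypothetical binding whose object references point a fixed 8 bytes past the object start. The OBJECT_REF_OFFSET constant and the use of usize in place of Address are assumptions for this sketch, not the real API.

```rust
// Sketch of the ObjectModel conversions described above, with usize
// standing in for Address/ObjectReference. Layout is hypothetical.
const OBJECT_REF_OFFSET: usize = 8;

// ref_to_object_start(): object reference -> start of the allocation.
fn ref_to_object_start(object_ref: usize) -> usize {
    object_ref - OBJECT_REF_OFFSET
}

// ref_to_address(): an address guaranteed to be within the allocation.
// In this layout the reference itself already satisfies that.
fn ref_to_address(object_ref: usize) -> usize {
    object_ref
}

// address_to_ref(): the inverse of ref_to_address().
fn address_to_ref(addr: usize) -> usize {
    addr
}

fn main() {
    let obj = 0x2010_057f_cb0_usize;
    assert_eq!(ref_to_object_start(obj), obj - 8);
    // The pair must round-trip:
    assert_eq!(address_to_ref(ref_to_address(obj)), obj);
}
```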
@qinsoon (Member) commented Nov 28, 2022

I believe I have addressed the issues you pointed out. Can you review the PR again and let me know if there are any further changes required? @wks @wenyuzhao

@wks (Collaborator) left a comment

Mostly OK. Only some small problems remain.

@wks (Collaborator) commented Nov 29, 2022

It looks good to me now.

@wks (Collaborator) commented Nov 30, 2022

I managed to get the mmtk-ruby binding working. However, if I switch from malloc MarkSweep to native MarkSweep, it crashes in post_alloc when allocating new objects after GC, because the VO-bit is already set. (FYI, Ruby uses the "is_mmtk_object" feature, which turns on the "global_alloc_bit" feature.)

I am still investigating.

@wks (Collaborator) commented Nov 30, 2022

I think the implementation of bzero_metadata is unsound when we use it to clear a bit range rather than whole pages. The log shows that bzero_metadata sometimes clears 0 bytes when clearing the VO-bits for a 56-byte cell.

[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff1, end: 0xc0804015ff2, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fca8
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fcb0, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff2, end: 0xc0804015ff3, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fce0
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fce8, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff3, end: 0xc0804015ff4, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd18
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd20, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff4, end: 0xc0804015ff5, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd50
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd58, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff5, end: 0xc0804015ff6, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd88
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd90, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff6, end: 0xc0804015ff7, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fdc0
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fdc8, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff7, end: 0xc0804015ff8, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fdf8
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe00, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff8, end: 0xc0804015ff8, len: 0
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fe30
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe38, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff8, end: 0xc0804015ff9, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fe68
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe70, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff9, end: 0xc0804015ffa, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fea0
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fea8, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffa, end: 0xc0804015ffb, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fed8
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fee0, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffb, end: 0xc0804015ffc, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff10
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff18, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffc, end: 0xc0804015ffd, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff48
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff50, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffd, end: 0xc0804015ffe, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff80
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff88, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffe, end: 0xc0804015fff, len: 1
[2022-11-30T09:25:55Z INFO  mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ffb8
[2022-11-30T09:25:55Z INFO  mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ffc0, size: 56
[2022-11-30T09:25:55Z INFO  mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015fff, end: 0xc0804015fff, len: 0

bzero_metadata uses memset to set byte ranges. The granularity of VO-bit metadata is one bit per 8 bytes, so a 56-byte cell corresponds to 7 bits in the VO-bit metadata. As a result, if we use bzero_metadata to clear VO-bits in naive_brute_force_sweep, then for every 8 consecutive cells, 7 of them will have the start of their VO-bits in one byte and the end in another, while 1 of them has both the start and the end of its VO-bits in the same byte. That explains the log.
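The arithmetic can be reproduced with a small sketch. meta_byte_range() below is an illustrative stand-in for what a memset-based bzero_metadata effectively computes, not the real implementation; the shift constants follow the one-bit-per-8-bytes VO-bit layout, and the addresses come from the log above.

```rust
// Map a data range to the whole metadata bytes a memset-based zeroing
// routine can touch. Partial bytes at either end are simply dropped.
const LOG_DATA_BYTES_PER_BIT: usize = 3; // one VO-bit covers 8 data bytes
const LOG_BITS_PER_BYTE: usize = 3;

fn meta_byte_range(data_start: usize, size: usize) -> (usize, usize) {
    let bit_start = data_start >> LOG_DATA_BYTES_PER_BIT;
    let bit_end = (data_start + size) >> LOG_DATA_BYTES_PER_BIT;
    // memset can only address whole bytes:
    (bit_start >> LOG_BITS_PER_BYTE, bit_end >> LOG_BITS_PER_BYTE)
}

fn main() {
    // From the log: bzero_alloc_bit start 0x2010057fe00, size 56.
    // All 7 VO-bits fall inside a single metadata byte, so zero whole
    // bytes are cleared and the stale bits survive.
    let (b0, b1) = meta_byte_range(0x2010057fe00, 56);
    assert_eq!(b1 - b0, 0);
    // The next cell's 7 bits straddle a byte boundary: one whole byte is
    // cleared, which also zeroes bits belonging to a neighbouring cell.
    let (c0, c1) = meta_byte_range(0x2010057fe38, 56);
    assert_eq!(c1 - c0, 1);
}
```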

(FYI, for Ruby, obj.to_raw_address() == ObjectModel::ref_to_address(obj) == ObjectModel::ref_to_object_start(obj) + 8)

I think it is unsound to clear only 0 or 1 whole bytes. We should clear exactly the bit range the object occupies, which may span multiple bytes and may start or end in the middle of a byte.
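A bit-granular alternative can be sketched as below. This is a deliberately naive illustration of the suggested behaviour (clear exactly the occupied bit range, whatever bytes it touches), not the fix that went into mmtk-core.

```rust
// Clear exactly the bit range [bit_start, bit_end) in a metadata byte
// array, one bit at a time. Simple rather than fast; a real version would
// mask the partial first/last bytes and memset the whole bytes between.
fn bzero_bit_range(meta: &mut [u8], bit_start: usize, bit_end: usize) {
    for bit in bit_start..bit_end {
        meta[bit / 8] &= !(1u8 << (bit % 8)); // clear a single bit
    }
}

fn main() {
    // A 56-byte cell = 7 VO-bits. Entirely inside one byte:
    let mut meta = [0xFFu8, 0xFF];
    bzero_bit_range(&mut meta, 1, 8);
    assert_eq!(meta, [0x01, 0xFF]); // only bits 1..8 cleared
    // Straddling a byte boundary:
    let mut meta2 = [0xFFu8, 0xFF];
    bzero_bit_range(&mut meta2, 7, 14);
    assert_eq!(meta2, [0x7F, 0xC0]); // neighbours' bits survive
}
```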

@wks (Collaborator) left a comment

There is a problem related to clearing the VO-bit metadata. It may affect JikesRVM as well.

qinsoon added a commit that referenced this pull request Dec 2, 2022
This PR fixes a bug in `bzero_metadata()` discussed in #643 (comment). When the data address range to be bulk-zeroed cannot be mapped to whole bytes in the side metadata, other bits in those bytes, which should not be updated, get zeroed unexpectedly.
@qinsoon (Member) commented Dec 2, 2022

binding-refs
OPENJDK_BINDING_REF=mimalloc-ms-support
JIKESRVM_BINDING_REF=mimalloc-ms-support
V8_BINDING_REF=update-pr-643

@qinsoon added the PR-testing label (Run binding tests for the pull request; deprecated: use PR-extended-testing instead) on Dec 2, 2022
@wks (Collaborator) left a comment

LGTM

udesou pushed a commit to udesou/mmtk-core that referenced this pull request Dec 5, 2022
@qinsoon merged commit 6e1b4df into mmtk:master on Dec 5, 2022
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
This PR addresses the issues we discussed in mmtk#643 (comment). Basically, Rust expects `From<T>` to always succeed and suggests using `TryFrom<T>` if the conversion may fail. For us, turning an address into a region may fail, so we should not use `From<T>`; but we also do not need the error handling that `TryFrom<T>` implies. Thus we just provide two methods: `Region::from_unaligned_address()` and `Region::from_aligned_address()`.
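The design choice above can be sketched as follows. The Region type, its field, and the region size constant are simplified stand-ins for illustration; only the two constructor names mirror the commit.

```rust
// Explicit constructors instead of From<Address>/TryFrom<Address>:
// From must not fail, and TryFrom forces error handling we do not need.
#[derive(Debug, PartialEq)]
struct Region(usize);

const LOG_BYTES: usize = 12; // assume a 4 KiB region for this sketch

impl Region {
    // Caller guarantees alignment; check it in debug builds only.
    fn from_aligned_address(addr: usize) -> Region {
        debug_assert_eq!(addr % (1 << LOG_BYTES), 0);
        Region(addr)
    }

    // For callers holding an arbitrary interior address: align down.
    fn from_unaligned_address(addr: usize) -> Region {
        Region(addr & !((1 << LOG_BYTES) - 1))
    }
}

fn main() {
    assert_eq!(Region::from_aligned_address(0x1234_5000), Region(0x1234_5000));
    assert_eq!(Region::from_unaligned_address(0x1234_5678), Region(0x1234_5000));
}
```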
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
Co-authored-by: Yi Lin <qinsoon@gmail.com>
Successfully merging this pull request may close these issues: Mimalloc Port (Free List Allocator and Mark Sweep Space)
4 participants