Mimalloc Allocator (mmtk#643)

This pull request introduces an allocator based on Microsoft's mimalloc (https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf), used in a MarkSweep GC. This closes mmtk#348.

This PR has a few leftover issues: see mmtk#688.

General changes in this PR (if requested, I can separate these changes into different PRs):
* `destroy_mutator()`:
  * now takes `&mut Mutator` instead of `Box<Mutator>`. Reclaiming the `Mutator` memory is now up to the binding, which allows the binding to copy the `Mutator` value into its own thread-local data structure; in that case Rust cannot reclaim the memory as if it were a `Box` (see the sketch after this list).
  * now calls into each allocator for its allocator-specific `on_destroy()` behavior.
* Extract `Chunk` and `ChunkMap` from Immix, and make them general so that other policies can use them.
* Changes in `VMBinding` constants:
  * `LOG_MIN_ALIGNMENT` is removed. We still provide `MIN_ALIGNMENT`. This avoids the confusion where a binding might override one constant but not the other.
  * `MAX_ALIGNMENT_SHIFT` is removed. We provide `MAX_ALIGNMENT` for the same reason as above.
  * Add `USE_ALLOCATION_OFFSET: bool`. If a binding never uses an allocation offset (i.e. `offset == 0`), it can set this to `false`, and MMTk core can use this for some optimizations.
* Changes in `ObjectModel`:
  * Add `UNIFIED_OBJECT_REFERENCE_ADDRESS: bool` to allow a binding to tell us whether an object reference uses the same address as its object start and its `to_address`.
  * Add `OBJECT_REF_OFFSET_LOWER_BOUND` to allow a binding to tell us roughly where an object reference points (with respect to the allocation address).
  * Change the related methods to cope with these new constants.
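
As a rough illustration of the new `destroy_mutator()` contract, here is a minimal binding-side sketch. The `on_thread_exit` function and the choice to keep the mutator in a `Box` are hypothetical; a binding may store the mutator however it likes.

```rust
use mmtk::{memory_manager, Mutator};
use mmtk::vm::VMBinding;

// Hypothetical thread-teardown path on the binding side. MMTk no longer frees the
// mutator's memory in destroy_mutator(), so the binding reclaims it itself -- here
// simply by letting the Box drop.
fn on_thread_exit<VM: VMBinding>(mut mutator: Box<Mutator<VM>>) {
    // Runs each allocator's on_destroy() hook; the mutator must not be used afterwards.
    memory_manager::destroy_mutator(&mut *mutator);
    // Dropping the Box reclaims the Mutator memory on the binding side.
    drop(mutator);
}
```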

Mark sweep changes in this PR:
* Add two features for mark sweep:
  * `eager_sweeping`: sweep unused memory eagerly during each GC. Without this feature, unused memory is swept lazily when we next attempt to allocate from it (see the sketch after this list).
  * `malloc_mark_sweep`: the same as our previous `malloc_mark_sweep`. When this is used, the mark sweep plan uses `MallocSpace` and `MallocAllocator`.
* Move the current `policy::mallocspace` to `policy::marksweepspace::malloc_ms`
* Add `policy::marksweepspace::native_ms` for the mark sweep space with mimalloc.
* Add `util::alloc::free_list_allocator`.
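
The sketch below illustrates the difference between the two sweeping strategies. It is illustrative only: `Block` and its methods are hypothetical stand-ins, not mmtk-core APIs.

```rust
/// Hypothetical block of a free-list space; not an mmtk-core type.
struct Block {
    swept: bool,
}

impl Block {
    /// Reclaim the unmarked cells in this block.
    fn sweep(&mut self) {
        self.swept = true;
    }
}

/// Called during the GC's release phase for every block the space owns.
fn on_gc_release(blocks: &mut [Block]) {
    for block in blocks.iter_mut() {
        if cfg!(feature = "eager_sweeping") {
            // eager_sweeping: reclaim unused memory during the GC itself.
            block.sweep();
        }
        // Without the feature the block is left unswept here, and is swept
        // lazily the next time the allocator tries to allocate from it.
    }
}

/// Lazy path: the allocator sweeps a block on demand before reusing it.
fn alloc_from(block: &mut Block) {
    if !block.swept {
        block.sweep();
    }
    // ... allocation from the block's free list would happen here ...
}

fn main() {
    let mut blocks = vec![Block { swept: false }];
    on_gc_release(&mut blocks);
    alloc_from(&mut blocks[0]);
}
```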

Co-authored-by: Yi Lin <qinsoon@gmail.com>
2 people authored and wenyuzhao committed Mar 20, 2023
1 parent 52e7297 commit 70c0872
Showing 54 changed files with 2,493 additions and 448 deletions.
7 changes: 1 addition & 6 deletions .github/scripts/ci-test.sh
@@ -37,12 +37,7 @@ for fn in $(ls src/tests/*.rs); do

# Run the test with each plan it needs.
for MMTK_PLAN in $PLANS; do
# Deal with mark sweep specially, we only have malloc mark sweep, and we need to enable the feature to make it work.
if [[ $MMTK_PLAN == 'MarkSweep' ]]; then
env MMTK_PLAN=$MMTK_PLAN cargo test --features "malloc_mark_sweep,$FEATURES" -- $t;
else
env MMTK_PLAN=$MMTK_PLAN cargo test --features "$FEATURES" -- $t;
fi
env MMTK_PLAN=$MMTK_PLAN cargo test --features "$FEATURES" -- $t;
done
done

18 changes: 13 additions & 5 deletions Cargo.toml
@@ -42,6 +42,7 @@ atomic_refcell = "0.1.7"
strum = "0.24"
strum_macros = "0.24"
cfg-if = "1.0"
itertools = "0.10.5"

[dev-dependencies]
rand = "0.7.3"
@@ -115,11 +116,6 @@ work_packet_stats = []
# Count the malloc'd memory into the heap size
malloc_counted_size = []

# Use library malloc as the freelist allocator for mark sweep. This will make mark sweep slower. As malloc may return addresses outside our
# normal heap range, we will have to use chunk-based SFT table. Turning on this feature will use a different SFT map implementation on 64bits,
# and will affect all the plans in the build. Please be aware of the consequence, and this is only meant to be experimental use.
malloc_mark_sweep = []

# Do not modify the following line - ci-common.sh matches it
# -- Mutally exclusive features --
# Only one feature from each group can be provided. Otherwise build will fail.
@@ -131,6 +127,18 @@ malloc_mark_sweep = []
malloc_mimalloc = ["mimalloc-sys"]
malloc_jemalloc = ["jemalloc-sys"]
malloc_hoard = ["hoard-sys"]
# Use the native mimalloc allocator for malloc. This is not tested by me (Yi) yet, and it is only used to make sure that some code
# is not compiled in default builds.
malloc_native_mimalloc = []

# If there are more groups, they should be inserted above this line
# Group:end

# Group:marksweepallocation
# default is native allocator with lazy sweeping
eager_sweeping = []
# Use library malloc as the freelist allocator for mark sweep. This will make mark sweep slower. As malloc may return addresses outside our
# normal heap range, we will have to use chunk-based SFT table. Turning on this feature will use a different SFT map implementation on 64bits,
# and will affect all the plans in the build. Please be aware of the consequence, and this is only meant to be experimental use.
malloc_mark_sweep = []
# Group:end
21 changes: 16 additions & 5 deletions src/memory_manager.rs
@@ -84,8 +84,11 @@ pub fn mmtk_init<VM: VMBinding>(builder: &MMTKBuilder) -> Box<MMTK<VM>> {
Box::new(mmtk)
}

/// Request MMTk to create a mutator for the given thread. For performance reasons, A VM should
/// store the returned mutator in a thread local storage that can be accessed efficiently.
/// Request MMTk to create a mutator for the given thread. The ownership
/// of returned boxed mutator is transferred to the binding, and the binding needs to take care of its
/// lifetime. For performance reasons, a VM should store the returned mutator in thread-local storage
/// that can be accessed efficiently. A VM may also copy and embed the mutator structure into a thread-local data
/// structure, and use that as a reference to the mutator (it is okay to drop the box once the content is copied).
///
/// Arguments:
/// * `mmtk`: A reference to an MMTk instance.
@@ -103,12 +106,14 @@ pub fn bind_mutator<VM: VMBinding>(
mutator
}

/// Reclaim a mutator that is no longer needed.
/// Report to MMTk that a mutator is no longer needed. A binding should not attempt
/// to use the mutator after this call. MMTk will not attempt to reclaim the memory for the
/// mutator, so a binding should properly reclaim the memory for the mutator after this call.
///
/// Arguments:
/// * `mutator`: A reference to the mutator to be destroyed.
pub fn destroy_mutator<VM: VMBinding>(mutator: Box<Mutator<VM>>) {
drop(mutator);
pub fn destroy_mutator<VM: VMBinding>(mutator: &mut Mutator<VM>) {
mutator.on_destroy();
}

/// Flush the mutator's local states.
@@ -144,6 +149,12 @@ pub fn alloc<VM: VMBinding>(
// If you plan to use MMTk with a VM with its object size smaller than MMTk's min object size, you should
// meet the min object size in the fastpath.
debug_assert!(size >= MIN_OBJECT_SIZE);
// Assert alignment
debug_assert!(align >= VM::MIN_ALIGNMENT);
debug_assert!(align <= VM::MAX_ALIGNMENT);
// Assert offset
debug_assert!(VM::USE_ALLOCATION_OFFSET || offset == 0);

mutator.alloc(size, align, offset, semantics)
}
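
The constants referenced by these new assertions are the `VMBinding` items described in the PR summary. Below is a standalone sketch of the contract they enforce; the constant values are illustrative only, and a real binding declares them on its `VMBinding` implementation.

```rust
// Illustrative values only; a real binding sets these constants on its VMBinding impl.
const MIN_ALIGNMENT: usize = 8;
const MAX_ALIGNMENT: usize = 16;
const USE_ALLOCATION_OFFSET: bool = false;

/// Mirrors the checks that alloc() now performs in debug builds.
fn check_alloc_args(align: usize, offset: usize) {
    debug_assert!(align >= MIN_ALIGNMENT);
    debug_assert!(align <= MAX_ALIGNMENT);
    // If the binding declares it never uses an allocation offset, offset must be 0.
    debug_assert!(USE_ALLOCATION_OFFSET || offset == 0);
}

fn main() {
    check_alloc_args(8, 0); // a conforming call site
}
```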

2 changes: 1 addition & 1 deletion src/plan/global.rs
@@ -81,7 +81,7 @@ pub fn create_plan<VM: VMBinding>(
vm_map, mmapper, options, scheduler,
)) as Box<dyn Plan<VM = VM>>,
PlanSelector::MarkSweep => Box::new(crate::plan::marksweep::MarkSweep::new(
vm_map, mmapper, options,
vm_map, mmapper, options, scheduler,
)) as Box<dyn Plan<VM = VM>>,
PlanSelector::Immix => Box::new(crate::plan::immix::Immix::new(
vm_map, mmapper, options, scheduler,
74 changes: 3 additions & 71 deletions src/plan/marksweep/gc_work.rs
@@ -1,77 +1,9 @@
use crate::policy::mallocspace::metadata::is_chunk_mapped;
use crate::policy::mallocspace::metadata::is_chunk_marked_unsafe;
use crate::policy::mallocspace::MallocSpace;
use crate::scheduler::{GCWork, GCWorker, WorkBucketStage};
use crate::util::heap::layout::vm_layout_constants::BYTES_IN_CHUNK;
use crate::util::Address;
use crate::vm::VMBinding;
use crate::MMTK;
use std::sync::atomic::Ordering;

use super::MarkSweep;

/// Simple work packet that just sweeps a single chunk
pub struct MSSweepChunk<VM: VMBinding> {
ms: &'static MallocSpace<VM>,
// starting address of a chunk
chunk: Address,
}

impl<VM: VMBinding> GCWork<VM> for MSSweepChunk<VM> {
#[inline]
fn do_work(&mut self, _worker: &mut GCWorker<VM>, _mmtk: &'static MMTK<VM>) {
self.ms.sweep_chunk(self.chunk);
}
}

/// Work packet that generates sweep jobs for gc workers. Each chunk is given its own work packet
pub struct MSSweepChunks<VM: VMBinding> {
plan: &'static MarkSweep<VM>,
}

impl<VM: VMBinding> MSSweepChunks<VM> {
pub fn new(plan: &'static MarkSweep<VM>) -> Self {
Self { plan }
}
}

impl<VM: VMBinding> GCWork<VM> for MSSweepChunks<VM> {
#[inline]
fn do_work(&mut self, _worker: &mut GCWorker<VM>, mmtk: &'static MMTK<VM>) {
let ms = self.plan.ms_space();
let mut work_packets: Vec<Box<dyn GCWork<VM>>> = vec![];
let mut chunk = unsafe { Address::from_usize(ms.chunk_addr_min.load(Ordering::Relaxed)) }; // XXX: have to use AtomicUsize to represent an Address
let end = unsafe { Address::from_usize(ms.chunk_addr_max.load(Ordering::Relaxed)) }
+ BYTES_IN_CHUNK;

// Since only a single thread generates the sweep work packets as well as it is a Stop-the-World collector,
// we can assume that the chunk mark metadata is not being accessed by anything else and hence we use
// non-atomic accesses
while chunk < end {
if is_chunk_mapped(chunk) && unsafe { is_chunk_marked_unsafe(chunk) } {
work_packets.push(Box::new(MSSweepChunk { ms, chunk }));
}

chunk += BYTES_IN_CHUNK;
}

debug!("Generated {} sweep work packets", work_packets.len());
#[cfg(debug_assertions)]
{
ms.total_work_packets
.store(work_packets.len() as u32, Ordering::SeqCst);
ms.completed_work_packets.store(0, Ordering::SeqCst);
ms.work_live_bytes.store(0, Ordering::SeqCst);
}

mmtk.scheduler.work_buckets[WorkBucketStage::Release].bulk_add(work_packets);
}
}

pub struct MSGCWorkContext<VM: VMBinding>(std::marker::PhantomData<VM>);
use crate::policy::gc_work::DEFAULT_TRACE;
use crate::scheduler::gc_work::PlanProcessEdges;
use crate::scheduler::gc_work::*;
use crate::vm::VMBinding;

pub struct MSGCWorkContext<VM: VMBinding>(std::marker::PhantomData<VM>);
impl<VM: VMBinding> crate::scheduler::GCWorkContext for MSGCWorkContext<VM> {
type VM = VM;
type PlanType = MarkSweep<VM>;
91 changes: 49 additions & 42 deletions src/plan/marksweep/global.rs
@@ -1,44 +1,50 @@
use crate::plan::global::BasePlan;
use crate::plan::global::CommonPlan;
use crate::plan::global::GcStatus;
use crate::plan::marksweep::gc_work::{MSGCWorkContext, MSSweepChunks};
use crate::plan::marksweep::gc_work::MSGCWorkContext;
use crate::plan::marksweep::mutator::ALLOCATOR_MAPPING;
use crate::plan::AllocationSemantics;
use crate::plan::Plan;
use crate::plan::PlanConstraints;
use crate::policy::mallocspace::metadata::ACTIVE_CHUNK_METADATA_SPEC;
use crate::policy::mallocspace::MallocSpace;
use crate::policy::space::Space;
use crate::scheduler::*;
use crate::scheduler::GCWorkScheduler;
use crate::util::alloc::allocators::AllocatorSelector;
#[cfg(not(feature = "global_alloc_bit"))]
use crate::util::alloc_bit::ALLOC_SIDE_METADATA_SPEC;
use crate::util::heap::layout::heap_layout::Mmapper;
use crate::util::heap::layout::heap_layout::VMMap;
use crate::util::heap::HeapMeta;
use crate::util::heap::VMRequest;
use crate::util::metadata::side_metadata::{SideMetadataContext, SideMetadataSanity};
use crate::util::options::Options;
use crate::util::VMWorkerThread;
use crate::vm::VMBinding;
use enum_map::EnumMap;
use mmtk_macros::PlanTraceObject;
use std::sync::Arc;

use enum_map::EnumMap;
#[cfg(feature = "malloc_mark_sweep")]
pub type MarkSweepSpace<VM> = crate::policy::marksweepspace::malloc_ms::MallocSpace<VM>;
#[cfg(feature = "malloc_mark_sweep")]
use crate::policy::marksweepspace::malloc_ms::MAX_OBJECT_SIZE;

use mmtk_macros::PlanTraceObject;
#[cfg(not(feature = "malloc_mark_sweep"))]
pub type MarkSweepSpace<VM> = crate::policy::marksweepspace::native_ms::MarkSweepSpace<VM>;
#[cfg(not(feature = "malloc_mark_sweep"))]
use crate::policy::marksweepspace::native_ms::MAX_OBJECT_SIZE;

#[derive(PlanTraceObject)]
pub struct MarkSweep<VM: VMBinding> {
#[fallback_trace]
common: CommonPlan<VM>,
#[trace]
ms: MallocSpace<VM>,
ms: MarkSweepSpace<VM>,
}

pub const MS_CONSTRAINTS: PlanConstraints = PlanConstraints {
moves_objects: false,
gc_header_bits: 2,
gc_header_words: 0,
num_specialized_scans: 1,
max_non_los_default_alloc_bytes: MAX_OBJECT_SIZE,
may_trace_duplicate_edges: true,
..PlanConstraints::default()
};
@@ -56,7 +62,6 @@ impl<VM: VMBinding> Plan for MarkSweep<VM> {
self.base().set_collection_kind::<Self>(self);
self.base().set_gc_status(GcStatus::GcPrepare);
scheduler.schedule_common_work::<MSGCWorkContext<VM>>(self);
scheduler.work_buckets[WorkBucketStage::Prepare].add(MSSweepChunks::<VM>::new(self));
}

fn get_allocator_mapping(&self) -> &'static EnumMap<AllocationSemantics, AllocatorSelector> {
@@ -65,11 +70,11 @@ impl<VM: VMBinding> Plan for MarkSweep<VM> {

fn prepare(&mut self, tls: VMWorkerThread) {
self.common.prepare(tls, true);
// Dont need to prepare for MallocSpace
self.ms.prepare();
}

fn release(&mut self, tls: VMWorkerThread) {
trace!("Marksweep: Release");
self.ms.release();
self.common.release(tls, true);
}

@@ -95,47 +100,49 @@ impl<VM: VMBinding> Plan for MarkSweep<VM> {
}

impl<VM: VMBinding> MarkSweep<VM> {
pub fn new(vm_map: &'static VMMap, mmapper: &'static Mmapper, options: Arc<Options>) -> Self {
let heap = HeapMeta::new(&options);
// if global_alloc_bit is enabled, ALLOC_SIDE_METADATA_SPEC will be added to
// SideMetadataContext by default, so we don't need to add it here.
#[cfg(feature = "global_alloc_bit")]
let global_metadata_specs =
SideMetadataContext::new_global_specs(&[ACTIVE_CHUNK_METADATA_SPEC]);
// if global_alloc_bit is NOT enabled,
// we need to add ALLOC_SIDE_METADATA_SPEC to SideMetadataContext here.
#[cfg(not(feature = "global_alloc_bit"))]
let global_metadata_specs = SideMetadataContext::new_global_specs(&[
ALLOC_SIDE_METADATA_SPEC,
ACTIVE_CHUNK_METADATA_SPEC,
]);

let res = MarkSweep {
ms: MallocSpace::new(global_metadata_specs.clone()),
common: CommonPlan::new(
pub fn new(
vm_map: &'static VMMap,
mmapper: &'static Mmapper,
options: Arc<Options>,
scheduler: Arc<GCWorkScheduler<VM>>,
) -> Self {
let mut heap = HeapMeta::new(&options);
let mut global_metadata_specs = SideMetadataContext::new_global_specs(&[]);
MarkSweepSpace::<VM>::extend_global_side_metadata_specs(&mut global_metadata_specs);

let res = {
let ms = MarkSweepSpace::new(
"MarkSweepSpace",
false,
VMRequest::discontiguous(),
global_metadata_specs.clone(),
vm_map,
mmapper,
&mut heap,
scheduler,
);

let common = CommonPlan::new(
vm_map,
mmapper,
options,
heap,
&MS_CONSTRAINTS,
global_metadata_specs,
),
};
);

// Use SideMetadataSanity to check if each spec is valid. This is also needed for check
// side metadata in extreme_assertions.
{
let mut side_metadata_sanity_checker = SideMetadataSanity::new();
res.common
.verify_side_metadata_sanity(&mut side_metadata_sanity_checker);
res.ms
.verify_side_metadata_sanity(&mut side_metadata_sanity_checker);
}
MarkSweep { common, ms }
};

let mut side_metadata_sanity_checker = SideMetadataSanity::new();
res.common
.verify_side_metadata_sanity(&mut side_metadata_sanity_checker);
res.ms
.verify_side_metadata_sanity(&mut side_metadata_sanity_checker);
res
}

pub fn ms_space(&self) -> &MallocSpace<VM> {
pub fn ms_space(&self) -> &MarkSweepSpace<VM> {
&self.ms
}
}
2 changes: 1 addition & 1 deletion src/plan/marksweep/mod.rs
@@ -1,4 +1,4 @@
//! Plan: marksweep (currently using malloc as its freelist allocator)
//! Plan: marksweep
mod gc_work;
mod global;