Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No ProcessEdgesWork in API functions #611

Merged
merged 9 commits into from
Jul 4, 2022

Conversation

wks
Copy link
Collaborator

@wks wks commented Jun 21, 2022

This PR removes the ProcessEdgesWork trait from the root-scanning API.

Currently, mmtk-core passes the <E: ProcessEdgesWork> type to the binding during root scanning, and the binding creates the work packets using the ProcessEdgesWork::new trait method. This couples root scanning with the ProcessEdgesWork trait.

But in reality, the ProcessEdgesWork trait may not be applicable to some root-scanning tasks.

  • One counter-example is the CMImmixCollectRootEdges work packet in LXR. It does not use most of the code provided by ProcessEdgesWork, such as trace_object, process_node or flush. It merely visits all root edges. Actually it only visits the objects the root edges refer to and does not forward the root edges because the concurrent marking is non-copying.
  • The other counter-example is the Ruby binding. Ruby uses conservative stack scanning, and does not expose the addresses of the roots. Some roots may be held by complex C data structures and are not supposed to be forwarded. Therefore, Ruby can only report "root objects" rather than "root edges'.

This PR introduces a trait:

pub trait RootsWorkFactory: Send {
    fn create_process_edge_roots_work(&self, edges: Vec<Address>);
    fn create_process_node_roots_work(&self, nodes: Vec<ObjectReference>);

    fn fork(&self) -> Box<dyn RootsWorkFactory>;
}

A RootsWorkFactory instance is passed to the VM for scanning roots. The VM reports root edges and root nodes using this object. In this way, the VM no longer needs to know what work packet mmtk-core uses to handle the roots.

The fork method is for sharing the factory among multiple threads. Some VMs (such as OpenJDK and JikesRVM) create additional work packets to scan different kinds of roots. For example, when scanning VM-specific roots, OpenJDK will create ScanUniverseRoots, ScanJNIHandlesRoots, ScanObjectSynchronizerRoots, ..., and JikesRVM will create ScanStaticRoots and ScanGlobalRoots. The VM can fork the factory so that each such work packet can hold a copy.

wks added a commit to wks/mmtk-v8 that referenced this pull request Jun 21, 2022
Notably, mmtk-v8 no longer mentions ProcessEdgesWork.
@wks wks force-pushed the api-no-process-edges-work branch from 5b73a30 to 6aaeca5 Compare June 21, 2022 08:23
wks added a commit to wks/mmtk-v8 that referenced this pull request Jun 21, 2022
Notably, mmtk-v8 no longer mentions ProcessEdgesWork.
@wks wks force-pushed the api-no-process-edges-work branch from 6aaeca5 to ed1d16c Compare June 21, 2022 08:54
@wks
Copy link
Collaborator Author

wks commented Jun 21, 2022

binding-refs
OPENJDK_BINDING_REPO=wks/mmtk-openjdk
OPENJDK_BINDING_REF=api-no-process-edges-work
JIKESRVM_BINDING_REPO=wks/mmtk-jikesrvm
JIKESRVM_BINDING_REF=api-no-process-edges-work
V8_BINDING_REPO=wks/mmtk-v8
V8_BINDING_REF=api-no-process-edges-work

Root-scanning functions no longer pass concrete work packet types as
type parameters.  Particularly, the API no longer exposes the
ProcessEdgesWork trait to the VM bindign.  Instead, mmtk-core passes
factory objects and call-back functions so that VM bindings can call
back to mmtk-core to report lists of root edges and the event that a
mutator has stopped.

With this change, VM bindings should not create work packets to handle
root edges directly.  Instead, the factories and call-backs defined in
mmtk-core shall create those packets.
@wks wks force-pushed the api-no-process-edges-work branch from ed1d16c to 6b45222 Compare June 21, 2022 09:52
@wks wks marked this pull request as ready for review June 21, 2022 10:41
@qinsoon qinsoon added the PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead) label Jun 22, 2022
@wks wks requested review from qinsoon and wenyuzhao June 22, 2022 03:15
Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -261,7 +263,7 @@ impl<E: ProcessEdgesWork> VMProcessWeakRefs<E> {
impl<E: ProcessEdgesWork> GCWork<E::VM> for VMProcessWeakRefs<E> {
fn do_work(&mut self, worker: &mut GCWorker<E::VM>, _mmtk: &'static MMTK<E::VM>) {
trace!("ProcessWeakRefs");
<E::VM as VMBinding>::VMCollection::process_weak_refs::<E>(worker);
<E::VM as VMBinding>::VMCollection::process_weak_refs(worker); // TODO: Pass a factory/callback to decide what work packet to create.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What changes are needed for this TODO? Adding the factory as an argument should be trivial, but we might need to add more methods to the factory trait?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I wanted to talk about this yesterday, but forgot.

The problem is, this work packet VMProcessWeakRefs was introduced long time ago, but it still does nothing. VMCollection::process_weak_refs is a no-op, and it is not even overridden in V8. So I doubt if it is helpful to fix this work packet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is, the work packet for processing weak refs may or may not be the same as ProcessEdgesWork. I elaborated that in #604

My current finding is, what weak-reference processing needs is simpler than ProcessEdgesWork, even simpler than the TracingDelegate I mentioned in that issue. The weak reference processor only needs the capability to call trace_object. For our current GC algorithms which are all based on tracing, the weak reference processor should create ScanObjects work packet for all the objects enqueued with trace_object. This part is the same as ProcessEdgesWork. But I am not sure for RC-based plans.

I think we can leave the code like this, and change it when a VM (such as V8) actually started using this work packet.

We also discussed about refactoring the weak reference processors to adapt to VMs where weak references are not represented as objects (such as Ruby where the primitive mechanism for weak references is WeakMap). If that is done, I think this work packet may be removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could ask @wenyuzhao about how his V8 branch deals with the type parameter W in process_weak_refs().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do anything for this PR. But it would be good to know that the design can deal with this issue in the future.

@@ -277,7 +279,8 @@ impl<E: ProcessEdgesWork> ScanStackRoots<E> {
impl<E: ProcessEdgesWork> GCWork<E::VM> for ScanStackRoots<E> {
fn do_work(&mut self, worker: &mut GCWorker<E::VM>, mmtk: &'static MMTK<E::VM>) {
trace!("ScanStackRoots");
<E::VM as VMBinding>::VMScanning::scan_thread_roots::<E>();
let factory = ProcessEdgesWorkRootsWorkFactory::<E>::new_boxed(mmtk);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing creating the factory here is just a temporary step for the PR, and eventually, the plan should be responsible for creating the factory. Is this understanding correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Eventually, it will be the plan's responsibility to create the factory it needs. The plan may even create a different factory for each different kind of GC (such as nursery vs full-heap, as well as concurrent marking).

I also plan to refactor the ProcessEdgesWork, but not in this PR. This PR only changes the VM-facing API.

fn create_process_node_roots_work(&self, nodes: Vec<ObjectReference>);

/// Create a copy of the factory itself.
fn fork(&self) -> Box<dyn RootsWorkFactory>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make RootsWorkFactory: Sync, and use Arc<dyn RootsWorkFactory> to pass the factory to the binding? It will be more clear that the factory can be shared among threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ProcessEdgesWorkRootsWorkFactory has only one field mmtk: &'static MMTK<E::VM>, we can further make RootsWorkFactory: Copy to avoid heap allocation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I thought cloning an Arc would be more expensive than cloning a Box because of the underlying atomic memory operation. But after a second thought, I think that cost should be trivial, and the RootsWorkFactory should be freely sharable among threads because both of the required create_process_{node,edge}_roots_work methods take an immutable &self reference.

I'll try to change it to Arc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arc would be slightly different than this fork() model:

  1. With Arc, the factory probably won't have any mutable state (unless lock is used).
  2. If the factory has any state, fork() may choose to create a fresh instance or inherit the state. But with Arc, as they point to the same instance, the state is shared.

I am not sure which is more suitable when we have other RootsWorkFactory impls.

Copy link
Member

@wenyuzhao wenyuzhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

fn create_process_node_roots_work(&self, nodes: Vec<ObjectReference>);

/// Create a copy of the factory itself.
fn fork(&self) -> Box<dyn RootsWorkFactory>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ProcessEdgesWorkRootsWorkFactory has only one field mmtk: &'static MMTK<E::VM>, we can further make RootsWorkFactory: Copy to avoid heap allocation.

@wks
Copy link
Collaborator Author

wks commented Jun 22, 2022

@wenyuzhao Regarding heap allocation, I am thinking about several design choices.

  1. An RootsWorkFactory instance may be small (like this, which contains a single &MMTK reference), but may be big for some plans which I don't know.
  2. Multiple threads may scan roots in parallel, so it must be "fork"-able, and it must have 'static lifetime (i.e. must not reference local variables).
  3. Currently the interface I designed passes &dyn RootsWorkFactory to the VM to avoid unnecessary generic functions. But it shouldn't be a problem to use impl RootsWorkFactory, either.

If it is dyn, it has to be heap-allocated because its size cannot be statically determined. But if we change it to impl, we may pass a RootWorkFactory instance by value, and let it implement Clone or Copy if that's cheap enough. If a RootWorkFactory instance is large in size, maybe passing Box or even Arc can be more efficient.

I'll try some possibilities, especially using impl RootWorkFactory and implementing Clone (maybe not Copy because it may be big), and see if I can eliminate the heap allocation if needed.

wks added 3 commits June 24, 2022 11:35
This allows unboxed instances to be passed to Scanning::scan_XXX_roots.

RootsWorkFactory now requires the Clone trait, and removed fork().
@wks
Copy link
Collaborator Author

wks commented Jun 24, 2022

I switched to a different approach. We now pass impl RootsWorkFactory instead of Box<dyn RootsWorkFactory> to Scanning::scan_xxx_roots. This eliminates the use of dyn. Not using dyn also allows RootsWorkFactory to implement Clone, so it can use the clone() method, and no longer needs fork().

Now RootsWorkFactory is passed by value. For small factories that holds only one or two references (such as &MMTK), passing by value is efficient enough. If the factory is big, it can hold an Arc inside it, so when clone(), it will clone the Arc instead of the underlying data structure. If it needs mutable states, the non-shared parts will still be cloned. This pushes the decision of whether to use plain values, Box or Arc to the implementer of RootsWorkFactory.

@qinsoon
Copy link
Member

qinsoon commented Jun 27, 2022

@wks Is there any change you would add to this pull request? Otherwise, we can merge it.

@wks
Copy link
Collaborator Author

wks commented Jun 27, 2022

@qinsoon No, but the CI tests for mmtk-jikesrvm binding is failing for some reasons.

@qinsoon qinsoon merged commit f0fed5f into mmtk:master Jul 4, 2022
qinsoon pushed a commit to mmtk/mmtk-v8 that referenced this pull request Jul 5, 2022
* Update for upstream PR mmtk/mmtk-core#611

Notably, mmtk-v8 no longer mentions ProcessEdgesWork.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants