-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kill off-heap ObjectReference
, or make a safer to_address
method
#1120
Comments
The example does not explain why this is a problem. We allow off-heap |
To what I know, Julia and GHC use |
The problem in the current Currently, // Returns "TRUE" iff "p" points into the committed areas of the heap.
// This method can be expensive so avoid using it in performance critical
// code.
virtual bool is_in(const void* p) const = 0; So it doesn't guarantee One possible solution to Other public interfaces that involve |
|
I am using this feature for data section in bytecode. Some objects are in this static data section which is mmaped to memory allocated by VM rather than MMTK |
OK. We can't just remove The cleanest solution seems to be requiring all |
I kinda disagree with this whole conversation. I believe that it is actually Going by the doc comment for the function, nothing there mentions that it should be a valid object and neither does the SFT (which does the actual check) require it to actually be an object. This function should simply be to check if an arbitrary address can potentially be allocated by MMTk. Hence, a more appropriate name would be mmtk-core/src/memory_manager.rs Lines 611 to 615 in e79e94e
EDIT: Re: What happens if |
I think Excerpt from pub fn is_mmtk_object(addr: Address) -> bool {
use crate::mmtk::SFT_MAP;
SFT_MAP.get_checked(addr).is_mmtk_object(addr)
}
Maybe neither From the historical point of view, we had
The problem is, if There is no general way to find an address that is guaranteed to be inside the object given an To implement a checking method (such as
The semantics is:
In this way, Alternatively, we change
where |
I'm not sure I follow. |
For example, in a VM,
No. Not Using the example above, if we have an objref The problem is, the VM could potentially call |
In your example, you can have two ways of implementing the object model. One is the internal address of an
In the case of the latter, there is no issue as the thing stored on the stack is the cell and everything is happy. The case of the former is where you get the issues you mentioned. Also I feel like we always end up having discussions about |
Actually my example was like this:
That is, I remember that conversation now. The one on zulip talked about exactly this problem. I think your suggestion here was a good one, and now I am seriously considering it. MMTk may present the interface of type MaybeObjectReference = Address;
fn is_mmtk_object(possible_object: MaybeObjectReference) -> bool; I am just using But But
We must have a version of In the Zulip conversation, I was arguing that the offset must be constant. But after a second thought, at least for implementing
Because VO bits are set exactly where valid objects are, as long as invalid I don't know what we should do for But if we need something like OpenJDK's |
I'm not sure why simply adding bounds check wont solve the issue you have? |
Adding another version of mmtk-core/src/vm/object_model.rs Lines 444 to 453 in e79e94e
If I understand right, there are two problems with this issue:
The answer to 1 is simple: we can just filter out any obviously low address (e.g. anything below 1K cannot be an object). We have two choices for 2: either have a different One thing that can help us to decide is whether a binding would implement the two |
(1) Because (2) Even with a bounds check, what should it return if it finds |
Yes. The doc comment you quoted already considered this possibility. But the semantics is under-specified.
Most operations, such as
Yes. When the user calls
As I discussed before, this is also possible.
And I also noticed that the doc comment of
Due to this requirement, there is only one way to implement this function, as long as not out of bound. I can't think of a VM that doesn't require I don't know which one is better
|
The bounds check I'm describing is for the |
Well, in this case bounds check should help. But since MMTk doesn't know the offset, we can offload the responsibility of bounds check to the user. It's the same as "the caller must ensure |
The point is that if Re: How the VM is coming up with these addresses is a slightly different question, i.e. if the VM finds an address on the stack while conservatively stack scanning. I don't have a great answer to this question. MMTk can't magically make inefficient conversions between |
The question is simple, but the answer is not. The capability for MMTk to answer the question is limited by the implementation.
That's basically what makes this "simple" question difficult to answer. And the VM must help in answering the question by implementing
|
My point is that if the question is simple then the API should be simple as well. Yes, the answer may not be simple, which is fine. We need to make the contract clearer in such a case. But a complex API will only increase confusion instead.
Right I see.
This is an expectation we have from the users of the API. Only aligned pointers should be queried. Currently we don't have VM-specific configurable |
TL;DR: We currently don't prevent
ObjectReference
to refer to off-heap objects, and we even haveActivePlan::vm_trace_object
for off-heap objects. However, if we callObjectReference(15).to_address()
, andObjectModel::to_address(object)
returnsobject.to_raw_address - 16
, it will overflow and have undefined behavior. That basically prevents us from doing anything useful that involveto_address
(including accessing SFT or on-the-side metadata) without an extra overflow test. And we basically have to choose between killing off-heapObjectReference
and adding overflow check to everyto_address
call.The problem
I found the problem when testing the function
is_in_mmtk_space
. The current signature and implementation is:Suppose the value of
object
is 1, and the VM may implementObjectModel::to_address(obj)
asreturn obj - 16;
(like what JikesRVM does). Then it will callObjectReference(1).to_address()
, and it will attempt to compute1 - 16
which will cause an overflow. It will panic in the debug mode, or have undefined behavior in the release mode.The cause
The cause is that the user may pass any value allowed by
ObjectReference
as theobject
argument. Currently it isusize
, and will be changed toNonZeroUsize
later. This is allowed becauseis_in_mmtk_spaces
is mainly used for differentiating MMTk objects from off-heap objects, and is supposed to let the user check against anything allowed to beObjectReference
.Our current definition of
ObjectReference
doesn't forbid encoding addresses of off-heap object as anObjectReference
. We don't define what a reference to an off-heap object looks like. 0 is obviously unlikely to be a valid reference, but we don't prevent the VM from using small integers like 1, 2, 3, 4, ..., 15, ... to represent a reference to an off-heap object. ButObjectReference::to_address(self) -> Address
also doesn't requireself
to be an MMTk object, either. So it is possible that someone may callObjectReference(small_integer).to_address()
.A deeper problem
That leads to a deeper problem: How many places in MMTk-core assumes
ObjectReference
refers to an MMTk object?One obvious problem is the SFT. The
to_address
method is originally introduced to help accessing on-the-side metadata and accessing the SFT. That's whyto_address
returns an address guaranteed to be within the allocated range of the object. If anObjectReference
refers to an off-heap object, that will not make sense. But in order to access the SFT, we always callobjref.to_address()
first. This means we can't do anything with SFT ifobject
is off-heap. That include basic things, such as callingtrace_object(object)
, even though we definedActivePlan::vm_trace_object
for tracing off-heap objects. Currently,SFTProcessEdges
never callsActivePlan::vm_trace_object
.EmptySpaceSFT::sft_trace_object
panics immediately.Solution
There are several solutions;
Require that
ObjectReference
always refers to MMTk objectThis immediately kills
ActivePlan::vm_trace_object
and everything supporting that. Currently no VM usesActivePlan::vm_trace_object
. Despite that, we need to think carefully when making this decision.Make a safer
ObjectModel::to_address
Make an
ObjectModel::to_address_safe(object: ObjectReference) -> Option<Address>
so that if it ever attempts to execute something like1 - 16
, it returnsNone
instead of having undefined behavior.usize::checked_sub
can help with this, but all users now require a check againstNone
, includingsft_trace_object
which, when a VM does not have off-heap object, should always seeSome(addr)
.The text was updated successfully, but these errors were encountered: