[cdac] Implement NibbleMap lookup and tests #108403
Merged
10 commits
52504e9 [cdac] Implement NibbleMap lookup and tests (lambdageek)
43fb677 remove fixed fixme (lambdageek)
dc3c1a4 remove one more TODO (lambdageek)
2620440 spelling (lambdageek)
2a82299 NibbleMap: remove unused _codeHeaderSize field (lambdageek)
10a2b6a cleanup NibbleMap and tests (lambdageek)
e71aa09 NibbleMap: use a struct for MapUnit (lambdageek)
c3f51db NibbleMap: use types (lambdageek)
599d533 Clarify termination condition; clean up docs (lambdageek)
4182125 fix dodgy math in example (lambdageek)
@@ -0,0 +1,70 @@
# Contract ExecutionManager

This contract is for mapping a PC address to information about the
managed method corresponding to that address.


## APIs of contract

**TODO**

## Version 1

**TODO** Methods

### NibbleMap

Version 1 of this contract depends on a "nibble map" data structure
that maps a code address in a contiguous subsection of
the address space to a pointer to the start of the code sequence containing it.
It takes advantage of the fact that code starts are aligned and
spaced apart to represent their addresses as 4-bit nibble values.

Given a contiguous region of memory in which we lay out a collection of non-overlapping code blocks that are
not too small (so that two adjacent ones aren't too close together) and where the start of each code block is preceded by a code header aligned on some power of 2,
we can break up the whole memory space into buckets of a fixed size (32 bytes in the current implementation), where
each bucket either has a code block header or not.
Thinking of each code block header address as a hex number, we can view it as `[index, offset, zeros]`,
where the index gives us a bucket and the offset gives us the position of the header within the bucket.
We encode each offset into a 4-bit nibble, reserving the special value 0 to mark the places in the map where a method doesn't start.

To find the start of a method given an address, we first convert the address into a map index (the bucket index, which selects a map unit and a nibble within it)
and a byte offset within the bucket, which we turn into the nibble value that covers that address.
If the nibble is non-zero, we have the start of a method and it is near the given address.
If the nibble is zero, we have to search backward, first through the current map unit and then through previous map
units, until we find a non-zero nibble.

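The address arithmetic described above can be sketched in C#. This is a minimal illustration rather than the cDAC implementation: the function names `Decompose` and `Locate` are hypothetical, and the sketch assumes 32-byte buckets, 4-byte-aligned code starts, and 32-bit map units holding 8 nibbles numbered from the most significant one.

```csharp
// Hypothetical helper: split a base-relative address into the map index
// (bucket number) and the nibble value that encodes its offset in the bucket.
static (int MapIndex, int NibbleValue) Decompose(ulong relativeAddress)
{
    const int bytesPerBucket = 32;   // bucket size stated in the text
    const int alignment = 4;         // code starts are 4-byte aligned
    int mapIndex = (int)(relativeAddress / bytesPerBucket);
    int byteOffset = (int)(relativeAddress % bytesPerBucket);
    // Add 1 because the nibble value 0 is reserved for "no method starts here".
    return (mapIndex, 1 + byteOffset / alignment);
}

// Hypothetical helper: find which 32-bit map unit holds a map index and how far
// to shift that unit right so the wanted nibble lands in the low 4 bits.
static (int UnitIndex, int Shift) Locate(int mapIndex)
{
    const int nibblesPerUnit = 8;    // assumed: 32-bit unit / 4-bit nibbles
    int nibbleInUnit = mapIndex % nibblesPerUnit;
    // Nibble 0 is the most significant, so it needs the largest shift (28).
    return (mapIndex / nibblesPerUnit, 28 - 4 * nibbleInUnit);
}
```

With these definitions, `Decompose(304)` yields map index 9 and nibble value 5, and `Locate(9)` yields map unit 1 with a shift of 24.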
For example (all code addresses are relative to some unspecified base):

Suppose there is code starting at address 304 (0x130)

* Then the map index will be 304 / 32 = 9 and the byte offset will be 304 % 32 = 16
* Because addresses are 4-byte aligned, the nibble value will be 1 + 16 / 4 = 5 (we reserve 0 to mean no method).
* So the map unit containing index 9 will contain the value 0x5 << 24, or 0x05000000 (each 32-bit map unit holds 8 nibbles, so map index 9 is the second nibble in the second map unit, and we number the nibbles starting from the most significant).

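The encoding step in this example could be written roughly as follows. This is a sketch under stated assumptions, not the runtime's code: `SetMethodStart` is a hypothetical name, and the map is assumed to be an array of 32-bit map units with 8 nibbles each, numbered from the most significant nibble.

```csharp
// Hypothetical helper: record a method start in a nibble map stored as 32-bit
// map units. relativeAddress is relative to the base of the covered region.
static void SetMethodStart(uint[] map, ulong relativeAddress)
{
    if (relativeAddress % 4 != 0)
        throw new System.ArgumentException("code starts must be 4-byte aligned");

    int mapIndex = (int)(relativeAddress / 32);          // 32-byte buckets
    uint nibble = 1 + (uint)(relativeAddress % 32) / 4;  // 0 is reserved for "no method"
    int unit = mapIndex / 8;                             // 8 nibbles per 32-bit map unit
    int shift = 28 - 4 * (mapIndex % 8);                 // nibble 0 is the most significant

    // Clear the old nibble, then store the new one in its place.
    map[unit] = (map[unit] & ~(0xFu << shift)) | (nibble << shift);
}
```

Calling `SetMethodStart(map, 304)` on a zeroed two-unit map leaves `map[0]` at 0 and sets `map[1]` to 0x05000000, matching the value worked out above.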
Now suppose we do a lookup for address 306 (0x132)

* The map index will be 306 / 32 = 9 and the byte offset will be 306 % 32 = 18
* The nibble value will be 1 + 18 / 4 = 5 (using integer division)
* To do the lookup, we will load the map unit containing map index 9 (the second 32-bit unit in the map) and get the value 0x05000000
* We will then shift to focus on the nibble with map index 9 (which again has nibble shift 24), so
  the shifted map unit will be 0x00000005 and we will get the nibble value 5.
* Therefore we know that there is a method start at map index 9, nibble value 5.
* The map index corresponds to an offset of 9 * 32 = 288 bytes and the nibble value 5 corresponds to an offset of (5 - 1) * 4 = 16 bytes
* So the method starts at offset 288 + 16 = 304, which is the start of the code containing the address we were looking for.

Now suppose we do a lookup for address 302 (0x12E)

* The map index will be 302 / 32 = 9 and the byte offset will be 302 % 32 = 14
* The nibble value will be 1 + 14 / 4 = 4
* To do the lookup, we will load the map unit containing map index 9 and get the value 0x05000000
* We will then shift to focus on the nibble with map index 9 (which again has nibble shift 24), so we will get
  the nibble value 5.
* Therefore we know that there is a method start at map index 9, nibble value 5.
* But the address we're looking for is map index 9, nibble value 4.
* We know that methods can't start within 32 bytes of each other, so we know that the method we're looking for is not in the current nibble.
* We will then try to shift to the previous nibble in the map unit (0x00000005 >> 4 = 0x00000000)
* Therefore we know there is no method start at any earlier map index in the current map unit.
* We will then align the map index to the start of the current map unit (map index 8) and move back to the previous map unit (map index 7)
* At that point, we scan backwards for a non-zero map unit and a non-zero nibble within the first non-zero map unit. Since there are none, we return null.
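
The two lookups above can be reproduced with a short C# sketch. It is a simplified illustration, not the implementation in this PR: `GetNibble` and `FindMethodStart` are hypothetical names, the map is assumed to be an array of 32-bit map units, and the backward search walks one nibble at a time instead of skipping whole zero map units as the text describes.

```csharp
// Hypothetical helper: read the nibble for a map index from a map stored as
// 32-bit map units (8 nibbles per unit, numbered from the most significant).
static uint GetNibble(uint[] map, int mapIndex)
{
    int shift = 28 - 4 * (mapIndex % 8);
    return (map[mapIndex / 8] >> shift) & 0xF;
}

// Hypothetical helper: return the base-relative start of the method covering
// relativeAddress, or null when no start is recorded at or before it.
static ulong? FindMethodStart(uint[] map, ulong relativeAddress)
{
    int mapIndex = (int)(relativeAddress / 32);
    uint wanted = 1 + (uint)(relativeAddress % 32) / 4;

    // A start in the same bucket only counts if it is at or before the address.
    uint nibble = GetNibble(map, mapIndex);
    if (nibble != 0 && nibble <= wanted)
        return (ulong)mapIndex * 32 + (nibble - 1) * 4;

    // Otherwise walk back one nibble at a time to the nearest earlier start.
    for (int i = mapIndex - 1; i >= 0; i--)
    {
        nibble = GetNibble(map, i);
        if (nibble != 0)
            return (ulong)i * 32 + (nibble - 1) * 4;
    }
    return null;
}
```

For a map of `{ 0x00000000, 0x05000000 }`, `FindMethodStart(map, 306)` returns 304 and `FindMethodStart(map, 302)` returns null, as in the worked examples.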
@@ -20,6 +20,7 @@ public enum DataType
    pointer,

    GCHandle,
    CodePointer,
    Thread,
    ThreadStore,
    GCAllocContext,
30 changes: 30 additions & 0 deletions
...ged/cdacreader/Microsoft.Diagnostics.DataContractReader.Abstractions/TargetCodePointer.cs
@@ -0,0 +1,30 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System;

namespace Microsoft.Diagnostics.DataContractReader;

public readonly struct TargetCodePointer : IEquatable<TargetCodePointer>
{
    public static TargetCodePointer Null = new(0);
    public readonly ulong Value;
    public TargetCodePointer(ulong value) => Value = value;

    public static implicit operator ulong(TargetCodePointer p) => p.Value;
    public static implicit operator TargetCodePointer(ulong v) => new TargetCodePointer(v);

    public static bool operator ==(TargetCodePointer left, TargetCodePointer right) => left.Value == right.Value;
    public static bool operator !=(TargetCodePointer left, TargetCodePointer right) => left.Value != right.Value;

    public override bool Equals(object? obj) => obj is TargetCodePointer pointer && Equals(pointer);
    public bool Equals(TargetCodePointer other) => Value == other.Value;

    public override int GetHashCode() => Value.GetHashCode();

    public bool Equals(TargetCodePointer x, TargetCodePointer y) => x.Value == y.Value;
    public int GetHashCode(TargetCodePointer obj) => obj.Value.GetHashCode();

    public TargetPointer AsTargetPointer => new(Value);

    public override string ToString() => $"0x{Value:x}";
}
5 changes: 5 additions & 0 deletions
...e/managed/cdacreader/Microsoft.Diagnostics.DataContractReader.Abstractions/TargetNUInt.cs
@@ -1,11 +1,16 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
using System;
using System.Diagnostics;

namespace Microsoft.Diagnostics.DataContractReader;


[DebuggerDisplay("{Hex}")]
public readonly struct TargetNUInt
{
    public readonly ulong Value;
    public TargetNUInt(ulong value) => Value = value;

    internal string Hex => $"0x{Value:x}";
}
On ARM32 platforms should this strip off the thumb bit? For reference, on ARM32 Thumb2 targets, the lowest bit is typically set on a code address to indicate that the pointer refers to code using the Thumb2 instruction set instead of the ARM instruction set.
I see this as a potential problem around the conversion to ulong here, as well as the AsTargetPointer API.
yea that's a good idea. Elsewhere (in the PrecodeStubs contract) I have an explicit helper that strips off the thumb bit:
I couldn't decide if that's something we want on the TargetCodePointer or on the Target (or on a contract, as I've prototyped it so far)
I think on `TargetCodePointer` makes the most sense, but then i'll need to store the mask in the code pointer instance at creation time (or make the conversion to a `TargetPointer` depend on the current target) - and i wasn't sure about the usability of that approach
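
To illustrate the idea discussed in this thread, here is a minimal C# sketch of stripping the Thumb bit from a code pointer. It is not the PrecodeStubs helper mentioned above, whose code is not shown here. The name `StripThumbBit` and the `isArm32Thumb` flag are hypothetical, standing in for whatever target information the real design would consult.

```csharp
// Hypothetical helper: convert a code pointer value to a plain address.
// On ARM32 Thumb targets bit 0 marks Thumb code and is not part of the
// address, so it is cleared; on other targets the value is returned unchanged.
static ulong StripThumbBit(ulong codePointer, bool isArm32Thumb)
    => isArm32Thumb ? codePointer & ~1UL : codePointer;
```

Whether such a mask belongs on `TargetCodePointer`, on the target, or on a contract is the open design question in the comments above; the sketch only shows the bit manipulation itself.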