-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iterative generic math changes to match API review #69391
Conversation
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
if (destination.Length >= sizeof(byte)) | ||
{ | ||
byte value = m_value; | ||
Unsafe.WriteUnaligned(ref MemoryMarshal.GetReference(destination), value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this result in better codegen than just destination[0] = m_value;
? I'd have expected the guard above (which could also be if (!destination.IsEmpty)
) to remove the bounds check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not, but its simpler to implement and compare the code when they are all roughly the same size/shape so I've mostly just been doing write once
, copy/paste
, small tweak
.
I can update if you'd prefer it do the simpler thing here.
/// <summary>Tries to write the current value, in big-endian format, to a given span.</summary> | ||
/// <param name="destination">The span to which the current value should be written.</param> | ||
/// <param name="bytesWritten">The number of bytes written to <paramref name="destination" />.</param> | ||
/// <returns><c>true</c> if the value was succesfully written to <paramref name="destination" />; otherwise, <c>false</c>.</returns> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the by; my understanding is that <c>true</c>
and <c>false</c>
should actually be <see langword="true" />
and <see langword="false"/>
so that they capitalize when changing to the VB version. (Same with null
=> Nothing)
Carlos' tool probably handles those as special-cased; but thought I'd share for future reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carlossanlop is this handled or should I be explicitly using see langword
?
|
||
i++; | ||
} | ||
while ((part == 0) && (i < bits.Length)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What spins here, and do we have boundary tests for it? (e.g. something does the do and exits to the other loop, something else runs the body twice, something runs thrice, and maybe 4 isn't possible?)
(A comment in the code to help future maintainers would be helpful).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No sure what you're asking for here? This handles the conversion from one's complement
to two's complement
.
The first loop handles 0x0000_0000
as the one's complement of that is 0x0000_0000
. The second loop handles the remaining bits in the value.
0x0000_0000_0000_0000_0000_0000
becomes0x0000_0000_0000_0000_0000_0000
0x0000_0000_0000_0000_0000_0001
becomes0xFFFF_FFFF_FFFF_FFFF_FFFF_FFFF
0x0000_0000_0000_0001_0000_0000
becomes0x0000_0000_FFFF_FFFF_0000_0000
This handles what is logically ~x + 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand what values of a BigInteger might cause the do/while to run more than once. Given that there's all of 1s complement, 2s complement, little endian, and big endian present, I can't follow the data flow in my head.
- Negative values
-1
..int.MinValue + 1
all fit in_sign
and_bits
is null, so none of those are relevant. int.MinValue
produces a one element_bits
value, so doesn't loop.int.MinValue - 1
(0xFFFFFFFF_7FFFFFFF) is, I believe, stored as{ 0x00000000, 0x80000000 }
- The first time through the loop we load the 0. So now do ~0 + 1, which I guess goes back to 0.
- Yay, this value looped.
- The second time through we load the 0x8000_0000 and produce, uh, 0x8000_0000?
- It's not zero, so we'd exit, but we're also out of data, so we'd exit anyways.
- The first time through the loop we load the 0. So now do ~0 + 1, which I guess goes back to 0.
So perhaps the answer is "anything in the tests ending in multiples of 4 zero-byte values (BE) went through the top loop (trailing_zero_bytes_count / 4) + 1 times".
If that's the case, I'd expect to see tests that
- Don't end in 4 zeros, but are at least 8 bytes (do+break+second_loop)
- End in 4 zeros and are 8 bytes (do+do+return)
- End in 4 zeros and are more than 8 bytes (do+do+break+second_loop)
- End in 8 zeros and are 12 bytes (do+do+do+return)
- End in 8 zeros and are more than 12 bytes (do+do+do+break+second_loop)
Ideally with each of the second_loop cases having a one/more_than_one case (and super ideally as one/two/more_than_two, since sometimes "two" is special)
If I understand the states now, I don't see any tests that produce 12 or more bytes, so there's nothing that runs that top loop 3 or more times.
The thing I'd hope for here is a comment talking about some of the boundary cases. i.e. a particular value that will run the top loop twice, then the bottom loop twice, but one more (or less?) would have a different execution characteristic (which I'm not feeling knowledgeable enough to predict if that means 1/3 or 3/1, or a black hole, or what). And those same values being codified in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this method in particular, we have tests covering everything but the 12 + byte
case because it requires significant more setup to do and represent.
I can certainly add it, but its not interesting IMO. This logic is the same as we have in WriteLittleEndian
and that is used in DangerousMakeTwosComplement
, etc. The only difference is that it writes from the end of the buffer to the beginning, rather than from the beginning to the end and the current tests ensure that we are computing the correct results with both loops executing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is the same as we have in WriteLittleEndian
Which looks to be equally under-tested.
and that is used in DangerousMakeTwosComplement, etc.
To me, the fact that there are three copies of the code means there need to be more tests, not fewer, so that someone who changes something about only two of the three is likely to see failures from the remaining one.
Having read the comments in DangerousMakeTwosComplement I now better understand the reason for two distinct pieces of logic (the first loop is the +1 of the overall (~x + 1), and then also any carry values from the lower significance segments), but I genuinely didn't get that impression when reading the TryWriteLittleEndian and TryWriteBigEndian code blocks... it just came across as magic.
I still think that tests in the 12-16 byte range are useful for both TryWriteBigEndian and TryWriteLittleEndian, and even a comment as sparse as what DangerousMakeTwosComplement has about the carry bit would go a long way toward maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. There was indeed an issue with a specific edge case value due to it not accounting for needing an extra "part" to hold the sign.
In particular, signed.MinValue - 1
is represented in one's complement
as 0x8000_0000, 0x0000_0001
, swapped to two's complement
this is 0x7FFF_FFFF, 0xFFFF_FFFF
which is of course signed.MaxValue
instead. We need it to be 0xFFFF_FFFF, 0x7FFF_FFFF, 0xFFFF_FFFF
so that the sign is still tracked instead.
|
||
Assert.True(FloatingPointHelper<double>.TryWriteSignificandBigEndian(double.NegativeInfinity, destination, out bytesWritten)); | ||
Assert.Equal(8, bytesWritten); | ||
Assert.Equal(new byte[] { 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, destination.ToArray()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: We have an AssertExtensions.SequenceEquals<T>(ROSpan<T>, ROSPan<T>)
for spans, if you don't like all the ToArray calls in your tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know for future changes. I might do that as a later cleanup. As for now, the current logic works and allows easily copying and tweaking logic between the 15-20 types that need it.
…r `GetShortestBitLength`
…e the one's complement format
This updates
GetShortestBitLength
to return anint
, exposes theWriteBigEndian
,WriteExponentBigEndian
,WriteSignificandBigEndian
APIs, and updatesSystem.Char
to explicitly implement the numeric interfaces.