Finish i128 support #62

dignifiedquire · 2018-07-21T15:57:12Z

Depends on #63 and #64

Implements BigDigit = u64 ~~behind a feature flag u64_digit~~ for all 64-bit targets with u128.

TODOs

find a way to make the new methods that take BigDigits for constructing BigUints private
Fix modpow, the last broken test.

Closes #40

cuviper · 2018-07-21T16:33:48Z

Thanks for taking this on! It will be a few days before I can review this in depth, but a few notes:

Running rustfmt is a good thing to do, but I definitely want that in a separate PR where it's clear that no code is actually changing. I recently ran cargo fmt on most of the other num crates, but I held off here because I didn't want to conflict with outstanding PRs.
Everything should use cfg(has_i128) rather than the feature, so auto-detection just works.
We need benchmark results, across multiple platforms -- at least 32-bit and 64-bit. I have a feeling we'll want to choose larger BigDigit only for some platforms. And even on 64-bit, I'm worried in particular that 128-bit division will be slow.
If we do use different sizes for different targets, it's probably best to decide that in the build script, and output another simple flag akin to has_i128.
I explicitly don't want methods with BigDigit in the public API. This was specifically cleaned up in 0.2 to prepare for supporting different sizes. IMO it's too much of a portability footgun for someone to write code that works fine on their main target, but becomes totally wrong on a different target.
I think it makes more sense to add the basic op support as its own PR, because that doesn't have as many questions or design decisions.
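
To make the build-script suggestion concrete, here is a minimal sketch of a `build.rs` that emits one extra flag next to `has_i128`. The flag name `u64_digit` is borrowed from this thread. The helper `digit_cfg`, the hard-coded `has_i128` value, and the selection policy are assumptions for illustration, not the crate's actual script.

```rust
// build.rs (sketch only; the policy below is an assumption)
use std::env;

/// Decide which cfg flag, if any, the build script should emit.
/// Hypothetical helper: wide digits only on 64-bit targets with i128 support.
fn digit_cfg(pointer_width: Option<&str>, has_i128: bool) -> Option<&'static str> {
    match (pointer_width, has_i128) {
        (Some("64"), true) => Some("u64_digit"),
        _ => None,
    }
}

fn main() {
    // Cargo passes the target's cfg values to build scripts as CARGO_CFG_* variables.
    let width = env::var("CARGO_CFG_TARGET_POINTER_WIDTH").ok();

    // Assumption: a real script would probe the compiler for i128 support
    // instead of hard-coding the answer.
    let has_i128 = true;

    if let Some(flag) = digit_cfg(width.as_deref(), has_i128) {
        // Makes `#[cfg(u64_digit)]` usable in the crate's source.
        println!("cargo:rustc-cfg={}", flag);
    }
}
```

Keeping the decision in a pure function makes it easy to later restrict wide digits to the platforms where benchmarks justify them.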

dignifiedquire · 2018-07-21T16:43:19Z

Thanks for the quick answer, I'll pull things into their own PRs, first one is here #63

dignifiedquire · 2018-07-23T18:20:49Z

I explicitly don't want methods with BigDigit in the public API. This was specifically cleaned up in 0.2 to prepare for supporting different sizes. IMO it's too much of a portability footgun for someone to write code that works fine on their main target, but becomes totally wrong on a different target.

I understand, but BigInt needs to be able to construct a BigUint from BigDigits, and they are not in the same module currently. So what would your preferred approach be for that?
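
For readers following along, the portability concern quoted above can be shown with a small sketch: the same value splits into different digit vectors depending on the digit width, so code that hard-codes one layout is wrong on the other. The functions `digits_u32` and `digits_u64` are hypothetical helpers, not part of num-bigint.

```rust
/// Split `value` into little-endian base-2^32 digits.
fn digits_u32(mut value: u128) -> Vec<u32> {
    let mut out = Vec::new();
    while value != 0 {
        out.push(value as u32); // truncating cast keeps the low 32 bits
        value >>= 32;
    }
    out
}

/// Split `value` into little-endian base-2^64 digits.
fn digits_u64(mut value: u128) -> Vec<u64> {
    let mut out = Vec::new();
    while value != 0 {
        out.push(value as u64); // truncating cast keeps the low 64 bits
        value >>= 64;
    }
    out
}

fn main() {
    let value: u128 = (1u128 << 32) + 5;
    println!("{:?}", digits_u32(value));
    println!("{:?}", digits_u64(value));
}
```

Here `2^32 + 5` is `[5, 1]` with 32-bit digits but `[4294967301]` with 64-bit digits.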

dignifiedquire · 2018-07-23T18:21:14Z

Everything should use cfg(has_i128) rather than the feature, so auto-detection just works.

Done for all the operations in #64

dignifiedquire · 2018-07-23T18:23:11Z

If we do use different sizes for different targets, it's probably best to decide that in the build script, and output another simple flag akin to has_i128.

I have introduced a new feature u64_digit, to guard the switch for now. I was thinking it could be shipped with the default being disabled, and then gather benchmarks and feedback on how it is working for a while, before turning it on by default on some platforms.
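
As a rough illustration of what such a switch controls, the digit type and its double-width companion can be chosen at compile time. This sketch keys off `target_pointer_width` rather than the `u64_digit` feature so that it compiles standalone. `DoubleBigDigit` and `mul_wide` are illustrative names that this thread does not confirm.

```rust
// Wide digits: each digit is a u64 and products need a u128.
#[cfg(target_pointer_width = "64")]
type BigDigit = u64;
#[cfg(target_pointer_width = "64")]
type DoubleBigDigit = u128;

// Narrow digits: each digit is a u32 and products fit in a u64.
#[cfg(not(target_pointer_width = "64"))]
type BigDigit = u32;
#[cfg(not(target_pointer_width = "64"))]
type DoubleBigDigit = u64;

/// Multiply two digits and return the (high, low) halves of the product.
fn mul_wide(a: BigDigit, b: BigDigit) -> (BigDigit, BigDigit) {
    let wide = (a as DoubleBigDigit) * (b as DoubleBigDigit);
    ((wide >> BigDigit::BITS) as BigDigit, wide as BigDigit)
}

fn main() {
    let (hi, lo) = mul_wide(BigDigit::MAX, BigDigit::MAX);
    println!("digit bits = {}, hi = {}, lo = {}", BigDigit::BITS, hi, lo);
}
```

Code written only against `BigDigit` and `DoubleBigDigit` behaves the same under either selection, which is what allows the switch to be made per target.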

cuviper · 2018-09-17T23:07:31Z

This needs a rebase after #63 and #64 landed. Were you ever able to improve the division performance, as we discussed in #40? If you could at least update the rest of what you've done, perhaps I can look at that too.

dignifiedquire · 2018-09-18T05:28:32Z

I'll do a rebase later today
I got modpow working, by switching to a different algorithm.
I got the perf only under control by using inline assembly, which you probably don't want to merge, so that will need another solution
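
For context on why division is the sticking point: with 64-bit digits, dividing a two-digit value by a single digit needs a 128-bit by 64-bit division. Below is a portable sketch of that step. It is not the inline-assembly version mentioned above and not necessarily the crate's code, and the name `div_wide` and its signature are assumptions.

```rust
type BigDigit = u64;
type DoubleBigDigit = u128;

/// Divide the two-digit value `hi * 2^64 + lo` by `divisor`.
/// Returns (quotient, remainder). The caller must ensure `hi < divisor`,
/// so that the quotient fits in a single digit.
fn div_wide(hi: BigDigit, lo: BigDigit, divisor: BigDigit) -> (BigDigit, BigDigit) {
    debug_assert!(hi < divisor);
    let lhs = ((hi as DoubleBigDigit) << 64) | (lo as DoubleBigDigit);
    let rhs = divisor as DoubleBigDigit;
    // A full u128 division; compilers typically lower this to a runtime
    // helper call rather than a single instruction, hence the performance worry.
    ((lhs / rhs) as BigDigit, (lhs % rhs) as BigDigit)
}

fn main() {
    // (1 * 2^64 + 0) / 2 = 2^63 with remainder 0
    println!("{:?}", div_wide(1, 0, 2));
}
```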

cuviper · 2018-09-18T17:46:51Z

OK, no rush, I'm just trying to get around to things I've been neglecting. :)

dignifiedquire · 2018-11-07T12:55:26Z

@cuviper I rebased this, and it is passing tests, (except serialization with u64 digits, I missed that earlier, just marked as unimplemented for now)

When you have some time, it would be great if we can work through the remaining issues to figure out how to get this ready for merge.

cuviper · 2019-08-10T01:30:12Z

@dignifiedquire -- If you're still around, I think I've reconciled the performance losses well enough, and I've made quite a few other cleanups as well. A counter-review would be appreciated.

cc @maxbla -- since you've been hacking on num lately, maybe you'd like to review this too?

I have no doubt there are more cleanup opportunities, but I hope it's Good Enough that we can merge for the performance benefits at hand.

maxbla · 2019-08-11T06:52:14Z

I'd be happy to take a look. A comprehensive review might take some time - this PR seems fairly substantial.

cuviper · 2019-08-11T17:34:28Z

I'll appreciate whatever level of detail you can manage, thanks!

cuviper · 2019-08-12T16:58:12Z

Rebased to resolve conflicts with the quickcheck PR.

maxbla

Tl;dr I goofed. The performance is the same between 'master' and 'u128' on 32-bit.

I think I've reconciled the performance losses well enough

I dug out my old 32-bit laptop to see if performance is the same between master and this branch. It seems kind of far off, but I might be missing something.
Running cargo bench --all-features factorial_mul_biguint, I get 1,195,380 ns/iter for master (which uses the normal cargo bench) and 2.1822 ms/iter on this branch (which uses criterion.rs). This was all on nightly Rust, as the old bencher only works on nightly. There's a lot going on here -- criterion seems to do some more complicated stuff (like "warming up for 3s" before benching).

How have you been comparing the performance of master to this branch and ~~did I do anything obviously wrong?~~ I was completely mistaken. I accidentally ran the benchmarks on master from this repo when I meant to run them on u128. The performance is completely in line - 1.197 ms vs 1.195 ms.

build.rs

src/algorithms.rs

src/bigint.rs

maxbla · 2019-09-03T18:09:52Z

src/bigint.rs

-        r
+        if let Some(other) = other.to_u32() {
+            self % other
+        } else if let Some(other) = other.to_i32() {


Since other is unsigned (because it is a BigInt), how could other.to_u32() return None but other.to_i32() return Some(_)?

i.e. I think you can delete lines 1877 and 1878

Are you thinking of BigUint? BigInt can be negative...

Yeah, you're right. I even retyped BigInt and it didn't click for me that that type is signed.

Since we're discussing this function, I suspect many instantiated BigInts (in the wild) will be small and, further, many will be bigger than one word. My point is that the range 2.1 billion..4.3 billion is probably pretty uncommon, so it might make sense to just drop the u32 branch entirely. Alternatively, it might make sense to try to cast to BigDigit for the unsigned branch (is there an easy way to do that?)

The conversion checks should be relatively trivial, compared to the actual division. Rem<i32> even forwards to Rem<u32> with added sign handling, which is why it seems to make sense to deal with u32 directly first. I don't feel strongly, but it feels like a micro-optimization to worry about this much, where the real win is just in dividing by a primitive.
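
A small sketch of the dispatch being discussed, using `i128` as a stand-in for `BigInt` so it runs without the crate. The names `rem_u32`, `rem_i32`, `rem_dispatch`, and `Path` are hypothetical. The point is only that a negative divisor fails the `u32` conversion but can still succeed as `i32`, and that the signed case can forward to the unsigned one.

```rust
use std::convert::TryFrom;

/// Which branch handled the remainder (for illustration only).
#[derive(Debug, PartialEq)]
enum Path {
    U32,
    I32,
    General,
}

/// Remainder by an unsigned primitive; the result takes the dividend's sign.
fn rem_u32(value: i128, divisor: u32) -> i128 {
    value % i128::from(divisor)
}

/// Remainder by a signed primitive, forwarded to the unsigned case.
/// With truncating division only the divisor's magnitude matters.
fn rem_i32(value: i128, divisor: i32) -> i128 {
    rem_u32(value, divisor.unsigned_abs())
}

/// Try the cheap primitive paths first, then fall back to the general one.
/// Like the primitive `%`, this panics if `other` is zero.
fn rem_dispatch(value: i128, other: i128) -> (Path, i128) {
    if let Ok(d) = u32::try_from(other) {
        (Path::U32, rem_u32(value, d))
    } else if let Ok(d) = i32::try_from(other) {
        (Path::I32, rem_i32(value, d))
    } else {
        (Path::General, value % other)
    }
}

fn main() {
    println!("{:?}", rem_dispatch(7, 3));
    println!("{:?}", rem_dispatch(7, -3));
    println!("{:?}", rem_dispatch(7, 5_000_000_000));
}
```

In this sketch the `I32` branch is reached only for negative divisors, since every non-negative `i32` also fits in `u32`.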

Maybe you'd like to focus on this for a followup performance PR?

Yeah, it would be silly to make this type of change without measuring. I am interested in doing some performance work, but I'm not exactly sure how to do it. It seems like num-bigint has used criterion.rs at some point, but I couldn't find it in the git log. I think criterion would probably be very helpful for optimizing.

src/monty.rs

cuviper · 2019-09-05T23:24:22Z

@maxbla Thanks for your review!

I noticed that you had comments for almost every file except biguint.rs, which has a lot of changes -- did you miss this? GitHub "helpfully" hides it:

Large diffs are not rendered by default.

src/biguint.rs

maxbla · 2019-09-06T20:05:10Z

src/biguint.rs

+///
+/// This is an internal `pub(crate)`-ish API only!
+#[inline]
+pub fn biguint_from_vec(digits: Vec<BigDigit>) -> BigUint {


This function is pub, but it requires a Vec<BigDigit>, so it effectively can't be used externally. The #[doc(hidden)] attribute would hide it from public documentation. There technically could be a name conflict if someone did use num_bigint::*, but that seems unlikely.

This is in the private mod biguint though, so it's truly inaccessible outside of the crate, as long as we don't accidentally re-export it elsewhere. It should be made pub(crate) whenever we raise our rustc minimum sufficiently though.
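
The visibility argument can be sketched as follows: a `pub fn` inside a private module is reachable from sibling modules in the same crate, but not from outside unless it is re-exported. Only `biguint_from_vec`, `BigDigit`, `BigUint`, and the module name `biguint` come from this thread. The struct layouts, `digit_len`, `bigint_from_parts`, and the trailing-zero normalization are assumptions for illustration.

```rust
mod biguint {
    pub type BigDigit = u64;

    #[derive(Debug, PartialEq)]
    pub struct BigUint {
        data: Vec<BigDigit>,
    }

    impl BigUint {
        /// Number of digits after normalization (illustrative accessor).
        pub fn digit_len(&self) -> usize {
            self.data.len()
        }
    }

    /// `pub`, but only nameable inside the crate because `mod biguint` is private.
    pub fn biguint_from_vec(mut digits: Vec<BigDigit>) -> BigUint {
        // Assumed normalization: drop high-order zero digits.
        while digits.last() == Some(&0) {
            digits.pop();
        }
        BigUint { data: digits }
    }
}

mod bigint {
    use super::biguint::{biguint_from_vec, BigDigit, BigUint};

    pub struct BigInt {
        pub negative: bool,
        pub magnitude: BigUint,
    }

    /// A sibling module can build a BigUint from raw digits.
    pub fn bigint_from_parts(negative: bool, digits: Vec<BigDigit>) -> BigInt {
        BigInt { negative, magnitude: biguint_from_vec(digits) }
    }
}

// Only the type is re-exported; the digit-based constructor stays internal.
pub use biguint::BigUint;

fn main() {
    let n = bigint::bigint_from_parts(true, vec![1, 0, 0]);
    println!("negative = {}, digits = {}", n.negative, n.magnitude.digit_len());
}
```

The guarantee rests on nobody adding a re-export of the constructor, which is why `pub(crate)` is the sturdier choice once the minimum rustc allows it.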

It seems like half my comments are due to oversight on my part :/

cuviper · 2020-01-13T21:16:30Z

OK, let's finally take the plunge in the upcoming 0.3 release! Thanks again @dignifiedquire for your initial work, and also @maxbla for helping with the review.

bors r+

62: Finish i128 support r=cuviper a=dignifiedquire

Depends on #63 and #64

Implements `BigDigit = u64` ~~behind a feature flag `u64_digit`~~ for all 64-bit targets with `u128`.

TODOs

- [x] find a way to make the new methods that take `BigDigit`s for constructing `BigUint`s private
- [x] Fix `modpow`, the last broken test.

Closes #40

Co-authored-by: dignifiedquire <dignifiedquire@gmail.com>
Co-authored-by: Josh Stone <cuviper@gmail.com>

bors · 2020-01-13T21:31:33Z

Build succeeded


129: Fix the bit shifts at the end of Toom-3 r=cuviper a=cuviper

This fixes a regression from 64-bit `BigDigit` (#62).

Co-authored-by: Josh Stone <cuviper@gmail.com>

dignifiedquire force-pushed the u128 branch 2 times, most recently from 5ee8bff to 78b142c Compare July 22, 2018 21:25

dignifiedquire force-pushed the u128 branch from 3f85d57 to a49f0a2 Compare November 7, 2018 12:44

cuviper force-pushed the u128 branch from 329aa2f to f4af5ec Compare August 7, 2019 01:36

cuviper force-pushed the u128 branch from 3630bc9 to d96aaab Compare August 12, 2019 16:57