Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish i128 support #62

Merged
merged 40 commits into from
Jan 13, 2020
Merged

Finish i128 support #62

merged 40 commits into from
Jan 13, 2020

Conversation

dignifiedquire
Copy link
Contributor

@dignifiedquire dignifiedquire commented Jul 21, 2018

Depends on #63 and #64

Implements BigDigit = u64 behind a feature flag u64_digit for all 64-bit targets with u128.

TODOs

  • find a way to make the new methods that take BigDigits for constructing BigUints private
  • Fix modpow, the last broken test.

Closes #40

@cuviper
Copy link
Member

cuviper commented Jul 21, 2018

Thanks for taking this on! It will be a few days before I can review this in depth, but a few notes:

  • Running rustfmt is a good thing to do, but I definitely want that in a separate PR where it's clear that no code is actually changing. I recently ran cargo fmt on most of the other num crates, but I held off here because I didn't want to conflict with outstanding PRs.

  • Everything should use cfg(has_i128) rather than the feature, so auto-detection just works.

  • We need benchmark results, across multiple platforms -- at least 32-bit and 64-bit. I have a feeling we'll want to choose larger BigDigit only for some platforms. And even on 64-bit, I'm worried in particular that 128-bit division will be slow.

  • If we do use different sizes for different targets, it's probably best to decide that in the build script, and output another simple flag akin to has_i128.

  • I explicitly don't want methods with BigDigit in the public API. This was specifically cleaned up in 0.2 to prepare for supporting different sizes. IMO it's too much of a portability footgun for someone to write code that works fine on their main target, but becomes totally wrong on a different target.

  • I think it makes more sense to add the basic op support as its own PR, because that doesn't have as many questions or design decisions.

@dignifiedquire
Copy link
Contributor Author

Thanks for the quick answer, I'll pull things into their own PRs, first one is here #63

@dignifiedquire dignifiedquire force-pushed the u128 branch 2 times, most recently from 5ee8bff to 78b142c Compare July 22, 2018 21:25
@dignifiedquire
Copy link
Contributor Author

I explicitly don't want methods with BigDigit in the public API. This was specifically cleaned up in 0.2 to prepare for supporting different sizes. IMO it's too much of a portability footgun for someone to write code that works fine on their main target, but becomes totally wrong on a different target.

I understand, but BigInt needs to be able to construct a BigUint from BigDigits, and they are not in the same module currently. So what would your preferred approach be for that?

@dignifiedquire
Copy link
Contributor Author

Everything should use cfg(has_i128) rather than the feature, so auto-detection just works.

Done for all the operations in #64

@dignifiedquire
Copy link
Contributor Author

If we do use different sizes for different targets, it's probably best to decide that in the build script, and output another simple flag akin to has_i128.

I have introduced a new feature u64_digit, to guard the switch for now. I was thinking it could be shipped with the default being disabled, and then gather benchmarks and feedback on how it is working for a while, before turning it on by default on some platforms.

@cuviper
Copy link
Member

cuviper commented Sep 17, 2018

This needs a rebase after #63 and #64 landed. Were you ever able to improve the division performance, as we discussed in #40? If you could at least update the rest of what you've done, perhaps I can look at that too.

@dignifiedquire
Copy link
Contributor Author

  • I'll do a rebase later today
  • I got modpow working, by switching to a different algorithm.
  • I got the perf only under control by using inline assembly, which you probably don't want to merge, so that will need another solution

@cuviper
Copy link
Member

cuviper commented Sep 18, 2018

OK, no rush, I'm just trying to get around to things I've been neglecting. :)

@dignifiedquire
Copy link
Contributor Author

@cuviper I rebased this, and it is passing tests, (except serialization with u64 digits, I missed that earlier, just marked as unimplemented for now)

When you have some time, it would be great if we can work through the remaining issues to figure out how to get this ready for merge.

@cuviper
Copy link
Member

cuviper commented Aug 10, 2019

@dignifiedquire -- If you're still around, I think I've reconciled the performance losses well enough, and I've made quite a few other cleanups as well. A counter-review would be appreciated.

cc @maxbla -- since you've been hacking on num lately, maybe you'd like to review this too?

I have no doubt there are more cleanup opportunities, but I hope it's Good Enough that we can merge for the performance benefits at hand.

@maxbla
Copy link
Contributor

maxbla commented Aug 11, 2019

I'd be happy take a look. A comprehensive review might take some time - this PR seems fairly substantial.

@cuviper
Copy link
Member

cuviper commented Aug 11, 2019

I'll appreciate whatever level of detail you can manage, thanks!

@cuviper
Copy link
Member

cuviper commented Aug 12, 2019

Rebased to resolve conflicts with the quickcheck PR.

Copy link
Contributor

@maxbla maxbla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tl;dr I goofed. The performance is the same between 'master' and 'u128' on 32-bit.

I think I've reconciled the performance losses well enough

I dug out my old 32-bit laptop to see if performance is the same between master and this branch. It seems kind of far off, but I might be missing something.
running cargo bench --all-features factorial_mul_biguint I get 1,195,380 ns/iter for master (uses normal cargo bench) and 2.1822ms/iter in this branch (uses criterion.rs). This was all on nightly Rust, as the old bencher only works on nightly rust. There's a lot going on here -- criterion seems to do some more complicated stuff (like "warming up for 3s" before benching).

How have you been comparing the performance of master to this branch and did I do anything obviously wrong? I was completely mistaken. I accidentally rand the bench on master from this repo, when I meant to run them on u128. The performance is completely in line - 1.197ms vs 1.195ms.

build.rs Show resolved Hide resolved
src/algorithms.rs Outdated Show resolved Hide resolved
src/algorithms.rs Outdated Show resolved Hide resolved
src/bigint.rs Show resolved Hide resolved
r
if let Some(other) = other.to_u32() {
self % other
} else if let Some(other) = other.to_i32() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since other is unsigned (because it is a BigInt), how could other.to_u32() return None but other.to_i32() return Some(_)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. I think you can delete lines 1877 and 1878

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking of BigUint? BigInt can be negative...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. I even retyped BigInt and it didn't click for me that that type is signed.

Since we're discussing this function, I suspect many instantiated BigInts (in the wild) will be small and further, many will be bigger than one word. My point is that the range 2.1 billion.. 4.3 billion is probably pretty uncommon, so it might make sense to just drop u32 branch entirely. Alternatively it might make sense to try to cast to bigdigit for the unsigned branch (is there an easy way to do that?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion checks should be relatively trivial, compared to the actual division. Rem<i32> even forwards to Rem<u32> with added sign handling, which is why it seems to make sense to deal with u32 directly first. I don't feel strongly, but it feels like a micro-optimization to worry about this much, where the real win is just in dividing by a primitive.

Maybe you'd like to focus on this for a followup performance PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would be silly to make this type of change without measuring. I am interested in doing some performance work, but I'm not exactly sure how to do it. It seems like num-bigint has used criterion.rs at some point? but I couldn't find it in the git log. I think criterion would probably be very helpful for optimizing.

src/monty.rs Outdated Show resolved Hide resolved
src/monty.rs Show resolved Hide resolved
@cuviper
Copy link
Member

cuviper commented Sep 5, 2019

@maxbla Thanks for your review!

I noticed that you had comments for almost every file except biguint.rs, which has a lot of changes -- did you miss this? GitHub "helpfully" hides it:

Large diffs are not rendered by default.

src/biguint.rs Show resolved Hide resolved
///
/// This is an internal `pub(crate)`-ish API only!
#[inline]
pub fn biguint_from_vec(digits: Vec<BigDigit>) -> BigUint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is pub, but it requires a Vec<BigDigit>, so it effectively can't be used externally. The #[doc(hidden)] attribute would hide it form public documentation. There technically could be a name conflict if someone did use num_bigint::*, but that seems unlikely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the private mod biguint though, so it's truly inaccessible outside of the crate, as long as we don't accidentally re-export it elsewhere. It should be made pub(crate) whenever we raise our rustc minimum sufficiently though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like half my comments are due to oversight on my part :/

@cuviper cuviper mentioned this pull request Jan 10, 2020
7 tasks
@cuviper
Copy link
Member

cuviper commented Jan 13, 2020

OK, let's finally take the plunge in the upcoming 0.3 release! Thanks again @dignifiedquire for your initial work, and also @maxbla for helping with the review.

bors r+

bors bot added a commit that referenced this pull request Jan 13, 2020
62: Finish i128 support r=cuviper a=dignifiedquire

Depends on #63 and #64 

Implements `BigDigit = u64` ~~behind a feature flag `u64_digit`~~ for all 64-bit targets with `u128`.

TODOs
- [x] find a way to make the new methods that take `BigDigit`s for constructing `BigUint`s private
- [x] Fix `modpow`, the last broken test.

Closes #40 


Co-authored-by: dignifiedquire <dignifiedquire@gmail.com>
Co-authored-by: Josh Stone <cuviper@gmail.com>
@bors
Copy link
Contributor

bors bot commented Jan 13, 2020

Build succeeded

@bors bors bot merged commit 69267c5 into rust-num:master Jan 13, 2020
bors bot added a commit that referenced this pull request Jan 14, 2020
129: Fix the bit shifts at the end of Toom-3 r=cuviper a=cuviper

This fixes a regression from 64-bit `BigDigit` (#62).

Co-authored-by: Josh Stone <cuviper@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for 128-bit primitives
3 participants