Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite integer formatting #12382

Merged
merged 3 commits into from
Feb 22, 2014
Merged

Rewrite integer formatting #12382

merged 3 commits into from
Feb 22, 2014

Conversation

brendanzab
Copy link
Member

This is PR is the beginning of a complete rewrite and ultimate removal of the std::num::strconv module (see #6220), and the removal of the ToStrRadix trait in favour of using the std::fmt functionality directly. This should make for a cleaner API, encourage less allocation, and make the implementation more comprehensible .

The Formatter::{pad_integral, with_padding} methods have also been refactored make things easier to understand.

The formatting tests for integers have been moved out of run-pass/ifmt.rs in order to provide more immediate feedback when building using make check-stage2-std NO_REBUILD=1.

Arbitrary radixes are now easier to use in format strings. For example:

assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");

The benchmarks have been standardised between std::num::strconv and std::num::fmt to make it easier to compare the performance of the different implementations.

 type | radix | std::num::strconv      | std::num::fmt
======|=======|========================|======================
 int  | bin   | 1748 ns/iter (+/- 150) | 321 ns/iter (+/- 25)
 int  | oct   |  706 ns/iter (+/- 53)  | 179 ns/iter (+/- 22)
 int  | dec   |  640 ns/iter (+/- 59)  | 207 ns/iter (+/- 10)
 int  | hex   |  637 ns/iter (+/- 77)  | 205 ns/iter (+/- 19)
 int  | 36    |  446 ns/iter (+/- 30)  | 309 ns/iter (+/- 20)
------|-------|------------------------|----------------------
 uint | bin   | 1724 ns/iter (+/- 159) | 322 ns/iter (+/- 13)
 uint | oct   |  663 ns/iter (+/- 25)  | 175 ns/iter (+/- 7)
 uint | dec   |  613 ns/iter (+/- 30)  | 186 ns/iter (+/- 6)
 uint | hex   |  519 ns/iter (+/- 44)  | 207 ns/iter (+/- 20)
 uint | 36    |  418 ns/iter (+/- 16)  | 308 ns/iter (+/- 32)

@@ -25,10 +25,12 @@ let y: int = x.unwrap();
Use [`ToStrRadix`](http://static.rust-lang.org/doc/master/std/num/trait.ToStrRadix.html).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This text?

@brendanzab
Copy link
Member Author

r? @alexcrichton

use vec::{ImmutableVector, MutableVector};

/// A type that represents a specific radix
pub trait Radix {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat trick!

However, it doesn't seem to support arbitrary bases. What about these being normal methods fn base(&self) -> u8 (etc.) and having

pub struct Binary;
impl Radix for Binary { fn base(&self) -> u8 { 2 } /* ... */ }
pub struct Octal;
// ...

pub struct GenericRadix { priv base: u8 }
impl Radix for GenerixRadix { fn radix(&self) { self.base } /* ... */ }

which would give the best of both worlds: static dispatch for the common cases but still enough flexibility to handle arbitrary bases. (And also allows for reducing code-bloat via choosing to use the GenericRadix version over monomorphising for the others, similar to a &Radix trait object, but without the extra cost of virtual calls.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey that's a cool idea. Will do.

@brson
Copy link
Contributor

brson commented Feb 19, 2014

Sweet perf wins. 🚀

}

// The bytes were accumulated in reverse order, so write
// them back-to-front.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be a bit faster if you accumulate the digits backwards starting at the end of the buffer. That way you only need to make one write call to the underlying Writer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WOAH. Mega improvement. Writing an int with a decimal radix now takes 204 ns, down from 640 ns with strconv.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! (Remember to update the PR/commit messages before it lands :) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@zkamsler
Copy link
Contributor

Cool stuff. It reminds me that I ought to dust off the floating point formatting code that I had written a few weeks ago.

@brendanzab
Copy link
Member Author

Yeah, I'm looking at the Dragon4 (Steele) and Grisu3 (Loitsch) algorithms when it comes to formatting floating point. But you're welcome to look at it too. It's much more complicated than the integer stuff :)

@brendanzab
Copy link
Member Author

@alexcrichton Ok, I think I'm done.

if_ok!(write_prefix(self)); self.buf.write(buf)
}
// If we're under the maximum width, check if we're over the minimum
// width, if so then we can also write bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems copied from below, there seems to be no check against the "maximum width" which I think in the case of integers is precision.

@alexcrichton
Copy link
Member

This is super awesome. Perf wins, cleanups, hurrah!

I'd like to bikeshed the interface for a brief amount of time if that's ok. Some thoughts I have:

  • num::fmt => fmt::num?
  • These Radix structs have a bit of an odd api. It may be nicer to have clearly defined top-level functions inside of the module which use the Radix structures/traits internally. Basically it would be nice to avoid exposing one more trait.

What do you think?

@brendanzab
Copy link
Member Author

Ok, updated the API and moved it to std::fmt::num.

You can now do:

assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");

I haven't made the other things private, because I still think its useful to be able to write the numbers via a Formatter directly.

/// # Example
///
/// ~~~
/// # use std::num::fmt::Radix;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use may need to be updated. I think it'd be useful to show it in the documentation as well.

Could you also expand the documentation a bit about how the value can be passed to format! to format the specified integer with the specified base?

}

#[test]
fn test_format_int_overflow() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this test? Would it be clear to be using i8::MIN (etc.) directly rather than disguising the logic behind overflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to test that the signs are correctly formatted before and after the padding. I don't see how overflow is happening here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, sorry, was reading the wrong test. It's to test that the two's complement thing works ok. Which I was bitten by before.

This works towards a complete rewrite and ultimate removal of the `std::num::strconv` module (see rust-lang#6220), and the removal of the `ToStrRadix` trait in favour of using the `std::fmt` functionality directly. This should make for a cleaner API, encourage less allocation, and make the implementation far more comprehensible.

The `Formatter::pad_integral` method has also been refactored make it easier to understand.

The formatting tests for integers have been moved out of `run-pass/ifmt.rs` in order to provide more immediate feedback when building using `make check-stage2-std NO_REBUILD=1`.

The benchmarks have been standardised between std::num::strconv and std::num::fmt to make it easier to compare the performance of the different implementations.

Arbitrary radixes are now easier to use in format strings. For example:

~~~
assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");
~~~
This is in preparation to remove the implementations of ToStrRadix in integers, and to remove the associated logic from `std::num::strconv`.

The parts that still need to be liberated are:

- `std::fmt::Formatter::runplural`
- `num::{bigint, complex, rational}`
@brendanzab
Copy link
Member Author

@alexcrichton rebased

bors added a commit that referenced this pull request Feb 22, 2014
This is PR is the beginning of a complete rewrite and ultimate removal of the `std::num::strconv` module (see #6220), and the removal of the `ToStrRadix` trait in favour of using the `std::fmt` functionality directly. This should make for a cleaner API, encourage less allocation, and make the implementation more comprehensible .

The `Formatter::{pad_integral, with_padding}` methods have also been refactored make things easier to understand.

The formatting tests for integers have been moved out of `run-pass/ifmt.rs` in order to provide more immediate feedback when building using `make check-stage2-std NO_REBUILD=1`.

Arbitrary radixes are now easier to use in format strings. For example:

~~~rust
assert_eq!(format!("{:04}", radix(3, 2)), ~"0011");
~~~

The benchmarks have been standardised between `std::num::strconv` and `std::num::fmt` to make it easier to compare the performance of the different implementations.

~~~
 type | radix | std::num::strconv      | std::num::fmt
======|=======|========================|======================
 int  | bin   | 1748 ns/iter (+/- 150) | 321 ns/iter (+/- 25)
 int  | oct   |  706 ns/iter (+/- 53)  | 179 ns/iter (+/- 22)
 int  | dec   |  640 ns/iter (+/- 59)  | 207 ns/iter (+/- 10)
 int  | hex   |  637 ns/iter (+/- 77)  | 205 ns/iter (+/- 19)
 int  | 36    |  446 ns/iter (+/- 30)  | 309 ns/iter (+/- 20)
------|-------|------------------------|----------------------
 uint | bin   | 1724 ns/iter (+/- 159) | 322 ns/iter (+/- 13)
 uint | oct   |  663 ns/iter (+/- 25)  | 175 ns/iter (+/- 7)
 uint | dec   |  613 ns/iter (+/- 30)  | 186 ns/iter (+/- 6)
 uint | hex   |  519 ns/iter (+/- 44)  | 207 ns/iter (+/- 20)
 uint | 36    |  418 ns/iter (+/- 16)  | 308 ns/iter (+/- 32)
~~~
@bors bors closed this Feb 22, 2014
@bors bors merged commit 6943acd into rust-lang:master Feb 22, 2014
@brendanzab brendanzab deleted the fmt-int branch February 22, 2014 19:29
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 25, 2022
internal: Make autoclosing angle brackets configurable, disabled by default

cc rust-lang/rust-analyzer#12379
flip1995 pushed a commit to flip1995/rust that referenced this pull request Mar 7, 2024
Bump macOS CI version to 13

I looked into the random failures we've been getting on macOS CI, not entirely sure what is going on but it seems to be an issue out of our department. A [plain driver](https://github.com/Alexendoo/rust-clippy/blob/314425001df19a82c295952b26664704d06e9348/src/driver.rs) being ran on the test files [without `ui_test` or any special flags](https://github.com/Alexendoo/rust-clippy/blob/314425001df19a82c295952b26664704d06e9348/src/driver.rs) hit the same issue

It didn't occur on `macos-13` the few times I tried it though so let's upgrade to that. [The current `macos-latest` refers to `macos-12`](https://github.com/actions/runner-images?tab=readme-ov-file#available-images), later this year it will refer to `macos-14` which runs on `aarch64` so specifying the version for our x64 macOS tests will also save a future headache

r? `@flip1995`

changelog: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants