Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Platform-independent random numbers [using stdlib copy] #469

Merged
merged 7 commits into from
Aug 14, 2019
Merged

Platform-independent random numbers [using stdlib copy] #469

merged 7 commits into from
Aug 14, 2019

Conversation

halflearned
Copy link
Member

Third -- and hopefully final -- attempt at #379.

In our last meeting, we decided that we should try once again to copy from boost a way that 1) would not bloat the third party dependencies with hundreds of boost files, but also 2) be easily reproducible. Unfortunately, I wasn't able to achieve both goals simultaneously. Boost has a very complex type system, and I wasn't able to extricate the necessary functions without bringing a few hundred boost files along.

Instead, I decided to take a final look at standard library's <random> and noticed that most of the dependencies came from operator<<. Having removed those methods, it was not hard to pull the necessary classes from the standard library.

Major changes

  • Two new files in third_party: grfstd/algorithm.hpp and grfstd/random.hpp, which are nearly exact copies of the relevant parts of algorithm and random headers, with minor differences such as adding prefix std:: to some of the function calls.

  • Calls to std::shuffle and std::*_distribution are now grfstd::shuffle and grfstd::*_distribution.

@halflearned halflearned changed the title Replacing std -> grfstd Platform-independent random numbers [using stdlib copy] Aug 4, 2019
Copy link
Member

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is looking great, the approach is very clean. A couple small comments, then I think it is ready to merge:

  • I think the common naming convention is nonstd for something from the standard library that's been copied or modified. This is the namespace for the implementation of 'optional' we copied in, for example. We could rename the grfstd directory to random to distinguish it from optional.
  • It would be good to add a README.md to the directory stating how the code was pulled together and why it exists. Important information would include what compiler's standard library was used, and also what modifications were made to the files (I think it is just removing methods related to operator<< as well as uses of _LIBCPP macros?)

@halflearned
Copy link
Member Author

Thanks for the review, @jtibshirani! I'm also pretty happy with how this turned out.

On nonstd: that's a good point, I should have checked what was done for optional.

On the README: will do.

I'm a little tied up today and tomorrow, but by Friday I should have this done.

@halflearned
Copy link
Member Author

Thank for the review, @jtibshirani! It should be ready for a second round whenever you have the time.

Copy link
Member

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

After this merges, I plan to enable the characterization tests in CI. Having them enabled in CI is nice on its own, and will also serve as a sort of test for this PR.

core/third_party/random/README.md Outdated Show resolved Hide resolved
core/third_party/random/README.md Show resolved Hide resolved
@halflearned halflearned merged commit 93ce8d3 into grf-labs:master Aug 14, 2019
@halflearned halflearned deleted the random-grfstd branch August 14, 2019 01:27
erikcs added a commit to erikcs/grf that referenced this pull request Aug 16, 2019
* master:
  Fix a typo in the `@keywords` roxygen directive. (grf-labs#481)
  Ensuring DiceKriging does not break forest training (grf-labs#455)
  CI: Disable lintr (grf-labs#480)
  Platform-independent random numbers [using stdlib copy] (grf-labs#469)
  test_regression_forest.R: fix argument name (grf-labs#477)
  CI: Only warn on lint error (grf-labs#474)
jtibshirani added a commit that referenced this pull request Aug 18, 2019
Previously, the C++ characterization tests only passed when using clang because
of differences in the way random numbers are generated across platforms.
Because we build on a couple different platforms, we had to disable these tests
in CI.

Now that we've added platform-independent random number generation in #469, we
can enable the characterization tests.
jtibshirani added a commit that referenced this pull request Aug 19, 2019
Previously, the C++ characterization tests only passed when using clang because
of differences in the way random numbers are generated across platforms.
Because we build on a couple different platforms, we had to disable these tests
in CI.

Now that we've added platform-independent random number generation in #469, we
can enable the characterization tests.
jtibshirani added a commit that referenced this pull request Aug 21, 2019
Previously, the C++ characterization tests only passed when using clang because
of differences in the way random numbers are generated across platforms.
Because we build on a couple different platforms, we had to disable these tests
in CI.

Now that we've added platform-independent random number generation in #469, we
can enable the characterization tests.
jtibshirani added a commit that referenced this pull request Aug 21, 2019
Previously, the C++ characterization tests only passed when using clang because
of differences in the way random numbers are generated across platforms.
Because we build on a couple different platforms, we had to disable these tests
in CI.

Now that we've added platform-independent random number generation in #469 and
#492, we can enable the characterization tests.
davidahirshberg pushed a commit to davidahirshberg/grf that referenced this pull request Dec 6, 2019
davidahirshberg pushed a commit to davidahirshberg/grf that referenced this pull request Dec 6, 2019
Previously, the C++ characterization tests only passed when using clang because
of differences in the way random numbers are generated across platforms.
Because we build on a couple different platforms, we had to disable these tests
in CI.

Now that we've added platform-independent random number generation in grf-labs#469 and
grf-labs#492, we can enable the characterization tests.
@erikcs
Copy link
Member

erikcs commented Jul 9, 2021

Making a note of that:

After #1006 core GRF is tested to produce the same result with three different compilers (latest version), using the same seed and number of threads on

  • GCC
  • Clang
  • MSVC

Using Intel's C++ compiler ICC does not give the exact same results, but close. Evidently its compiler optimizations are more "aggressive", if they are set to -O0 or -O1 the ForestCharacterizationTest matches exactly on the linux machine I tried out ICC 19.0.0.117 on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants