Update BenchmarksGame benchmarks to latest #13994

JosephTremoulet · 2017-09-14T20:08:54Z

Previously, for each benchmark, we had a copy of what was the fastest serial implementation (since the mechanics of tracking these and spotting regressions/improvements through noise are easier than for parallel ones). Update them all to reflect current standings, and to include both the best overall implementation and the best serial implementation for each. Test names are suffixed to match the indices used to distinguish them on the Benchmarks Game website.

JosephTremoulet · 2017-09-14T20:09:03Z

@dotnet/jit-contrib

JosephTremoulet · 2017-09-14T20:11:59Z

@dotnet/rap-team

jorive · 2017-09-14T21:03:01Z

@dotnet-bot test Windows_NT x64 perf
@dotnet-bot test Windows_NT x86 perf
@dotnet-bot test linux perf flow

JosephTremoulet · 2017-09-14T21:28:50Z

@dotnet-bot test Ubuntu x64 Checked Build and Test (failure was an I/O error)

AndyAyersMS

Would it make sense to record the version numbers of each variant from their source control system? Also maybe note the URL for it somewhere?

Also have you looked at per-iteration times to see if they're reasonable?

AndyAyersMS · 2017-09-14T23:27:54Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/reverse-complement/harness-helpers.cs

+// See the LICENSE file in the project root for more information.
+
+// Helper functionality to locate inputs and find outputs for
+// k-nucleotide benchmark in CoreCLR test harness


Typo: "for reverse-complement"

AndyAyersMS · 2017-09-14T23:28:45Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/reverse-complement/reverse-complement-6.cs

+namespace BenchmarksGame
+{
+    class RevCompSequence { public List<byte[]> Pages; public int StartHeader, EndExclusive; public Thread ReverseThread; }
+


Formatting looks strange here.

AndyAyersMS · 2017-09-14T23:31:49Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/regex-redux/harness-helpers.cs

+// See the LICENSE file in the project root for more information.
+
+// Helper functionality to locate inputs and find outputs for
+// k-nucleotide benchmark in CoreCLR test harness


Typo "For regex-redux"

AndyAyersMS · 2017-09-14T23:32:52Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/fasta/fasta-1.cs

+        }
+
+        static Frequency[] IUB = {
+    new Frequency ('a', 0.27)


Formatting looks odd here

AndyAyersMS · 2017-09-14T23:33:05Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/fasta/fasta-1.cs

+};
+
+        static Frequency[] HomoSapiens = {
+    new Frequency ('a', 0.3029549426680)


Formatting looks odd here

jorive · 2017-09-15T00:33:19Z

tests/src/JIT/Performance/CodeQuality/BenchmarksGame/reverse-complement/reverse-complement-6.cs

+
+namespace BenchmarksGame
+{
+    class RevCompSequence { public List<byte[]> Pages; public int StartHeader, EndExclusive; public Thread ReverseThread; }


In VS or VS Code, select all (Ctrl+A), then format the selection (Ctrl+K, Ctrl+F).

JosephTremoulet · 2017-09-15T15:27:17Z

Updated with formatting/typo updates and links to BenchmarksGame CVS repo w/ revision numbers.
Also added some diagnostic output to help determine why the OS X run is failing to find the input files.

JosephTremoulet · 2017-09-15T15:29:58Z

> Also have you looked at per-iteration times to see if they're reasonable?

Yes, I set InnerIterationCount for each benchmark to make the reported duration ~1000ms on my dev box.
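
As a rough illustration of that calibration, here is a minimal sketch of picking an inner iteration count so that one reported measurement lasts about 1000 ms. The class and method names (`IterationCalibration`, `InnerIterationsFor`, `MeasureOnceMs`) are hypothetical and not part of the test harness; the counts in this PR were set by hand on a dev box, not by code like this.

```csharp
using System;
using System.Diagnostics;

public static class IterationCalibration
{
    // Picks how many inner iterations fit in targetMs, given the
    // measured cost of a single iteration (hypothetical helper).
    public static int InnerIterationsFor(double singleIterationMs, double targetMs = 1000.0)
    {
        if (singleIterationMs <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(singleIterationMs));
        }

        // Always run at least once, even if one iteration exceeds the target.
        return Math.Max(1, (int)Math.Round(targetMs / singleIterationMs));
    }

    // Times one call of the workload, after a warm-up call so that
    // JIT compilation time is not counted in the measurement.
    public static double MeasureOnceMs(Action workload)
    {
        workload();
        var stopwatch = Stopwatch.StartNew();
        workload();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

For example, a benchmark whose single run takes about 100 ms would get an inner iteration count of 10 under this scheme.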

ViktorHofer · 2017-09-15T15:49:54Z

I didn't know that we store the benchmarks from the Benchmarks Game in coreclr. Is that to improve our internal benchmarks and/or to test possible improvements to later submit to http://benchmarksgame.alioth.debian.org/?

JosephTremoulet · 2017-09-15T15:56:09Z

@ViktorHofer it's so that our internal benchmarking will track them, and so we can use that to track the effect of compiler/library/runtime changes on them. They're not intended to diverge from the sources on the official site, except where needed to run in our benchmarking system. They might make a convenient place to test out possible source improvements, but we wouldn't check in such a change; instead, after it was submitted to the official site and became the new best variant, we'd pull that new variant down and retire the old one from our suite.

ViktorHofer · 2017-09-15T16:22:31Z

I see, that makes sense! I'm currently looking at those benchmarks and trying to find optimization paths, that's why I asked.

Is there a special reason why you number them e.g. the regex-redux with 1 and 5? why not just name them serial and parallel?

JosephTremoulet · 2017-09-15T16:33:48Z

> Is there a special reason why you number them e.g. the regex-redux with 1 and 5? why not just name them serial and parallel?

From the perspective of using these to track the performance impact of compiler/library/runtime changes, looking at historical trendlines it's most helpful if they represent fixed sources. If we went with the serial/parallel naming scheme, then when we update to a new fastest variant, there'd be a jump in the historical trendline to stumble over. So I went the route of using the numbers for the names and indicating serial/parallel in comments. Less importantly, this also meant not having to figure out what to name a benchmark when the fastest implementation is a serial one.

ViktorHofer · 2017-09-15T16:45:29Z

For historical trendlines it definitely makes sense. I was just thinking about keeping the names to reduce noise when the benchmarks change. But the historical argument is definitely stronger 👍

JosephTremoulet · 2017-09-15T19:37:21Z

OS X passing with latest update, so I'll kick off those perf tests again:

@dotnet-bot test Windows_NT x64 perf
@dotnet-bot test Windows_NT x86 perf
@dotnet-bot test linux perf flow

JosephTremoulet · 2017-09-15T20:43:08Z

The Windows_NT arm Cross Checked Build and Test failure is happening because I've removed the old tests but the test script is still trying to run them. The Tests.lst file that lists the old tests has a big comment at the top saying that it's auto-generated...

What do I need to do to re-generate the file?
Why doesn't it get re-generated as part of running the CI?
Failing that, why doesn't the comment telling me the file is auto-generated tell me how to re-generate it?

not sure to whom I should be directing these questions... @jashook?

BruceForstall · 2017-09-15T21:29:58Z

@JosephTremoulet It's auto-generated by @jashook, manually. You can (temporarily) change the tests.lst file to cause the removed tests not to run by changing the Categories to:

Categories=EXPECTED_FAIL

The next update to the Tests.lst files will need to include the new variants of these tests.

JosephTremoulet · 2017-09-16T02:50:56Z

Thanks, @BruceForstall. Updated per your suggestion.

@dotnet-bot test Windows_NT x64 perf
@dotnet-bot test Windows_NT x86 perf
@dotnet-bot test linux perf flow

JosephTremoulet · 2017-09-16T02:51:24Z

@AndyAyersMS, good with updates?

AndyAyersMS · 2017-09-16T04:39:57Z

Yep, thanks for the changes.

Change dotnet#13994 moved some tests that were excluded from Helix runs, but failed to update the exclusion list; fix that oversight and exclude the tests in their new locations. Fixes #14034.

For each benchmark, grab the current best C# .NET entry, and also grab the current best serial implementation (since these are easier to work with from the benchmarking perspective). (cherry picked from commit 4d9e8b5)

** Apply default VS formatting

Also insert namespace BenchmarksGame. (cherry picked from commit d0099ff)

** Modify benchmarks to run in perf test harness

- Add result validation
- Add [Benchmark] attributes and appropriate iteration counts
- Minor edits here and there to target .NET Standard 1.4
  - Exception: pi-digits rewritten to use managed BigInteger instead of p/invoke out to GMP.

(cherry picked from commit e055116)

** Remove old versions of BenchmarksGame benchmarks

(cherry picked from commit 9a8151f)

** Rename BenchmarksGame files

Name each variant after its index on the site, not its comparative status. (cherry picked from commit 087f903)

** Add references to source CVS

This will make it easier to track changes in the future. (cherry picked from commit be6743d)

** Manual formatting adjustments

Auto-formatting was leaving some new array expressions oddly indented. (cherry picked from commit 0dcaa77)

** Update BenchmarksGames README.txt

Reflecting recent updates to the snapshot of these tests. (cherry picked from commit 3bb67e9)

** Fix expected values in fannkuch-redux-5

The validation logic was testing against `chksum`, which actually can vary depending on the number of processors (as that is used to determine the number of threads across which the work is partitioned, and the checksum is sensitive to the bucketing). Change it to test against `maxflips` instead, which is stable. Fixes #14040. (cherry picked from commit 307188e)

** Update exclusions for moved tests

Change dotnet#13994 moved some tests that were excluded from Helix runs, but failed to update the exclusion list; fix that oversight and exclude the tests in their new locations. Fixes #14034. (cherry picked from commit 13df954)

** Reset static state per iteration for k-nucleotide-9 (dotnet#14081)

Otherwise iterations keep getting slower and slower. Also bump inner iteration count to 10 to restore the nominal one second duration per iteration. (cherry picked from commit fd1000c)
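
The k-nucleotide-9 note about static state can be illustrated with a minimal sketch. This is not the benchmark's code: `StaticStateSketch`, `s_pending`, and `RunIteration` are hypothetical names, and the list merely stands in for whatever the real benchmark keeps in static fields. The point is only that state left over from a previous iteration makes each later iteration process more data unless it is cleared first.

```csharp
using System;
using System.Collections.Generic;

public static class StaticStateSketch
{
    // Hypothetical shared state that survives across benchmark iterations.
    private static readonly List<int> s_pending = new List<int>();

    // Runs one iteration and returns how many items it had to process.
    // With resetFirst == false, leftovers from earlier iterations pile up.
    public static int RunIteration(int inputSize, bool resetFirst)
    {
        if (resetFirst)
        {
            // Start from a clean slate so every iteration does equal work.
            s_pending.Clear();
        }

        for (int i = 0; i < inputSize; i++)
        {
            s_pending.Add(i);
        }

        // The workload is proportional to the number of accumulated items.
        return s_pending.Count;
    }
}
```

Calling `RunIteration(1000, resetFirst: true)` repeatedly processes 1000 items each time, while passing `false` grows the workload by 1000 items per call, which is the "slower and slower" pattern the commit message describes.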

It was removed in dotnet#13994 when these benchmarks were updated, expecting a tests.lst autogeneration would follow. I'm adding them manually now to ensure we have test coverage for these. Fixes #15503

JosephTremoulet requested a review from AndyAyersMS September 14, 2017 20:08

dnfclas added the cla-already-signed label Sep 14, 2017

AndyAyersMS reviewed Sep 14, 2017

jorive reviewed Sep 15, 2017

danmoseley requested a review from ViktorHofer September 15, 2017 02:58

JosephTremoulet force-pushed the BenchmarksGame branch from 90eb3ff to a59062b Compare September 15, 2017 15:24

JosephTremoulet added 2 commits September 15, 2017 13:28

Update BenchmarksGame benchmarks to latest

4d9e8b5

For each benchmark, grab the current best C# .NET entry, and also grab the current best serial implementation (since these are easier to work with from the benchmarking perspective).

Apply default VS formatting

d0099ff

Also insert namespace BenchmarksGame.

JosephTremoulet force-pushed the BenchmarksGame branch from a59062b to 4dd072f Compare September 15, 2017 17:29

JosephTremoulet added 5 commits September 15, 2017 13:46

Modify benchmarks to run in perf test harness

e055116

- Add result validation
- Add [Benchmark] attributes and appropriate iteration counts
- Minor edits here and there to target .NET Standard 1.4
  - Exception: pi-digits rewritten to use managed BigInteger instead of p/invoke out to GMP.
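
To give a feel for what computing pi digits with the managed `System.Numerics.BigInteger` type looks like (as opposed to calling into GMP), here is a small self-contained sketch. It uses Gibbons' streaming spigot algorithm, which is an assumption made for illustration and is not the algorithm or code of the pi-digits benchmark in this PR; `PiDigitsSketch` and `FirstDigits` are hypothetical names.

```csharp
using System;
using System.Numerics;
using System.Text;

public static class PiDigitsSketch
{
    // Returns the first `count` decimal digits of pi, starting with the leading 3.
    public static string FirstDigits(int count)
    {
        BigInteger q = 1, r = 0, t = 1, k = 1, n = 3, l = 3;
        var digits = new StringBuilder(count);

        while (digits.Length < count)
        {
            if (4 * q + r - t < n * t)
            {
                // The next digit is settled: emit it and scale the state by 10.
                digits.Append(n.ToString());
                BigInteger nextR = 10 * (r - n * t);
                n = 10 * (3 * q + r) / t - 10 * n;
                q *= 10;
                r = nextR;
            }
            else
            {
                // Not enough precision yet: fold in the next term of the series.
                BigInteger nextR = (2 * q + r) * l;
                BigInteger nextN = (q * (7 * k + 2) + r * l) / (t * l);
                q *= k;
                t *= l;
                l += 2;
                k += 1;
                n = nextN;
                r = nextR;
            }
        }

        return digits.ToString();
    }
}
```

All arithmetic goes through managed `BigInteger` operators, so nothing here depends on a native library being present on the test machine.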

Remove old versions of BenchmarksGame benchmarks

9a8151f

Rename BenchmarksGame files

087f903

Name each variant after its index on the site, not its comparative status.

Add references to source CVS

be6743d

This will make it easier to track changes in the future.

Manual formatting adjustments

0dcaa77

Auto-formatting was leaving some new array expressions oddly indented.

JosephTremoulet force-pushed the BenchmarksGame branch from 4dd072f to 0dcaa77 Compare September 15, 2017 17:47

Mark removed tests EXPECTED_FAIL

07e93a1

The next update to the Tests.lst files will need to include the new variants of these tests.

AndyAyersMS approved these changes Sep 16, 2017

JosephTremoulet merged commit cd6b92a into dotnet:master Sep 16, 2017

JosephTremoulet deleted the BenchmarksGame branch September 16, 2017 12:39

JosephTremoulet mentioned this pull request Sep 19, 2017

Update BenchmarksGames README.txt #14061

Merged

JosephTremoulet mentioned this pull request Sep 19, 2017

Update exclusions for moved tests #14068

Merged

BruceForstall mentioned this pull request Dec 19, 2017

Manually add arm/arm64 BenchmarksGame testing #15574

Merged

sdmaclea mentioned this pull request Jan 31, 2020

[Arm64/Unix] BenchmarksGame fannkuch-redux failing dotnet/runtime#8961

Closed

This was referenced Jan 31, 2020

Tests.lst files need to be regenerated dotnet/runtime#9257

Closed

[RyuJIT/arm32/arm64] Add current BenchmarksGame benchmarks to tests.lst files dotnet/runtime#9430

Closed
