Wrapping-up benchmarks #1093
Merged
Conversation
CLI
Defining benchmarks
Running benchmarks
bytearray_length.json
bytearray_comparison.json
```json
{
  "benchmarks": [
    {
      "name": "bytearray_comparison",
      "module": "tests",
      "measures": [
        { "size": 0, "memory": 3293, "cpu": 823816 },
        { "size": 1, "memory": 3293, "cpu": 824386 },
        { "size": 2, "memory": 3293, "cpu": 824994 },
        { "size": 3, "memory": 3293, "cpu": 825602 },
        { "size": 4, "memory": 3293, "cpu": 826210 },
        { "size": 5, "memory": 3293, "cpu": 826818 },
        { "size": 6, "memory": 3293, "cpu": 827426 },
        { "size": 7, "memory": 3293, "cpu": 828034 },
        { "size": 8, "memory": 3293, "cpu": 828642 },
        { "size": 9, "memory": 3293, "cpu": 829250 },
        { "size": 10, "memory": 3293, "cpu": 829858 },
        { "size": 11, "memory": 3293, "cpu": 830466 },
        { "size": 12, "memory": 3293, "cpu": 831074 },
        { "size": 13, "memory": 3293, "cpu": 831682 },
        { "size": 14, "memory": 3293, "cpu": 832290 },
        { "size": 15, "memory": 3293, "cpu": 832898 },
        { "size": 16, "memory": 3293, "cpu": 833506 },
        { "size": 17, "memory": 3293, "cpu": 834114 },
        { "size": 18, "memory": 3293, "cpu": 834722 },
        { "size": 19, "memory": 3293, "cpu": 835330 },
        { "size": 20, "memory": 3293, "cpu": 835938 },
        { "size": 21, "memory": 3293, "cpu": 836546 },
        { "size": 22, "memory": 3293, "cpu": 837154 },
        { "size": 23, "memory": 3293, "cpu": 837762 },
        { "size": 24, "memory": 3293, "cpu": 838370 },
        { "size": 25, "memory": 3293, "cpu": 838978 },
        { "size": 26, "memory": 3293, "cpu": 839586 },
        { "size": 27, "memory": 3293, "cpu": 840194 },
        { "size": 28, "memory": 3293, "cpu": 840802 },
        { "size": 29, "memory": 3293, "cpu": 841410 },
        { "size": 30, "memory": 3293, "cpu": 842018 },
        { "size": 31, "memory": 3293, "cpu": 842626 },
        { "size": 32, "memory": 3293, "cpu": 843234 },
        { "size": 33, "memory": 3293, "cpu": 843842 },
        { "size": 34, "memory": 3293, "cpu": 844450 },
        { "size": 35, "memory": 3293, "cpu": 845058 },
        { "size": 36, "memory": 3293, "cpu": 845666 },
        { "size": 37, "memory": 3293, "cpu": 846274 },
        { "size": 38, "memory": 3293, "cpu": 846882 },
        { "size": 39, "memory": 3293, "cpu": 847490 },
        { "size": 40, "memory": 3293, "cpu": 848098 },
        { "size": 41, "memory": 3293, "cpu": 848706 },
        { "size": 42, "memory": 3293, "cpu": 849314 },
        { "size": 43, "memory": 3293, "cpu": 849922 },
        { "size": 44, "memory": 3293, "cpu": 850530 },
        { "size": 45, "memory": 3293, "cpu": 851138 },
        { "size": 46, "memory": 3293, "cpu": 851746 },
        { "size": 47, "memory": 3293, "cpu": 852354 },
        { "size": 48, "memory": 3293, "cpu": 852962 },
        { "size": 49, "memory": 3293, "cpu": 853570 },
        { "size": 50, "memory": 3293, "cpu": 854178 }
      ]
    }
  ],
  "seed": 3601959169
}
```
Error cases
Changelog
📍 remove unnecessary intermediate variables
Introduced in some previous commits, so basically reverting that.
📍 remove duplicate entry in CHANGELOG
Likely due to a wrong merge-conflict resolution.
📍 actually fail if a (seeded) sampler returns None
This is not supposed to happen, as only replayed sampler/fuzzer can stop.
📍 minor aesthetic changes in test framework.
📍 refactor and fix benchmark type-checking
Fixes:
- Do not allow bench with no arguments; this otherwise causes a compiler panic down the line.
- Do not force the return value to be a boolean or void. We do not actually control what's returned by a benchmark, so anything really works here.
Refactor:
- Re-use code between test and bench type-checking, especially the bits related to gathering information about the via arguments. There's quite a lot, and simply copy-pasting everything would likely cause issues and discrepancies at the first change.
📍 Add additional test to check for Sampler alias formatting.
📍 fixup aesthetics
📍 more aesthetic changes.
In particular, using a concrete enum instead of a string to avoid an unnecessary incomplete pattern-match, and remove superfluous comments.
📍 fuse together bench & test runners, and collect all bench measures.
This commit removes some duplication between bench and test runners, and fixes the results coming out of running benchmarks.
Running benchmarks is expected to yield multiple measures, one for each iteration. For now, it'll suffice to show results for each size; but eventually, we'll possibly try to interpolate results with different curves and pick the best candidate.
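The curve interpolation mentioned here is left as future work in the PR. As an illustration only (not Aiken code; the names `polyfit` and `best_fit` and the candidate set are hypothetical), one way to "pick the best candidate" is to least-squares-fit a few polynomial models to the (size, cpu) points and keep the lowest-degree curve whose error is near-minimal:

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = b[row] - sum(A[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = s / A[row][row]
    return coeffs

def best_fit(measures):
    """Fit constant/linear/quadratic curves to (size, cpu) points and
    return the lowest-degree curve whose error is close to the best."""
    xs = [m["size"] for m in measures]
    ys = [m["cpu"] for m in measures]
    fits = {}
    for name, degree in (("constant", 0), ("linear", 1), ("quadratic", 2)):
        coeffs = polyfit(xs, ys, degree)
        sse = sum(
            (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
            for x, y in zip(xs, ys)
        )
        fits[name] = (sse, coeffs)
    best_sse = min(sse for sse, _ in fits.values())
    # Prefer simpler models: first curve within 1% of the best error wins.
    for name in ("constant", "linear", "quadratic"):
        sse, coeffs = fits[name]
        if sse <= best_sse * 1.01 + 1e-9:
            return name, coeffs
```

Preferring the lowest adequate degree avoids always selecting the quadratic, which can never fit worse than the linear model on the same data.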
📍 rework sizing of benchmarks, taking measures at different points
The idea is to get a good sample of measures from running benchmarks at various sizes, so one can get an idea of how well a function performs as the size grows.
Given that size can be made arbitrarily large, and that we currently report all benchmarks, I installed a Fibonacci heuristic to gather data points from 0 to the max size using an increasing stepping.
Defined as a trait, as I already anticipate we might need different sizing strategies, likely driven by the user via a command-line option; but for now, this will do.
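The actual heuristic lives in the Rust codebase; as a rough sketch of the idea only (a guess at the "increasing stepping", not the real implementation), one reading is that the gap between consecutive measured sizes follows the Fibonacci sequence:

```python
def fibonacci_sizes(max_size):
    """Yield measurement sizes from 0 up to max_size, with gaps that
    grow along the Fibonacci sequence (1, 1, 2, 3, 5, ...)."""
    size, step, next_step = 0, 1, 1
    while size <= max_size:
        yield size
        size += step
        step, next_step = next_step, step + next_step

list(fibonacci_sizes(50))  # -> [0, 1, 2, 4, 7, 12, 20, 33]
```

This keeps the number of measurements logarithmic-ish in the maximum size, while still sampling densely near zero where behaviour often changes fastest.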
📍 remove unnecessary intermediate variables
Introduced in some previous commits, so basically reverting that.
📍 rework benchmarks output
Going for a terminal plot, for now, as this was the original idea and it is immediately visual. All benchmark points can also be obtained as JSON when redirecting the output, like for tests. So all-in-all, we provide a flexible output which should be useful. Whether it is the best we can do, time (and people/users) will tell.
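Since the JSON report exposes per-size measures, the redirected output lends itself to post-processing. A minimal sketch in Python, assuming only the schema from the example above (the `summarise` helper and its summary fields are made up for illustration):

```python
import json

def summarise(report: str):
    """Summarise each benchmark in a JSON report: number of data
    points, average CPU cost per unit of size, and peak memory."""
    data = json.loads(report)
    out = {}
    for bench in data["benchmarks"]:
        measures = sorted(bench["measures"], key=lambda m: m["size"])
        first, last = measures[0], measures[-1]
        span = last["size"] - first["size"] or 1  # avoid division by zero
        out[f'{bench["module"]}.{bench["name"]}'] = {
            "points": len(measures),
            "cpu_per_unit": (last["cpu"] - first["cpu"]) / span,
            "max_memory": max(m["memory"] for m in measures),
        }
    return out
```

Applied to the bytearray_comparison example above, this would report a flat CPU growth of 608 units per byte of input, consistent with a linear comparison.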
📍 fix benchmark output when either the sampler or bench fails
This is likely even better than what was done for property testing. We shall revise that one perhaps one day.
📍 Update CHANGELOG w.r.t benchmarks