Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify time spent in data generation #3605

Merged
merged 3 commits into from
Apr 2, 2023

Conversation

andreareina
Copy link

@andreareina andreareina commented Apr 1, 2023

A simple mean of the percentage of time taken in data generation can be misleading: "1-100 ms, ~50% in data generation" can mean:

  • generation always takes 1ms, rest of the test takes 0-99 ms
  • generation takes 1-50 ms, rest of the test takes 1-50 ms

Specifying a range the same way as the runtime prevents confusing these cases.

Fixes #3598.

A simple mean of the percentage of time taken in data generation can
be misleading: "1-100 ms, ~50% in data generation" can mean:
- generation always takes 1ms, rest of the test takes 0-99 ms
- generation takes 1-50 ms, rest of the test takes 1-50 ms

Specifying a range the same way as the runtime prvents confusing these cases.
@andreareina
Copy link
Author

See issue #3598

@andreareina
Copy link
Author

Tests are failing in https://github.com/andreareina/hypothesis/blob/data-generation-runtime/hypothesis-python/tests/cover/test_statistical_events.py#L163-L168. The "0%" case maps cleanly onto "< 1ms", but the "50%" and "100%" cases aren't as clean. On my machine the generation consistently takes 49ms, but I could imagine that other machines might take 50, 51ms or even a range.

@andreareina
Copy link
Author

Writing tests for format_ms(), I've got it for the "< 1ms" and "~ Xms" cases, I'm a bit unsure about the "~ x-y ms" case without sorting the inputs and getting the 5% and 95% percentiles, i.e. exactly duplicating the logic of the function under test.

@andreareina andreareina force-pushed the data-generation-runtime branch 2 times, most recently from e5cbb6f to 3c0ec72 Compare April 2, 2023 16:50
Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fantastic! Thanks so much Andrea 🤩

@Zac-HD Zac-HD merged commit 8f1f79e into HypothesisWorks:master Apr 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

timings in statistics misleading wrt time spent generating data
2 participants