
Improve reproducibility and consistency of benchmarking workflow #253

Open · wants to merge 2 commits into main
Conversation


@bachase bachase commented Feb 24, 2025

These changes will

  1. Reduce parallelism in the benchmarking runs to minimize contention that might otherwise affect the results. This includes restricting Qiskit-AER simulations to a maximum of 1 parallel thread, running the expected value benchmarks in parallel but separately from the timing benchmarks, and forcing just 1 parallel job in the benchmark script itself.

  2. Use the locked version of dependencies in the benchmarking runs to avoid an uncontrolled change in a dependent library or transitive dependency.

  3. Ensure only one instance of the benchmark GitHub workflow runs at a time by specifying a concurrency group.

This partially addresses #251
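
The locked-dependency and concurrency-group changes could look roughly like the following workflow fragment. This is a hedged sketch, not the actual ucc-benchmarks.yml: the job name, concurrency group name, and the choice of poetry as the lockfile tool are all assumptions.

```yaml
# Hypothetical fragment of a benchmark workflow; names are illustrative.
concurrency:
  # Only one benchmark run at a time; a queued run waits rather than overlaps.
  group: ucc-benchmarks
  cancel-in-progress: false

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install locked dependencies
        # Installing from the committed lockfile pins transitive dependencies,
        # so results are not perturbed by an uncontrolled upstream release.
        run: poetry install --sync
      - name: Run benchmarks
        run: ./run_benchmarks.sh
```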

Member
@natestemen natestemen left a comment


Makes sense to me. Do you know by how much time this increases the runtime of run_benchmarks.sh?


bachase commented Feb 25, 2025

> Makes sense to me. Do you know by how much time this increases the runtime of run_benchmarks.sh?

Looking at https://github.com/unitaryfund/ucc/actions/workflows/ucc-benchmarks.yml?query=branch%3Abc-test-AER-1-thread, from about 10 minutes to about 30 minutes.

Per some more recent discussion in #251, we could

  1. Retain parallelism for the expected value benchmark runs, since those do not measure timing
  2. Retain parallelism for all benchmark runs, but schedule the expected value ones separately from the timing ones.

Option 2 would keep the overall runtime lowest, but option 1 is more conservative until we have a stronger handle on how the parallel jobs are scheduled on the runner. Given the historical benchmark runs, though, this might be overly conservative, as results were fairly steady for a while.
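
Option 2 can be sketched with Python's standard library: the expected value benchmarks fan out across workers, while the timing benchmarks stay strictly serial. The benchmark names and the runner function below are hypothetical stand-ins, not the actual benchmark script.

```python
from concurrent.futures import ThreadPoolExecutor

def run_benchmark(name: str) -> str:
    """Placeholder for invoking one benchmark; hypothetical stand-in."""
    return f"ran {name}"

expected_value = ["ev_qft", "ev_qaoa"]   # correctness checks, not timing-sensitive
timing = ["time_qft", "time_qaoa"]       # timing-sensitive

# Expected value benchmarks do not measure wall-clock time, so contention
# between parallel workers cannot skew their results.
with ThreadPoolExecutor(max_workers=4) as pool:
    ev_results = list(pool.map(run_benchmark, expected_value))

# Timing benchmarks run one at a time to avoid contention.
timing_results = [run_benchmark(name) for name in timing]

print(ev_results + timing_results)
```

Note that `pool.map` preserves input order, so results line up with the benchmark list even though execution order is nondeterministic.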

@bachase bachase force-pushed the 251-improve-benchmarks branch from 3a57b16 to 10b8bf5 Compare February 25, 2025 20:28

bachase commented Feb 25, 2025

Updated this to be Option 2, per latest discussion with @jordandsullivan in #251.


bachase commented Feb 26, 2025

Note the commit with benchmark results showing that this addresses the recent increase in PyTKET (thanks @jordandsullivan for kicking them off).

[Image: benchmark results]
