Combine recent data files w/ incomplete benchmarks #207

Misty-W · 2025-02-04T19:52:20Z

Fixes #206.

After removing the two problematic data files from yesterday and earlier today (the second a re-run of yesterday's benchmarks), I recovered the desired plot, which includes the latest results from today.

The plot was obtained by running the plotting script locally, which pulls existing results (including the latest run today on main). The data is virtually unchanged, as expected.

Only the results from earlier today (re-run of yesterday's incomplete benchmark run) are currently visible in the plot over time on main because the plotting script plot_avg_benchmarks_over_time.py looks for unique dates in the dataframe, and pandas.core.groupby.SeriesGroupBy.unique returns the first unique value for each group.

I haven't determined what caused the earlier runs to give strange or incomplete results, but our latest data looks good, so it has "fixed itself" for the moment. I'll still investigate to see if recurrence is likely / easily preventable.

bachase · 2025-02-04T20:56:38Z

My understanding so far is that a GitHub Actions workflow runs regularly to update the plots. Looking at the Actions history on GitHub, I see several runs over time. Is there an artifact or label on the files that shows which of the GitHub Actions runs was used to generate the latest plots shown in the README? I'm wondering if this would help identify which run might have led to the issue.

Misty-W · 2025-02-04T21:19:57Z

> Is there an artifact or label of files that shows which of the Github action runs was used to generate the latest plots shown in the readme? I'm wondering if this would help identify which GH action run might have led to the issue.

Not a label of which GH action run, but we can match the timestamps on the files with the timestamps of the actions. The "run-benchmarks" action is run upon merging a PR, and then there is a commit titled "Update benchmark results [benchmark chore]" and a set of actions run on that (skipping benchmarks to avoid an infinite loop).

Misty-W · 2025-02-05T16:51:59Z

Some additional info:

benchmarks/results/gates_[yyyy-mm-dd_hh].csv should have 24 entries of gate counts, but the files deleted by this PR, benchmarks/results/gates_2025-02-03_23.csv and benchmarks/results/gates_2025-02-04_00.csv, have 10 and 14 entries, respectively.
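A check like the one described above could be scripted. The following is only a sketch, not code from this repository: it assumes each results file has a single header row followed by one row per entry, and the helper names `count_entries` and `find_incomplete` are hypothetical.

```python
import csv
from pathlib import Path

# Expected number of entries per gates file, taken from the comment above.
EXPECTED_ENTRIES = 24


def count_entries(csv_path: Path) -> int:
    """Count data rows in a results file, assuming exactly one header row."""
    with csv_path.open(newline="") as f:
        rows = list(csv.reader(f))
    return max(len(rows) - 1, 0)


def find_incomplete(results_dir: Path, expected: int = EXPECTED_ENTRIES) -> dict:
    """Return {filename: row_count} for gates_*.csv files without `expected` rows."""
    incomplete = {}
    for path in sorted(results_dir.glob("gates_*.csv")):
        n = count_entries(path)
        if n != expected:
            incomplete[path.name] = n
    return incomplete
```

Calling `find_incomplete(Path("benchmarks/results"))` would then list any file whose row count differs from 24, which is how the two split files stand out.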

Edit: as explained in #206 (comment), benchmarks/results/gates_2025-02-03_23.csv and benchmarks/results/gates_2025-02-04_00.csv belong to the same run, just the hour and day changed during the run. The second attempt failed and wasn't committed to main, so it never showed up on the plot.

The logs of benchmarking attempt 1, corresponding to benchmarks/results/gates_2025-02-03_23.csv, show all circuits run with expected gate counts, but the results of some circuits are shown in a different order than expected.

The logs of benchmarking attempt 2, corresponding to benchmarks/results/gates_2025-02-04_00.csv, show similar behavior: all circuits run with expected gate counts, but the results of some circuits are shown in a different order.

Misty-W · 2025-02-05T17:40:55Z

This PR is a containment to fix our README and clean up our results folder.

I propose to solve the underlying issue in a separate PR, by making sure that exactly one result file is saved per run for each metric.

natestemen

> I believe the root cause of this issue is that run-benchmarks attempt # 1 on commit c135493 began close to midnight UTC, so some data were saved on Feb 3 at 23h and some on Feb 4 at 00h, resulting in separate files and separate points on the plot over time.

hahah that's a pretty funny corner case. Would it be better to merge the two csv files then as opposed to deleting them?

Misty-W · 2025-02-05T18:36:25Z

> hahah that's a pretty funny corner case. Would it be better to merge the two csv files then as opposed to deleting them?

Good idea. I combined the files and updated the title of the PR before merging.
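For reference, combining two partial result files can be sketched as below. This is an illustration rather than the exact steps used in this PR: it assumes both files share the same header row, and `combine_csv_files` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def combine_csv_files(parts: list, combined: Path) -> int:
    """Concatenate CSV files that share one header; return the data rows written."""
    header = None
    rows = []
    for part in parts:
        with Path(part).open(newline="") as f:
            reader = csv.reader(f)
            part_header = next(reader, None)
            if part_header is None:
                continue  # skip empty files
            if header is None:
                header = part_header
            elif part_header != header:
                raise ValueError(f"{part} has a header that does not match earlier files")
            rows.extend(reader)
    if header is None:
        raise ValueError("no non-empty input files")
    # All inputs are fully read before writing, so `combined` may reuse an input path.
    with combined.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

With the two files from this PR (10 and 14 entries), the combined file would hold the expected 24 entries, so the run shows up as a single point on the plot over time.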

bachase · 2025-02-05T19:04:55Z

Is there a (small) risk this could happen again? Wondering if it's worth a follow-up ticket in the future to refactor and avoid a recurrence.

natestemen · 2025-02-05T19:08:51Z

Yes definitely. I'd think that each time the script is kicked off, the date should be set, and then not "checked" again.
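A minimal sketch of that idea, assuming the filename stamp follows the `yyyy-mm-dd_hh` pattern seen in the results folder and is taken in UTC. The function names `make_run_stamp` and `result_filename` are hypothetical and not from the benchmark scripts.

```python
from datetime import datetime, timezone
from typing import Optional


def make_run_stamp(now: Optional[datetime] = None) -> str:
    """Return a yyyy-mm-dd_hh stamp in UTC; call this once when the run starts."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%d_%H")


def result_filename(metric: str, run_stamp: str) -> str:
    """Build a results path from the stamp fixed at startup, never from the current time."""
    return f"benchmarks/results/{metric}_{run_stamp}.csv"


# Fix the stamp once, then reuse it for every write in the same run.
RUN_STAMP = make_run_stamp()
gates_path = result_filename("gates", RUN_STAMP)
```

Because every write reuses `RUN_STAMP`, a run that starts at 23:59 UTC and finishes after midnight still writes to a single file per metric.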

Misty-W · 2025-02-06T00:07:12Z

Yes, I opened #213 to fix the multiple timestamp issue.

remove erroneous files

0d84e2f

Misty-W linked an issue Feb 4, 2025 that may be closed by this pull request

Unexpected spikes in benchmark plots #206

Closed

Misty-W marked this pull request as ready for review February 4, 2025 20:43

Misty-W requested a review from natestemen February 5, 2025 12:44

natestemen approved these changes Feb 5, 2025

restore mistakenly split files into combined file

2a025d2

Misty-W changed the title ~~Remove recent data files w/ incomplete benchmarks~~ Combine recent data files w/ incomplete benchmarks Feb 5, 2025

Misty-W merged commit ff7d4da into main Feb 5, 2025
1 check passed

Misty-W deleted the 206-unexpected-spikes-in-benchmark-plots branch February 5, 2025 18:33
