Combine recent data files w/ incomplete benchmarks #207

Merged
Misty-W merged 2 commits into main from 206-unexpected-spikes-in-benchmark-plots
Feb 5, 2025

Conversation

Misty-W
Collaborator

@Misty-W Misty-W commented Feb 4, 2025

Fixes #206.

After removing the two problematic data files from yesterday and earlier today (the second a re-run of yesterday's benchmarks), I recovered the desired plot, which includes the latest results from today.

The plot was obtained from running the plotting script locally, which pulls existing results (including the latest run today on main), and we can see that the data is virtually unchanged, as expected.

Only the results from earlier today (re-run of yesterday's incomplete benchmark run) are currently visible in the plot over time on main because the plotting script plot_avg_benchmarks_over_time.py looks for unique dates in the dataframe, and pandas.core.groupby.SeriesGroupBy.unique returns the first unique value for each group.
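For reference, here is a small sketch of the pandas behaviour involved. The column names and rows are invented for illustration and are not taken from plot_avg_benchmarks_over_time.py. Strictly, `SeriesGroupBy.unique` returns an array of every unique value per group, so "the first unique value" would come from a later step such as `.first()` or taking element `[0]`; which of these the script does is an assumption here.

```python
import pandas as pd

# Hypothetical data: two result files share the calendar date 2025-02-04.
df = pd.DataFrame(
    {
        "date": ["2025-02-04", "2025-02-04", "2025-02-05"],
        "source_file": [
            "gates_2025-02-04_00.csv",
            "gates_2025-02-04_15.csv",
            "gates_2025-02-05_12.csv",
        ],
    }
)

# unique() keeps every distinct value in each group, as an array per group.
unique_per_date = df.groupby("date")["source_file"].unique()

# first() keeps only the first row of each group, so a later file written
# on the same date would be silently dropped from a per-date plot.
first_per_date = df.groupby("date")["source_file"].first()
```

If the script keeps one entry per date, the earliest file for that date is the one that would end up on the plot, which matches what is described above.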

I haven't determined what caused the earlier runs to give strange or incomplete results, but our latest data looks good, so it has "fixed itself" for the moment. I'll still investigate to see if recurrence is likely / easily preventable.

image

@Misty-W Misty-W linked an issue Feb 4, 2025 that may be closed by this pull request
@Misty-W Misty-W marked this pull request as ready for review February 4, 2025 20:43
@bachase
Collaborator

bachase commented Feb 4, 2025

My understanding so far is that a GitHub Actions workflow runs regularly to update the plots. Looking at the Actions history on GitHub, I see several runs over time. Is there an artifact or label on the files that shows which GitHub Actions run was used to generate the latest plots shown in the README? I'm wondering if this would help identify which run might have led to the issue.

@Misty-W
Collaborator Author

Misty-W commented Feb 4, 2025

> Is there an artifact or label of files that shows which of the GitHub Actions runs was used to generate the latest plots shown in the README? I'm wondering if this would help identify which GH action run might have led to the issue.

Not a label of which GH action run, but we can match the timestamps on the files with the timestamps of the actions. The "run-benchmarks" action is run upon merging a PR, and then there is a commit titled "Update benchmark results [benchmark chore]" and a set of actions run on that (skipping benchmarks to avoid an infinite loop).

@Misty-W Misty-W requested a review from natestemen February 5, 2025 12:44
@Misty-W
Collaborator Author

Misty-W commented Feb 5, 2025

Some additional info:

benchmarks/results/gates_[yyyy-mm-dd_hh].csv should have 24 entries of gate counts, but the files deleted by this PR, benchmarks/results/gates_2025-02-03_23.csv and benchmarks/results/gates_2025-02-04_00.csv, have 10 and 14 respectively.

Edit: as explained in #206 (comment), benchmarks/results/gates_2025-02-03_23.csv and benchmarks/results/gates_2025-02-04_00.csv belong to the same run; the hour and day simply changed while it was running. The second attempt failed and wasn't committed to main, so it never showed up on the plot.
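A check like the following could flag such partial files automatically. This is only a sketch: the function name `find_incomplete_files` is hypothetical, and it assumes each entry is one CSV row below a single header line, which is not confirmed by anything in this thread.

```python
from pathlib import Path

import pandas as pd

EXPECTED_ENTRIES = 24  # expected gate-count entries per file, per the comment above


def find_incomplete_files(results_dir, expected=EXPECTED_ENTRIES):
    """Return {filename: row_count} for gates_*.csv files with an unexpected row count.

    Hypothetical helper; assumes one entry per CSV row and one header line.
    """
    incomplete = {}
    for path in sorted(Path(results_dir).glob("gates_*.csv")):
        n_rows = len(pd.read_csv(path))
        if n_rows != expected:
            incomplete[path.name] = n_rows
    return incomplete
```

Called as `find_incomplete_files("benchmarks/results")`, this would list any gates file that does not contain exactly 24 rows.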

The logs of benchmarking attempt 1, corresponding to benchmarks/results/gates_2025-02-03_23.csv, show all circuits run with expected gate counts, but the results of some circuits are shown in a different order than expected.

The logs of benchmarking attempt 2, corresponding to benchmarks/results/gates_2025-02-04_00.csv, show similar behavior: all circuits run with expected gate counts, but the results of some circuits are shown in a different order.

@Misty-W
Collaborator Author

Misty-W commented Feb 5, 2025

This PR is a containment to fix our README and clean up our results folder.

I propose addressing the underlying issue in a separate PR, so that exactly one result file is saved per run for each metric.

Member

@natestemen natestemen left a comment

> I believe the root cause of this issue is that run-benchmarks attempt #1 on commit c135493 began close to midnight UTC, so some data were saved on Feb 3 at 23h and some on Feb 4 at 00h, resulting in separate files and separate points on the plot over time.

hahah that's a pretty funny corner case. Would it be better to merge the two csv files then as opposed to deleting them?

@Misty-W Misty-W changed the title Remove recent data files w/ incomplete benchmarks Combine recent data files w/ incomplete benchmarks Feb 5, 2025
@Misty-W Misty-W merged commit ff7d4da into main Feb 5, 2025
1 check passed
@Misty-W Misty-W deleted the 206-unexpected-spikes-in-benchmark-plots branch February 5, 2025 18:33
@Misty-W
Collaborator Author

Misty-W commented Feb 5, 2025

> hahah that's a pretty funny corner case. Would it be better to merge the two csv files then as opposed to deleting them?

Good idea. I combined the files and updated the title of the PR before merging.
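For anyone repeating this by hand, combining two partial files from one run can be sketched with pandas as below. The helper name `combine_result_files` is hypothetical, it assumes both files share the same header, and it is not necessarily how the files in this PR were combined.

```python
import pandas as pd


def combine_result_files(paths, output_path):
    """Concatenate partial result CSVs from a single run into one file.

    Hypothetical helper; assumes every input file has the same columns.
    """
    frames = [pd.read_csv(p) for p in paths]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv(output_path, index=False)
    return combined
```

With the 10-row and 14-row files described earlier, the combined file would hold the expected 24 entries.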

@bachase
Collaborator

bachase commented Feb 5, 2025

Is there a (small) risk this could happen again? Wondering if it's worth a follow-up ticket in the future to refactor and avoid a recurrence.

@natestemen
Member

Yes definitely. I'd think that each time the script is kicked off, the date should be set, and then not "checked" again.
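A minimal sketch of that idea, assuming the yyyy-mm-dd_hh filename pattern mentioned earlier in this thread and UTC timestamps. The function names and the second metric name are hypothetical and not taken from the benchmark scripts.

```python
from datetime import datetime, timezone


def make_run_label(now=None):
    """Capture the run timestamp once, formatted as yyyy-mm-dd_hh (UTC assumed)."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%d_%H")


def result_path(metric, run_label):
    """Build a result filename from a label fixed at the start of the run."""
    return f"benchmarks/results/{metric}_{run_label}.csv"


# Set the label once when the script starts, then reuse it for every file,
# even if the wall clock rolls past midnight while benchmarks are running.
run_label = make_run_label(datetime(2025, 2, 3, 23, 59, tzinfo=timezone.utc))
paths = [result_path(metric, run_label) for metric in ("gates", "other_metric")]
```

Because every file in a run shares one label, a run that starts at 23:59 and finishes after midnight still writes to a single file per metric.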

@Misty-W
Collaborator Author

Misty-W commented Feb 6, 2025

Yes, I opened #213 to fix the multiple timestamp issue.

Successfully merging this pull request may close these issues.

Unexpected spikes in benchmark plots