Change plot resolution to per release in plot over time scripts #254

Open · wants to merge 3 commits into main
Conversation

Misty-W (Collaborator) commented Feb 24, 2025

Fixes #200.

Change plot resolution to per compiler release in plot over time scripts, so the plots will look like the following:
[Two example plots pasted as images, showing per-release resolution.]

Note: I pasted (instead of pushed) these plots because they'd have to be removed before merge anyway.

Change plot res to per release in plot over time scripts
Misty-W linked an issue Feb 24, 2025 that may be closed by this pull request
bachase (Collaborator) commented Feb 25, 2025

For the absolute error plot, does each dot still represent a new version of that software package? If so, would we want it to have the label as well?

bachase (Collaborator) left a comment


My main open questions are in the inline comments on the logic that selects and filters which benchmark results end up included in the average for each earliest version release.

But stepping back, I think this may still be fragile or at least challenging given how the benchmark results are stored by the date of the run currently. What if some days have more benchmark runs than others? If so, will that be a fair comparison if some days average over many runs and others only a single run?

As I (poorly) started to sketch out in #251, I think we should consider switching to associate benchmark results with the git hash (and corresponding git commit time) to ensure a singular mapping from set of compiler versions (as reflected in the lock file in the commit) to benchmark results. Then we can more cleanly pick out results when versions change and plot accordingly.

That is clearly a larger scope than this change, but I suspect getting the plot we want will be hard using the current way we run and store benchmarks. Working around it might be possible, but the code will get more complicated and arguably worth testing to be sure all the cases are handled.
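
(For illustration only, not part of the PR: a minimal sketch of how results could be keyed by commit instead of run date, per the suggestion above. The helper names are hypothetical, and it assumes benchmarks run from a git checkout.)

    import subprocess

    def current_commit_hash() -> str:
        # Hash of the checked-out commit (hypothetical helper; assumes git)
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()

    def current_commit_time() -> str:
        # Committer timestamp of HEAD in strict ISO 8601 (--format=%cI)
        return subprocess.check_output(
            ["git", "show", "-s", "--format=%cI", "HEAD"], text=True
        ).strip()

Each benchmark row would then carry the commit hash and commit time rather than the wall-clock run date, giving a single mapping from the lock file's compiler versions at that commit to the results.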


fig, ax = plt.subplots(2, 1, figsize=(8, 8), sharex=False, dpi=150)
# Rotate x labels on axes 0
plt.setp(ax[0].xaxis.get_majorticklabels(), rotation=45)


filtered_avg_compiled_ratio = avg_compiled_ratio[
bachase (Collaborator):

How are we thinking about which dates get included in the average computed earlier and in this filter?

avg_compiled_ratio is calculated earlier as:

avg_compiled_ratio = (
    df_dates.groupby(["compiler", "date", "compiler_version"])[
        "compiled_ratio"
    ]
    .mean()
    .reset_index()
    .sort_values("date")
)

If we happened to have run multiple benchmarks for that compiler on that date, where say some were for version 1.2.0 and then some for version 1.3.0, would this filter be showing results that average over both versions?

Would we also need to filter to only include/average results from that date where the latest compiler version was used?
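
(Editorial sketch, not from the PR: one way such a filter could look, keeping only rows from the latest version seen per compiler per date. It assumes the version strings parse cleanly with packaging's Version, which sorts numerically, e.g. 1.40.0 after 1.5.0.)

    from packaging.version import Version

    # Parse so that versions sort numerically rather than lexically
    df = df_dates.assign(parsed=df_dates["compiler_version"].map(Version))
    # Per (compiler, date) group, find the maximum (latest) version
    latest = df.groupby(["compiler", "date"])["parsed"].transform("max")
    # Keep only rows whose version is the latest for that compiler and date
    df_latest = df[df["parsed"] == latest].drop(columns="parsed")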

jordandsullivan (Collaborator) replied:

Because this groups by the date, compiler name, and compiler version, it would not average two different versions together. On the plot I think it would just show them as different datapoints in the same color on the same date and label them correspondingly.

Misty-W (Author) replied:

That's correct, @jordandsullivan.

bachase (Collaborator) replied:

Got it. Thank you both.

If we had multiple datapoints, would we want to only plot the "latest version" dot for that date? Just wondering if it might get cluttered otherwise.

unique_versions = sorted(df_dates["compiler_version"].unique())
dates = []
for version in unique_versions:
    df_versions = df_dates[df_dates["compiler_version"] == version]
bachase (Collaborator):

Is this compiler_version only the semver string? If so, is it possible that this might mix the same version of different packages? For example cirq 1.2.0 and qiskit 1.2.0? If so, I think that might end up taking the earliest date across the union of those two.
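
(Editorial sketch, not from the PR: keying first-appearance dates on the (compiler, compiler_version) pair so that cirq 1.2.0 and qiskit 1.2.0 are tracked independently.)

    # Earliest date each (compiler, version) pair appears in the data
    first_seen = (
        df_dates.groupby(["compiler", "compiler_version"])["date"]
        .min()
        .reset_index(name="first_date")
    )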

@@ -33,9 +39,18 @@
# Convert the 'date' column to datetime
df_all["date"] = pd.to_datetime(df_all["date"])

unique_versions = sorted(df_all["compiler_version"].unique())
bachase (Collaborator):

Just to note similar questions as above on whether this is selecting the intended dates and versions.

jordandsullivan (Collaborator) commented Feb 26, 2025

Good catch. I added a print statement to check, and these are just the version numbers, not including the compiler names:

    Here are unique_versions ['0.1.0', '0.1.1', '0.2.0', '0.4.0', '0.4.2', '1.2.0', '1.3.0', '1.3.1', '1.3.2', '1.3.3', '1.35.0', '1.36.0', '1.37.0', '1.39.0', '1.4.0', '1.4.1', '1.40.0', '1.5.0']

Misty-W (Author) commented Feb 26, 2025

I'm aware these are only the numbers, but the actual filtering is done on the date each new version appears in the dataset:

dates = []
for version in unique_versions:
    df_versions = df_dates[df_dates["compiler_version"] == version]
    earliest_date = sorted(df_versions["date"].unique())[0]
    dates.append(earliest_date)

filtered_avg_compiled_ratio = avg_compiled_ratio[avg_compiled_ratio["date"].isin(dates)]

bachase (Collaborator) commented Feb 26, 2025

Here's a toy example to consider (ignoring the actual data columns):

    date, compiler, compiler_version
    2025-01-01,cirq,1.2.0
    2025-01-02,qiskit,1.2.0
    2025-01-10,cirq,1.3.0

I think this might select only 2025-01-01 and 2025-01-10 as earliest dates, as the cirq version 1.2.0 would be earlier than the qiskit 1.2.0. Then the qiskit results for 1.2.0 wouldn't show up until much later? (In this parallel universe where their versions are close and overlapping.)
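
(Editorial sketch, not the PR's code: one possible fix for this case, filtering each compiler's rows against its own first-appearance dates so qiskit's 1.2.0 results are kept on 2025-01-02 even though cirq's 1.2.0 appeared a day earlier.)

    # Earliest date per (compiler, version) pair, not per bare version string
    first_dates = (
        df_dates.groupby(["compiler", "compiler_version"])["date"]
        .min()
        .reset_index()
    )
    # Keep a row only if its own compiler had a release on that date
    release_keys = set(zip(first_dates["compiler"], first_dates["date"]))
    mask = [
        (compiler, date) in release_keys
        for compiler, date in zip(
            avg_compiled_ratio["compiler"], avg_compiled_ratio["date"]
        )
    ]
    filtered_avg_compiled_ratio = avg_compiled_ratio[mask]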

Misty-W and others added 2 commits February 26, 2025 11:13
Successfully merging this pull request may close these issues: Update benchmark plot resolution to monthly.