
Reduce complexity and cloud storage needs across benchmarking workflows #16396

Closed
ScottTodd opened this issue Feb 13, 2024 · 0 comments · Fixed by #18144
Labels: cleanup 🧹 · infrastructure/benchmark (Relating to benchmarking infrastructure) · infrastructure (Relating to build systems, CI, or testing)

Comments

@ScottTodd (Member)

I spotted some low hanging fruit here: https://discord.com/channels/689900678990135345/1166024193599615006/1207085419959947294 and here: https://groups.google.com/g/iree-discuss/c/uy0L4Vdl3hs/m/YLe0iLCGAAAJ.

  • The process_benchmark_results step takes around 2 minutes to download a mysterious iree-oss/benchmark-report Docker image. The generation scripts only need a few Python deps (markdown_strings, requests), so they could just pip install what they need directly.
  • Benchmark execution jobs spend upwards of 30 seconds checking out runtime submodules. They likely don't need any submodules at all.
  • The compilation_benchmarks job could be folded into build_e2e_test_artifacts. Then we wouldn't need to upload and store large compile-stats/module.vmfb files, or spend 30+ seconds downloading those files for 0-2 seconds of statistics aggregation and uploading. If the build machine turns out not to have the right setup for uploading to the dashboard server (for whatever reason), we could pass results via workflow artifacts instead. There's no need to send 50-100GB of (what should be) transient files over the network.
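The first two items could be addressed with small workflow changes. A hypothetical sketch of what those steps might look like (step names and the report-script path are illustrative, not taken from the actual workflows):

```yaml
# Install only the Python deps the report scripts need, instead of
# pulling the iree-oss/benchmark-report Docker image.
- name: Generate benchmark report
  run: |
    python -m pip install markdown_strings requests
    python ./generate_report.py  # hypothetical script name

# Skip submodule checkout on benchmark execution runners.
- uses: actions/checkout@v4
  with:
    submodules: false
```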
@ScottTodd ScottTodd added infrastructure Relating to build systems, CI, or testing cleanup 🧹 infrastructure/benchmark Relating to benchmarking infrastructure labels Feb 13, 2024
ScottTodd added a commit that referenced this issue May 14, 2024
This drops support for capturing traces as part of CI benchmarking to fix #16856. This PR is a synced and updated version of #16857.

While traces are invaluable in analyzing performance:
* This implementation is difficult to maintain (updating IREE's Tracy version can't be done without also building for multiple operating systems and uploading binary files to a cloud bucket with limited permissions).
* Trace collection nearly doubles CI time for every benchmark run (e.g. 10m→20m or 20m→40m), occasionally leading to long queueing (2h+).
* Trace collection contributes to cloud storage and network usage (there are larger offenders: #16396, but we still need to trim costs).

---------

Co-authored-by: Benoit Jacob <jacob.benoit.1@gmail.com>
bangtianliu pushed a commit to bangtianliu/iree that referenced this issue Jun 5, 2024
(same commit message as above)
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this issue Jul 30, 2024
(same commit message as above)