Reduce complexity and cloud storage needs across benchmarking workflows #16396
Labels: cleanup 🧹, infrastructure/benchmark (Relating to benchmarking infrastructure), infrastructure (Relating to build systems, CI, or testing)
Comments
ScottTodd added the infrastructure, cleanup 🧹, and infrastructure/benchmark labels on Feb 13, 2024.
ScottTodd added a commit that referenced this issue on May 14, 2024:

This drops support for capturing traces as part of CI benchmarking to fix #16856. This PR is a synced and updated version of #16857. While traces are invaluable in analyzing performance:

* This implementation is difficult to maintain (updating IREE's Tracy version can't be performed without also building for multiple operating systems and uploading binary files to a cloud bucket with limited permissions).
* Trace collection nearly doubles CI time for every benchmark run (e.g. 10m->20m or 20m->40m), leading to occasionally long queueing (2h+).
* Trace collection contributes to cloud storage and network usage (there are larger offenders: #16396, but we still need to trim costs).

Co-authored-by: Benoit Jacob <jacob.benoit.1@gmail.com>
bangtianliu pushed a commit to bangtianliu/iree that referenced this issue on Jun 5, 2024 (same commit message as above).
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this issue on Jul 30, 2024 (same commit message, signed off by Lubo Litchev <lubol@google.com>).
I spotted some low-hanging fruit here: https://discord.com/channels/689900678990135345/1166024193599615006/1207085419959947294 and here: https://groups.google.com/g/iree-discuss/c/uy0L4Vdl3hs/m/YLe0iLCGAAAJ.

* The `process_benchmark_results` step takes around 2 minutes to download a mysterious `iree-oss/benchmark-report` Docker image. The report generation scripts only need a few Python deps (`markdown_strings`, `requests`), so they could just pip install what they need directly (see the first sketch after this list).
* The `compilation_benchmarks` job could be folded into `build_e2e_test_artifacts`. Then we wouldn't need to upload and store large `compile-stats/module.vmfb` files or spend 30+ seconds downloading those files for 0-2 seconds of statistics aggregation and uploading. If there are issues (for whatever reason) with the build machine not having the right setup for uploading to the dashboard server, we could pass results via workflow artifacts instead (see the second sketch after this list). No need to send 50-100GB of (what should be) transient files over the network.
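For the first item, a minimal sketch of what the pip-based approach could look like as a GitHub Actions job, assuming `actions/setup-python` is usable on the runner; the job name, step names, and report script path are placeholders, not the actual workflow contents:

```yaml
# Hypothetical workflow fragment -- job, step, and script names are placeholders.
jobs:
  process_benchmark_results:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Install just the two deps the report scripts need, instead of pulling
      # the iree-oss/benchmark-report Docker image for ~2 minutes.
      - name: Install report dependencies
        run: pip install markdown_strings requests
      - name: Generate benchmark report
        run: python build_tools/benchmarks/generate_report.py  # placeholder path
```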
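For the second item, a minimal sketch of passing results between jobs with `actions/upload-artifact` / `actions/download-artifact` instead of cloud storage; job names and paths are again placeholders. The point is that only the small aggregated statistics travel between jobs, not the `module.vmfb` files:

```yaml
# Hypothetical workflow fragment -- job names and paths are placeholders.
jobs:
  build_e2e_test_artifacts:
    runs-on: ubuntu-latest
    steps:
      # ... build steps, aggregating compilation statistics in the same job ...
      - name: Upload aggregated results (small JSON, not module.vmfb files)
        uses: actions/upload-artifact@v4
        with:
          name: compilation-benchmark-results
          path: benchmark-results/*.json  # placeholder path

  publish_results:
    needs: build_e2e_test_artifacts
    runs-on: ubuntu-latest
    steps:
      - name: Download aggregated results
        uses: actions/download-artifact@v4
        with:
          name: compilation-benchmark-results
      # ... push statistics to the dashboard server from here ...
```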