Create the new benchmark database #5839
Conversation
@huydhn is attempting to deploy a commit to the Meta Open Source Team on Vercel. A member of the Team first needs to authorize it.
cc @FindHao Here is the database schema to hold the benchmark results for TritonBench. This supports adding new metrics and operators (reused the …
Thank you! Will sync up with you offline.
@xuzhao9 Much appreciated if you could help take a look at the schema. Ideally, it needs to be flexible enough to contain all the information required by https://fburl.com/gdoc/6yxdmuh5. I'm working with @FindHao to nail this down at the moment.
@huydhn We are planning the following JSON output schema for the Tritonbench nightly run: https://docs.google.com/document/d/1jttjVsYqW_rQNISp1jX6ysCblFZA1erGIM_7yQWHg_M/edit I am wondering whether we are planning to use the same schema for Torchbench (module-level benchmarking) and Tritonbench (operator-level benchmarking)?
Thank you for sharing this doc. I'm looking at the schema there, and the general schema here should be able to cover all the fields needed by TritonBench. There are fields to keep information about the runner (CPU, GPU devices), about the benchmark (name, mode, precision), and also about the change (sha, branch) and the important dependencies (Triton sha and branch). The good news is that it's easy to write a simple script to convert TritonBench nightly run results to the format here, and to keep both formats if needed. For example, I have this script https://github.com/pytorch/executorch/blob/main/.github/scripts/extract_benchmark_results.py to convert the mobile benchmark results from ExecuTorch to this format.
That's the goal. I'm looking to converge the different benchmark schemas we are using into one, so that we can store the data properly in a database table and build an API around it to allow people to access the data on OSS. This schema can then be used by other benchmarks that people are building too; TorchChat and ExecuTorch are some examples.
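To make the convergence concrete, here is a hedged sketch of what such a conversion script could look like. The metadata field names (`timestamp`, `schema_version`, `repo`, `head_branch`, `head_sha`) come from the v3 field list discussed later in this thread; the `benchmark`/`metric` layout and the function name are illustrative assumptions, not the final spec.

```python
# Hedged sketch: convert a TritonBench-style measurement into the unified
# schema. Metadata keys mirror the required v3 fields from this thread;
# the "benchmark"/"metric" nesting is an illustrative assumption.
import json
import time


def to_unified_record(op_name, metric_name, value, head_sha, head_branch):
    return {
        # -- Metadata (required fields from the v3 schema discussed here)
        "timestamp": int(time.time()),
        "schema_version": "v3",
        "name": "tritonbench",
        # -- About the change
        "repo": "pytorch/pytorch",
        "head_branch": head_branch,
        "head_sha": head_sha,
        # -- The actual measurement (illustrative layout)
        "benchmark": {"name": op_name},
        "metric": {"name": metric_name, "benchmark_values": [value]},
    }


record = to_unified_record("softmax", "latency_ms", 0.42, "abc123", "main")
print(json.dumps(record, indent=2))
```

A thin wrapper like this would let TritonBench keep emitting its own nightly JSON while also producing records the shared database can ingest.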
mem_info String,
avail_mem_in_gb UInt32,
gpu_info String,
gpu_count UInt32,
@Jack-Khuu Just FYI, I think that keeping the gpu_info and gpu_count here would capture the case of having more than one GPU in one runner. My assumption is that all GPUs on the same runner are the same. For distributed benchmarks, there might be more than one runner, so let me make this a list of runners instead to capture that.
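A hedged sketch of that "list of runners" shape, assuming the keys mirror the columns shown above (the runner label and example values are made up for illustration):

```python
# Hedged sketch: represent runner info as a list so a distributed run
# spanning several hosts can be captured; keys mirror the schema columns
# above (mem_info, avail_mem_in_gb, gpu_info, gpu_count).
runners = [
    {
        "name": "linux.g5.4xlarge",   # hypothetical runner label
        "mem_info": "64GB",
        "avail_mem_in_gb": 64,
        "gpu_info": "NVIDIA A10G",
        "gpu_count": 1,               # all GPUs on one runner assumed identical
    },
    # a second entry would describe the second host in a distributed run
]

# Aggregate across hosts, e.g. the total GPU count for the whole run
total_gpus = sum(r["gpu_count"] for r in runners)
print(total_gpus)
```

A single-runner benchmark is then just the one-element list, so both cases go through the same code path.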
…5845) After #5839, it's time to update the GHA upload script to upload to S3. I'll update the S3 lambda replicator in a separate PR.

### Testing

* Locally

```
# Backward compatibility, upload to both DynamoDB and S3 for v2 schema
$ python upload_benchmark_results.py --benchmark-results-dir benchmark-results-dir-for-testing/v2 --schema-version v2 --dry-run
INFO:root:Uploading benchmark-results-dir-for-testing/v2/android-artifacts-31017223108.json to dynamoDB (v2)
INFO:root:Writing 16 documents to DynamoDB torchci-oss-ci-benchmark
INFO:root:Upload benchmark-results-dir-for-testing/v2/android-artifacts-31017223108.json to s3://ossci-benchmarks/v2/pytorch/executorch/12345/31017223108/android-artifacts-31017223108.json
INFO:root:Uploading benchmark-results-dir-for-testing/v2/android-artifacts-31017223431.json to dynamoDB (v2)
INFO:root:Writing 12 documents to DynamoDB torchci-oss-ci-benchmark
INFO:root:Upload benchmark-results-dir-for-testing/v2/android-artifacts-31017223431.json to s3://ossci-benchmarks/v2/pytorch/executorch/12345/31017223431/android-artifacts-31017223431.json

# We use only S3 for v3 schema
$ python upload_benchmark_results.py --benchmark-results-dir benchmark-results-dir-for-testing/v3 --schema-version v3
INFO:root:Upload benchmark-results-dir-for-testing/v3/mock.json to s3://ossci-benchmarks/v3/pytorch/pytorch/1/1/mock.json
```

* CI
  * v2 https://github.com/pytorch/test-infra/actions/runs/11606273442/job/32318017857?pr=5845#step:4:55
  * v3 https://github.com/pytorch/test-infra/actions/runs/11606273442/job/32318017857?pr=5845#step:5:43
* Test PR on ExecuTorch to use the new version https://github.com/pytorch/executorch/actions/runs/11606339159 to confirm that the files are uploaded to S3 https://github.com/pytorch/executorch/actions/runs/11606339159/job/32318826449#step:8:87
To ease the process of gathering the benchmark metadata before uploading to the database, I'm adding a script `.github/scripts/benchmarks/gather_metadata.py` to gather this information and pass it to the upload script. From #5839, the benchmark metadata includes the following required fields:

```
-- Metadata
`timestamp` UInt64,
`schema_version` String DEFAULT 'v3',
`name` String,

-- About the change
`repo` String DEFAULT 'pytorch/pytorch',
`head_branch` String,
`head_sha` String,
`workflow_id` UInt64,
`run_attempt` UInt32,
`job_id` UInt64,

-- The raw records on S3
`s3_path` String,
```

I'm going to test this out with the PT2 compiler instruction count benchmark at pytorch/pytorch#140493

### Testing

https://github.com/pytorch/test-infra/actions/runs/11831746632/job/32967412160?pr=5918#step:5:105 gathers the metadata and uploads the benchmark results correctly. Also, an actual upload at https://github.com/pytorch/pytorch/actions/runs/11831781500/job/33006545698#step:24:138
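For illustration, here is a hedged sketch of what a `gather_metadata.py`-style helper could do: pull the required fields from the standard GitHub Actions environment variables (`GITHUB_REPOSITORY`, `GITHUB_REF_NAME`, `GITHUB_SHA`, `GITHUB_RUN_ID`, `GITHUB_RUN_ATTEMPT`). The real script may differ; in particular, `job_id` is not exposed as a plain env var and typically has to be resolved via the GitHub API, so it is omitted here.

```python
# Hedged sketch of gathering the required v3 metadata fields from the
# standard GitHub Actions environment. The actual gather_metadata.py
# may differ; job_id is omitted because it needs a GitHub API lookup.
import os
import time


def gather_metadata(s3_path: str) -> dict:
    return {
        "timestamp": int(time.time()),
        "schema_version": "v3",
        "repo": os.environ.get("GITHUB_REPOSITORY", "pytorch/pytorch"),
        "head_branch": os.environ.get("GITHUB_REF_NAME", ""),
        "head_sha": os.environ.get("GITHUB_SHA", ""),
        "workflow_id": int(os.environ.get("GITHUB_RUN_ID", "0")),
        "run_attempt": int(os.environ.get("GITHUB_RUN_ATTEMPT", "1")),
        "s3_path": s3_path,
    }


meta = gather_metadata("v3/pytorch/pytorch/1/1/mock.json")
print(meta["schema_version"], meta["s3_path"])
```

Keeping this logic in one helper means every benchmark workflow fills in the same required fields the same way before calling the upload script.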
I'm trying to make these benchmark results available on the OSS benchmark database, so that people can query them from outside. The first step is to also record the results in a JSON format compatible with the database schema defined in pytorch/test-infra#5839. Existing CSV files remain unchanged.

### Testing

The JSON results are uploaded as artifacts to S3 https://github.com/pytorch/pytorch/actions/runs/11809725848/job/32901411180#step:26:13, for example https://gha-artifacts.s3.amazonaws.com/pytorch/pytorch/11809725848/1/artifact/test-jsons-test-pr_time_benchmarks-1-1-linux.g4dn.metal.nvidia.gpu_32901411180.zip

Pull Request resolved: #140493
Approved by: https://github.com/laithsakka
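The "also record as JSON" step can be sketched as a small conversion from the existing CSV rows into database-compatible records. The CSV columns and values below are made up for illustration; the point is that the CSV stays the source of truth and the JSON is an additional artifact.

```python
# Hedged sketch: emit JSON records alongside the existing CSV output.
# The CSV columns here are illustrative; the real benchmark keeps its
# CSV format unchanged and only adds a JSON artifact in the
# database-compatible shape.
import csv
import io
import json

csv_text = "benchmark,metric,value\nadd_loop,instruction_count,123456\n"

records = []
for row in csv.DictReader(io.StringIO(csv_text)):
    records.append({
        "benchmark": {"name": row["benchmark"]},
        "metric": {
            "name": row["metric"],
            "benchmark_values": [float(row["value"])],
        },
    })

print(json.dumps(records))
```

Because the JSON is derived from the CSV, no existing consumer of the CSV files has to change.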
As defined in https://fburl.com/gdoc/pczummt2, this doesn't create the table, but I want to start keeping the CREATE TABLE SQL query in git.
Testing
Manually run the query to create the table on ClickHouse https://console.clickhouse.cloud/services/c9b76950-2cf3-4fa0-93bb-94a65ff5f27d/console/database/benchmark/table/oss_ci_benchmark_v3