[CT-2176] [Feature] Report bytes processed for tests #559
Labels
feature:cost-reduction
Issues related to cost tracking in BigQuery
type:enhancement
New feature or request
Is this your first time submitting a feature request?
Describe the feature
As @elyobo commented in this closed issue #14 (comment), it would be very nice to see the bytes processed by running tests in BigQuery.
Tests can consume a lot of bytes, especially tests like `unique` and `not_null` without `where`, which scan the whole table. Since BigQuery billing is based on these bytes, you can be surprised by an expensive bill because you have no idea what your tests cost while developing. So it would be fantastic to see the cost of each test in the logs, as we already do for models.
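To make the billing concern concrete, here is a rough sketch of turning a bytes-processed figure into an approximate on-demand cost. The function name and the price passed in are placeholders of mine, not something dbt provides; the real price per TiB depends on your region and pricing model, and actual billing uses bytes billed, which can differ from bytes processed.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_on_demand_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Return an approximate on-demand cost for a query.

    price_per_tib is supplied by the caller and is an assumption here;
    check the current BigQuery pricing for your region.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / TIB * price_per_tib


# Hypothetical example: a `unique` test that scans a 250 GiB table,
# with a placeholder price of 6.25 per TiB.
scanned = 250 * 2 ** 30
print(f"{estimate_on_demand_cost(scanned, price_per_tib=6.25):.2f}")
```

With per-test bytes available in the logs, a calculation like this could be run on every test during development.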
Describe alternatives you've considered
Currently, the `bytes_processed` value for models is taken from `total_bytes_processed` on the `query_job` object: https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/connections.py#L491
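As an illustration of what I mean, here is a minimal sketch of reading that attribute off a job-like object and formatting it for a log line. `bytes_processed_from_job` and `format_bytes` are hypothetical helpers written for this issue, not the adapter's actual code, and the job is a stand-in object so no BigQuery call is made.

```python
from types import SimpleNamespace
from typing import Optional


def bytes_processed_from_job(query_job) -> Optional[int]:
    """Return total_bytes_processed from a job-like object, or None if absent."""
    return getattr(query_job, "total_bytes_processed", None)


def format_bytes(num_bytes: Optional[int]) -> str:
    """Render a byte count with binary units (hypothetical helper)."""
    if num_bytes is None:
        return "unknown"
    value = float(num_bytes)
    for unit in ("Bytes", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024


# Stand-in for a finished query job; a real one would come from the
# google-cloud-bigquery client.
fake_job = SimpleNamespace(total_bytes_processed=3 * 1024 ** 3)
print(format_bytes(bytes_processed_from_job(fake_job)))
```

If test queries go through the same job object, the same attribute should be readable for them too.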
I am assuming the `query_job` information is queried from `INFORMATION_SCHEMA.JOBS`. I checked it, and jobs that run tests also return a value for `total_bytes_processed`.
I don't understand why tests don't return the bytes processed in `run_results.json`, as I didn't find any restriction in this code: https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/connections.py
Maybe I am looking at the wrong code.
Anyway, the information is available and we already do this for models, so I think we can do this for tests.
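To show the gap, here is a small sketch that sums `bytes_processed` per resource type from a `run_results.json`-shaped document. I am assuming each entry in `results` has a `unique_id` and an `adapter_response` dict; the sample below is hand-written for illustration and is not real dbt output.

```python
import json
from collections import defaultdict


def bytes_by_resource_type(run_results: dict) -> dict:
    """Sum adapter_response.bytes_processed per resource type.

    The resource type is the prefix of unique_id (for example "model"
    or "test"). Results without a bytes_processed value are counted
    under "missing" instead of being added to the total.
    """
    totals = defaultdict(lambda: {"bytes_processed": 0, "missing": 0})
    for result in run_results.get("results", []):
        resource_type = result.get("unique_id", "unknown").split(".", 1)[0]
        value = (result.get("adapter_response") or {}).get("bytes_processed")
        if value is None:
            totals[resource_type]["missing"] += 1
        else:
            totals[resource_type]["bytes_processed"] += int(value)
    return dict(totals)


# Hand-written sample, shaped like run_results.json (assumed field names).
sample = json.loads("""
{
  "results": [
    {"unique_id": "model.my_project.orders",
     "adapter_response": {"bytes_processed": 1048576}},
    {"unique_id": "test.my_project.unique_orders_id",
     "adapter_response": {}}
  ]
}
""")
print(bytes_by_resource_type(sample))
```

Against a real project, the same function could be pointed at the parsed contents of `target/run_results.json`; today I would expect the tests to show up under "missing".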
Who will this benefit?
People working with dbt on BigQuery, as they can better monitor the costs caused by dbt tests.
Are you interested in contributing this feature?
Yes, I am interested. I am not sure where the information for test results is written.
Anything else?
No response