Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add api_method parameter to Client.query to select INSERT or QUERY API #967

Merged
merged 25 commits into from
Dec 2, 2021

Conversation

tswast
Copy link
Contributor

@tswast tswast commented Sep 13, 2021

This PR is the first step in enabling the "fast query path". At the moment, selecting api_method="QUERY" is actually likely to be slower because it fetches the first page of results and discards them. A PR will follow-up to cache this first page for use by the row iterator.

This commit only refactors to allow jobs.insert to be selected.
Supporting jobs.query will require more transformations to QueryJobConfig,
QueryJob, and RowIterator.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Towards #589 🦕

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Sep 13, 2021
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Sep 13, 2021
@tswast
Copy link
Contributor Author

tswast commented Sep 17, 2021

Even though this isn't a breaking change, it refactors enough that maybe this should target v3 so as not to delay v3 any longer.

@tswast tswast changed the base branch from main to v3 October 5, 2021 15:41
@snippet-bot
Copy link

snippet-bot bot commented Oct 5, 2021

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@google-cla google-cla bot added cla: no This human has *not* signed the Contributor License Agreement. and removed cla: yes This human has signed the Contributor License Agreement. labels Oct 5, 2021
@tswast tswast changed the title feat: add api_method parameter to Client.query to select insert or query API feat: add api_method parameter to Client.query to select INSERT or QUERY API Oct 6, 2021
@google-cla

This comment has been minimized.

tswast added 7 commits October 7, 2021 16:43
… or `query` API

Work in Progress. This commit only refactors to allow jobs.insert to be selected.
Supporting jobs.query will require more transformations to QueryJobConfig,
QueryJob, and RowIterator.
@tswast tswast force-pushed the issue589-queries-endpoint branch from 7b9457c to 2e4af92 Compare October 7, 2021 21:43
@googleapis googleapis deleted a comment from google-cla bot Oct 7, 2021
@google-cla google-cla bot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Oct 7, 2021
@googleapis googleapis deleted a comment from google-cla bot Oct 7, 2021
@googleapis googleapis deleted a comment from google-cla bot Oct 7, 2021
@googleapis googleapis deleted a comment from google-cla bot Oct 7, 2021
@tswast tswast marked this pull request as ready for review November 18, 2021 23:05
@tswast tswast requested a review from a team November 18, 2021 23:05
@tswast tswast requested a review from a team as a code owner November 18, 2021 23:05
@tswast tswast requested a review from loferris November 18, 2021 23:05
@tswast
Copy link
Contributor Author

tswast commented Nov 18, 2021

I've just synced to the latest v3 branch. Unit tests and query system tests are passing. I think this is ready for review.

Note: This won't actually result in any performance improvements until the # TODO: https://github.com/googleapis/python-bigquery/issues/589 in _job_helpers.py is resolved. Since that requires more hacks to the QueryJob class, I figure that's worth postponing to a separate PR. As-is, this PR should be showing the same behavior for api_method="INSERT" and api_method="QUERY" options.

@tswast

This comment has been minimized.

@tswast

This comment has been minimized.

tests/system/conftest.py Outdated Show resolved Hide resolved
@tswast tswast requested a review from plamut November 19, 2021 16:25
@tswast

This comment has been minimized.

Copy link
Contributor

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really promising and structured, mostly just minor remarks and questions.

google/cloud/bigquery/_job_helpers.py Outdated Show resolved Hide resolved
google/cloud/bigquery/_job_helpers.py Outdated Show resolved Hide resolved
google/cloud/bigquery/_job_helpers.py Outdated Show resolved Hide resolved
tests/system/test_query.py Outdated Show resolved Hide resolved
tests/system/test_query.py Show resolved Hide resolved
tests/unit/test__job_helpers.py Outdated Show resolved Hide resolved
@tswast tswast requested a review from plamut November 22, 2021 16:54
Copy link
Contributor

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

There's one nit, but decide for yourself if it's worth updating the PR.

@@ -3215,6 +3219,11 @@ def query(
called on the job returned. The ``job_retry``
specified here becomes the default ``job_retry`` for
``result()``, where it can also be specified.
api_method (Union[str, enums.QueryApiMethod]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Duplicate annotation info

Copy link
Contributor

@loferris loferris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tswast tswast merged commit 3b3ebff into googleapis:v3 Dec 2, 2021
@tswast tswast deleted the issue589-queries-endpoint branch December 2, 2021 20:15
tswast added a commit that referenced this pull request Mar 29, 2022
deps!: BigQuery Storage and pyarrow are required dependencies (#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (#786) 

feat!: destination tables are no-longer removed by `create_job` (#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (#972)

feat!: mark the package as type-checked (#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (#967)

fix: improve type annotations for mypy validation (#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (#1117)

docs: Add migration guide from version 2.x to 3.x (#1027)

Release-As: 3.0.0
waltaskew pushed a commit to waltaskew/python-bigquery that referenced this pull request Jul 20, 2022
deps!: BigQuery Storage and pyarrow are required dependencies (googleapis#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (googleapis#786) 

feat!: destination tables are no-longer removed by `create_job` (googleapis#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (googleapis#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (googleapis#972)

feat!: mark the package as type-checked (googleapis#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (googleapis#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (googleapis#967)

fix: improve type annotations for mypy validation (googleapis#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (googleapis#1117)

docs: Add migration guide from version 2.x to 3.x (googleapis#1027)

Release-As: 3.0.0
abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this pull request Apr 17, 2023
…1014)

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- [ ] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-bigquery/issues/new/choose) before writing your code!  That way we can discuss the change, evaluate designs, and agree on the general idea
- [ ] Ensure the tests and linter pass
- [ ] Code coverage does not decrease (if any source code was changed)
- [ ] Appropriate docs were updated (if necessary)

Issue discovered while investigating what properties are needed in googleapis#967
abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this pull request Apr 17, 2023
deps!: BigQuery Storage and pyarrow are required dependencies (googleapis#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (googleapis#786) 

feat!: destination tables are no-longer removed by `create_job` (googleapis#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (googleapis#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (googleapis#972)

feat!: mark the package as type-checked (googleapis#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (googleapis#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (googleapis#967)

fix: improve type annotations for mypy validation (googleapis#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (googleapis#1117)

docs: Add migration guide from version 2.x to 3.x (googleapis#1027)

Release-As: 3.0.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide support for synchronous queries through the v2/projects/{projectId}/queries endpoint
3 participants