feat(bigquery): add `create_bqstorage_client` param to `to_dataframe` and `to_arrow` (#9573)

* feat(bigquery): add `create_bqstorage_client` param to `to_dataframe` and `to_arrow`

When the `create_bqstorage_client` parameter is set to `True`, the BigQuery client constructs a BigQuery Storage API client for you. This removes the need for boilerplate code that manually constructs both clients explicitly with the same credentials.

Does this make the `bqstorage_client` parameter unnecessary? In most cases, yes, but there are a few cases where we'll want to continue using it.

* When partner tools use `to_dataframe`, they should continue to use `bqstorage_client` so that they can set the correct amended user-agent strings.
* When a developer needs to override the default API endpoint for the BQ Storage API, they'll need to manually supply a `bqstorage_client`.

* test for BQ Storage API usage in samples tests.
* fix: close bqstorage client if created by to_dataframe/to_arrow
* chore: blacken
* doc: update versionadded
* doc: update versionadded
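The "close bqstorage client if created by to_dataframe/to_arrow" fix implies an ownership rule: a client the method creates for you is closed when the download finishes, while a client you supply stays open. The following is a minimal sketch of that rule, not the library's implementation. `StandInStorageClient`, `download_rows`, and `client_factory` are hypothetical names invented for illustration, and the choice to let a supplied client take precedence over `create_bqstorage_client=True` is an assumption.

```python
class StandInStorageClient:
    """Hypothetical stand-in for a BigQuery Storage API client."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def download_rows(
    rows,
    bqstorage_client=None,
    create_bqstorage_client=False,
    client_factory=StandInStorageClient,
):
    """Return rows as a list, creating a storage client only when asked.

    A client created here is closed before returning. A caller-supplied
    client is left open because the caller owns it (assumed precedence).
    """
    owns_client = False
    if bqstorage_client is None and create_bqstorage_client:
        bqstorage_client = client_factory()
        owns_client = True
    try:
        # A real implementation would stream rows through bqstorage_client
        # when one is available; this sketch only copies the input.
        return list(rows)
    finally:
        if owns_client:
            bqstorage_client.close()
```

Under these assumptions, a caller that passes its own client can keep reusing it across several downloads, which is the behavior the partner-tool and custom-endpoint cases rely on.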
Showing 9 changed files with 431 additions and 37 deletions.
@@ -0,0 +1,33 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


def download_public_data(client):

    # [START bigquery_pandas_public_data]
    # TODO(developer): Import the client library.
    # from google.cloud import bigquery

    # TODO(developer): Construct a BigQuery client object.
    # client = bigquery.Client()

    # TODO(developer): Set table_id to the fully-qualified table ID in standard
    # SQL format, including the project ID and dataset ID.
    table_id = "bigquery-public-data.usa_names.usa_1910_current"

    # Use the BigQuery Storage API to speed-up downloads of large tables.
    dataframe = client.list_rows(table_id).to_dataframe(create_bqstorage_client=True)

    print(dataframe.info())
    # [END bigquery_pandas_public_data]
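For comparison, the boilerplate that `create_bqstorage_client=True` replaces looks roughly like the sketch below, which builds both clients by hand from the same credentials. It assumes a recent `google-cloud-bigquery-storage` release that exposes `bigquery_storage.BigQueryReadClient`; the release current at the time of this commit used a different module and class name. The function name `download_with_explicit_clients` is made up for illustration, and the imports are placed inside the function only so the snippet loads without the Google Cloud libraries installed.

```python
def download_with_explicit_clients(table_id):
    """Download a table to a DataFrame using manually constructed clients."""
    # Lazy imports: the Google Cloud libraries are needed only when called.
    import google.auth
    from google.cloud import bigquery
    from google.cloud import bigquery_storage

    # Obtain one set of credentials and share it between both clients.
    credentials, project_id = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    client = bigquery.Client(credentials=credentials, project=project_id)
    bqstorage_client = bigquery_storage.BigQueryReadClient(credentials=credentials)

    # Pass the storage client explicitly instead of asking for one to be created.
    return client.list_rows(table_id).to_dataframe(bqstorage_client=bqstorage_client)
```

Calling it requires application default credentials and network access, for example `download_with_explicit_clients("bigquery-public-data.usa_names.usa_1910_current")`.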
@@ -0,0 +1,34 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


def download_public_data_sandbox(client):

    # [START bigquery_pandas_public_data_sandbox]
    # TODO(developer): Import the client library.
    # from google.cloud import bigquery

    # TODO(developer): Construct a BigQuery client object.
    # client = bigquery.Client()

    # `SELECT *` is an anti-pattern in BigQuery because it is cheaper and
    # faster to use the BigQuery Storage API directly, but BigQuery Sandbox
    # users can only use the BigQuery Storage API to download query results.
    query_string = "SELECT * FROM `bigquery-public-data.usa_names.usa_1910_current`"

    # Use the BigQuery Storage API to speed-up downloads of large tables.
    dataframe = client.query(query_string).to_dataframe(create_bqstorage_client=True)

    print(dataframe.info())
    # [END bigquery_pandas_public_data_sandbox]
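The commit message names two cases where an explicit `bqstorage_client` is still needed: amended user-agent strings for partner tools, and a non-default API endpoint. A possible way to build such a client is sketched below, assuming a recent `google-cloud-bigquery-storage` release with `BigQueryReadClient` and the `client_options` / `client_info` constructor arguments from `google-api-core`. The helper name `build_custom_storage_client` and any endpoint or user-agent values passed to it are hypothetical.

```python
def build_custom_storage_client(credentials, api_endpoint, user_agent):
    """Build a BigQuery Storage read client with a custom endpoint and user agent."""
    # Lazy imports: the Google Cloud libraries are needed only when called.
    from google.api_core.gapic_v1.client_info import ClientInfo
    from google.cloud import bigquery_storage

    return bigquery_storage.BigQueryReadClient(
        credentials=credentials,
        # Override the default API endpoint for the Storage API.
        client_options={"api_endpoint": api_endpoint},
        # Add a tool-specific string to the user-agent header.
        client_info=ClientInfo(user_agent=user_agent),
    )
```

The returned client would then be passed as `bqstorage_client=...` to `to_dataframe` or `to_arrow` in place of `create_bqstorage_client=True`.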