docs(samples): uses function (create_job) more appropriate to the described sample intent #1309

Merged: 20 commits, Sep 2, 2022
50 changes: 35 additions & 15 deletions samples/create_job.py
@@ -13,33 +13,53 @@
 # limitations under the License.

 import typing
+from typing import Union

 if typing.TYPE_CHECKING:
-    from google.cloud import bigquery
+    from google.cloud.bigquery import LoadJob, CopyJob, ExtractJob, QueryJob


-def create_job() -> "bigquery.QueryJob":
+def create_job() -> "Union[LoadJob, CopyJob, ExtractJob, QueryJob]":
+
     # [START bigquery_create_job]
     from google.cloud import bigquery

     # Construct a BigQuery client object.
     client = bigquery.Client()

-    query_job = client.query(
-        "SELECT country_name from `bigquery-public-data.utility_us.country_code_iso`",
-        # Explicitly force job execution to be routed to a specific processing
-        # location.
-        location="US",
-        # Specify a job configuration to set optional job resource properties.
-        job_config=bigquery.QueryJobConfig(
-            labels={"example-label": "example-value"}, maximum_bytes_billed=1000000
-        ),
-        # The client libraries automatically generate a job ID. Override the
-        # generated ID with either the job_id_prefix or job_id parameters.
-        job_id_prefix="code_sample_",
+    query_job = client.create_job(
+        # Specify a job configuration, providing a query
+        # and/or optional job resource properties, as needed.
+        # The job instance can be a LoadJob, CopyJob, ExtractJob, QueryJob
+        # Here, we demonstrate a "query" job.
+        # Reference: https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client.create_job
Contributor

This long line will not render well in the Sample Browser, or even on GitHub. I like the link to the documentation, though. Could you perhaps add it to the https://cloud.google.com/bigquery/docs/samples/bigquery-create-job page instead?

Collaborator Author

I think I am confused: your link takes us to the page where this sample is displayed.
Is that intentional? That page does not currently provide additional information regarding the four types of jobs available.

Relatedly: in our renderings on the Sample Browser, is there a way to create standard hyperlinks within the code blocks displayed on the screen?

Contributor

It currently does not, but once your PR updates the sample, the page will be updated as well.

I tried sifting through https://googlecloudplatform.github.io/samples-style-guide/ and we currently don't provide any guidance on how hyperlinks should be handled when they are too long. Let me get back to you on this after I ask around.

Collaborator Author

I looked at the style guide a bit. Thanks for linking to it; it is a good reminder for me that it exists.

The item at https://googlecloudplatform.github.io/samples-style-guide/#clients has a Python snippet with a really long URL in the code sample, just like the one I included.
¯\_(ツ)_/¯

Contributor

We could remove the ending anchor (#google.cloud...) to make the URL a bit smaller.

I recall that we have a g.co/cloud URL shortener for cloud.google.com pages (e.g. cloud.google.com/bigquery becomes g.co/cloud/bigquery, which isn't all that much shorter, but we occasionally used it). I wonder if we could get g.co/bqpython or something similar pointing to the latest API reference?
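
To make the anchor suggestion concrete, here is a minimal sketch of how much shorter the reference URL gets once the fragment is dropped. It uses only the standard library's `urllib.parse.urldefrag`; the constant name `REFERENCE_URL` is an illustrative assumption, and the URL itself is the one from the code comment under review.

```python
from urllib.parse import urldefrag

# The reference URL from the sample's comment, split across lines for
# readability only (REFERENCE_URL is a name chosen for this sketch).
REFERENCE_URL = (
    "https://googleapis.dev/python/bigquery/latest/generated/"
    "google.cloud.bigquery.client.Client.html"
    "#google.cloud.bigquery.client.Client.create_job"
)

# urldefrag returns the URL without its fragment, plus the fragment itself.
short_url, fragment = urldefrag(REFERENCE_URL)

print(short_url)
print(len(REFERENCE_URL), "->", len(short_url))
```

The trade-off is that the shortened link lands at the top of the `Client` page rather than directly on `create_job`.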

Collaborator Author

I submitted a request for a short link:

g.co/bqpython > https://googleapis.dev/python/bigquery/latest/

It has to go through an approval process.

I would suggest that we not hold off on merging this PR, especially since even the style guide's own code samples contain extremely long URLs, as noted in the comment above.

Contributor

Thanks for looking into the shortlink process! I wasn't aware of that feature. Hope it works out :)

I've asked the samples team for guidance. However, a feasible solution will likely take a long time and involve multiple teams. For now, this change is a net benefit, so I'm happy to move forward as is.

+        #
+        # Example use cases for .create_job() include:
+        #    * to retry failed jobs
+        #    * to generate jobs with an experimental API property that hasn't
+        #      been added to one of the manually written job configuration
+        #      classes yet
+        #
+        # NOTE: unless it is necessary to create a job in this way, the
+        # preferred approach is to use one of the dedicated API calls:
+        #   client.query()
+        #   client.extract_table()
+        #   client.copy_table()
+        #   client.load_table_from_file(), client.load_table_from_dataframe(), etc.
+        job_config={
Contributor

I think a link to https://cloud.google.com/bigquery/docs/reference/rest/v2/Job would be quite helpful here.

Collaborator Author

I added the link.

+            "query": {
+                "query": """
+                         SELECT country_name
+                         FROM `bigquery-public-data.utility_us.country_code_iso`
+                         LIMIT 5
+                         """,
+            },
+            "labels": {"example-label": "example-value"},
+            "maximum_bytes_billed": 10000000,
+        }
     )  # Make an API request.

-    print("Started job: {}".format(query_job.job_id))
+    print(f"Started job: {query_job.job_id}")
     # [END bigquery_create_job]

     return query_job
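
For readers who want to experiment with the dictionary passed as `job_config`, the following is a minimal sketch of a helper that assembles it. The helper `build_query_job_config` is hypothetical and not part of google-cloud-bigquery. The key names are copied from the sample as shown in this diff and have not been checked against the REST Job resource. The actual API call is left commented out so the sketch runs without credentials.

```python
from typing import Any, Dict, Optional


def build_query_job_config(
    sql: str,
    labels: Optional[Dict[str, str]] = None,
    maximum_bytes_billed: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a dict shaped like the job_config in the sample above.

    Hypothetical helper for illustration only. Key names mirror the
    sample in this diff; they are an assumption, not a verified schema.
    """
    job_config: Dict[str, Any] = {"query": {"query": sql}}
    if labels:
        # Copy so later changes to the caller's dict do not leak in.
        job_config["labels"] = dict(labels)
    if maximum_bytes_billed is not None:
        job_config["maximum_bytes_billed"] = maximum_bytes_billed
    return job_config


config = build_query_job_config(
    "SELECT country_name "
    "FROM `bigquery-public-data.utility_us.country_code_iso` LIMIT 5",
    labels={"example-label": "example-value"},
    maximum_bytes_billed=10000000,
)

# With credentials configured, the dict could be passed on as in the sample:
# from google.cloud import bigquery
# job = bigquery.Client().create_job(job_config=config)
```

As the sample's own NOTE says, `client.query()` and the other dedicated calls remain the preferred route; a plain dict like this is mainly useful for the retry and experimental-property cases listed in the comment.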