Merge pull request #8456 from GoogleCloudPlatform/python-datalabeling-migration

Migrate code from googleapis/python-datalabeling
dandhlee authored Nov 9, 2022
2 parents c6b6bc1 + b9844f5 commit dba12bc
Showing 25 changed files with 1,652 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -33,6 +33,7 @@
/container/**/* @GoogleCloudPlatform/dee-platform-ops @GoogleCloudPlatform/python-samples-reviewers
/data-science-onramp/ @leahecole @bradmiro @GoogleCloudPlatform/python-samples-reviewers
/dataflow/**/* @davidcavazos @GoogleCloudPlatform/python-samples-reviewers
/datalabeling/**/* @GoogleCloudPlatform/python-samples-reviewers @ivanmkc
/datastore/**/* @GoogleCloudPlatform/cloud-native-db-dpes @GoogleCloudPlatform/python-samples-reviewers
/dns/**/* @GoogleCloudPlatform/python-samples-reviewers
/endpoints/**/* @GoogleCloudPlatform/python-samples-reviewers
5 changes: 5 additions & 0 deletions .github/blunderbuss.yml
@@ -112,6 +112,11 @@ assign_issues_by:
      - 'api: translate'
    to:
      - nicain
  - labels:
      - 'api: datalabeling'
    to:
      - GoogleCloudPlatform/python-samples-reviewers
      - ivanmkc
  - labels:
      - 'api: monitoring'
    to:
1 change: 1 addition & 0 deletions datalabeling/AUTHORING_GUIDE.md
@@ -0,0 +1 @@
See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/AUTHORING_GUIDE.md
1 change: 1 addition & 0 deletions datalabeling/CONTRIBUTING.md
@@ -0,0 +1 @@
See https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/CONTRIBUTING.md
78 changes: 78 additions & 0 deletions datalabeling/snippets/README.rst
@@ -0,0 +1,78 @@
.. This file is automatically generated. Do not edit this file directly.
Google Cloud Data Labeling Service Python Samples
===============================================================================

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=datalabeling/README.rst


This directory contains samples for Google Cloud Data Labeling Service. `Google Cloud Data Labeling Service`_ lets you request that human labelers label a collection of data that you plan to use to train a custom machine learning model.




.. _Google Cloud Data Labeling Service: https://cloud.google.com/data-labeling/docs/

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

This sample requires you to have authentication set up. Refer to the
`Authentication Getting Started Guide`_ for instructions on setting up
credentials for applications.

.. _Authentication Getting Started Guide:
    https://cloud.google.com/docs/authentication/getting-started

Install Dependencies
++++++++++++++++++++

#. Clone python-docs-samples and change directory to the sample directory you want to use.

   .. code-block:: bash

      $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions.

   .. _Python Development Environment Setup Guide:
       https://cloud.google.com/python/setup

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

      $ virtualenv env
      $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

      $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/



The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
18 changes: 18 additions & 0 deletions datalabeling/snippets/README.rst.in
@@ -0,0 +1,18 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Data Labeling Service
  short_name: Cloud Data Labeling
  url: https://cloud.google.com/data-labeling/docs/
  description: >
    `Google Cloud Data Labeling Service`_ lets you request that human
    labelers label a collection of data that you plan to use to train a
    custom machine learning model.

setup:
- auth
- install_deps

cloud_client_library: true

folder: datalabeling
84 changes: 84 additions & 0 deletions datalabeling/snippets/create_annotation_spec_set.py
@@ -0,0 +1,84 @@
#!/usr/bin/env python

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import os

from google.api_core.client_options import ClientOptions


# [START datalabeling_create_annotation_spec_set_beta]
def create_annotation_spec_set(project_id):
    """Creates a data labeling annotation spec set for the given
    Google Cloud project.
    """
    from google.cloud import datalabeling_v1beta1 as datalabeling

    client = datalabeling.DataLabelingServiceClient()
    # [END datalabeling_create_annotation_spec_set_beta]
    # If a test endpoint is provided, use it. This prevents tests on
    # this snippet from triggering any action by a real human.
    if "DATALABELING_ENDPOINT" in os.environ:
        opts = ClientOptions(api_endpoint=os.getenv("DATALABELING_ENDPOINT"))
        client = datalabeling.DataLabelingServiceClient(client_options=opts)
    # [START datalabeling_create_annotation_spec_set_beta]

    project_path = f"projects/{project_id}"

    annotation_spec_1 = datalabeling.AnnotationSpec(
        display_name="label_1", description="label_description_1"
    )

    annotation_spec_2 = datalabeling.AnnotationSpec(
        display_name="label_2", description="label_description_2"
    )

    annotation_spec_set = datalabeling.AnnotationSpecSet(
        display_name="YOUR_ANNOTATION_SPEC_SET_DISPLAY_NAME",
        description="YOUR_DESCRIPTION",
        annotation_specs=[annotation_spec_1, annotation_spec_2],
    )

    response = client.create_annotation_spec_set(
        request={"parent": project_path, "annotation_spec_set": annotation_spec_set}
    )

    # The format of the resource name:
    # projects/{project_id}/annotationSpecSets/{annotation_spec_set_id}
    print("The annotation_spec_set resource name: {}".format(response.name))
    print("Display name: {}".format(response.display_name))
    print("Description: {}".format(response.description))
    print("Annotation specs:")
    for annotation_spec in response.annotation_specs:
        print("\tDisplay name: {}".format(annotation_spec.display_name))
        print("\tDescription: {}\n".format(annotation_spec.description))

    return response


# [END datalabeling_create_annotation_spec_set_beta]


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument("--project-id", help="Project ID. Required.", required=True)

    args = parser.parse_args()

    create_annotation_spec_set(args.project_id)
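
The endpoint override in this sample can be isolated into a small helper. The following is a minimal sketch, not part of the commit: `resolve_endpoint` is a hypothetical name, and only the `DATALABELING_ENDPOINT` variable comes from the sample. Treating an empty value as "not set" is an assumption; the sample itself only checks whether the variable is present.

```python
import os


def resolve_endpoint(environ=None, var_name="DATALABELING_ENDPOINT"):
    """Return the override endpoint, or None to keep the library default.

    Hypothetical helper. An empty string is treated the same as an unset
    variable, which is an assumption and differs slightly from the sample.
    """
    environ = os.environ if environ is None else environ
    endpoint = environ.get(var_name, "")
    return endpoint or None


# In the sample, a non-None result would be passed along as
# ClientOptions(api_endpoint=endpoint) when constructing the client.
```

Passing the environment as a parameter keeps the helper testable without modifying `os.environ`.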
53 changes: 53 additions & 0 deletions datalabeling/snippets/create_annotation_spec_set_test.py
@@ -0,0 +1,53 @@
#!/usr/bin/env python

# Copyright 2022 Google, Inc
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import backoff
from google.api_core.exceptions import ServerError
import pytest

import create_annotation_spec_set
import testing_lib

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


@pytest.fixture(scope="module")
def cleaner():
    resource_names = []

    yield resource_names

    for resource_name in resource_names:
        testing_lib.delete_annotation_spec_set(resource_name)


@pytest.mark.skip(reason="service is limited due to covid")
def test_create_annotation_spec_set(cleaner, capsys):
    @backoff.on_exception(
        backoff.expo, ServerError, max_time=testing_lib.RETRY_DEADLINE
    )
    def run_sample():
        return create_annotation_spec_set.create_annotation_spec_set(PROJECT_ID)

    response = run_sample()

    # For cleanup
    cleaner.append(response.name)

    out, _ = capsys.readouterr()
    assert "The annotation_spec_set resource name:" in out
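
The test wraps the sample call in `backoff.on_exception(backoff.expo, ServerError, max_time=...)`, which retries on server errors with exponentially growing waits until a deadline. The idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the `backoff` package's implementation: `retry_with_expo` and its parameters are hypothetical names, and the sketch omits jitter.

```python
import time


def retry_with_expo(func, exceptions, max_time, base=1.0, factor=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call func() until it succeeds, retrying on `exceptions`.

    Waits base, base*factor, base*factor**2, ... seconds between attempts
    and re-raises the last error once the next wait would exceed max_time.
    Hypothetical sketch; no jitter is applied.
    """
    start = clock()
    delay = base
    while True:
        try:
            return func()
        except exceptions:
            # Give up if sleeping again would pass the overall deadline.
            if clock() - start + delay > max_time:
                raise
            sleep(delay)
            delay *= factor
```

The `sleep` and `clock` parameters are injectable so the retry schedule can be checked without real delays.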
96 changes: 96 additions & 0 deletions datalabeling/snippets/create_instruction.py
@@ -0,0 +1,96 @@
#!/usr/bin/env python

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import os

from google.api_core.client_options import ClientOptions


# [START datalabeling_create_instruction_beta]
def create_instruction(project_id, data_type, instruction_gcs_uri):
    """Creates a data labeling PDF instruction for the given Google Cloud
    project. The PDF file should be uploaded to the project in
    Google Cloud Storage.
    """
    from google.cloud import datalabeling_v1beta1 as datalabeling

    client = datalabeling.DataLabelingServiceClient()
    # [END datalabeling_create_instruction_beta]
    # If a test endpoint is provided, use it. This prevents tests on
    # this snippet from triggering any action by a real human.
    if "DATALABELING_ENDPOINT" in os.environ:
        opts = ClientOptions(api_endpoint=os.getenv("DATALABELING_ENDPOINT"))
        client = datalabeling.DataLabelingServiceClient(client_options=opts)
    # [START datalabeling_create_instruction_beta]

    project_path = f"projects/{project_id}"

    pdf_instruction = datalabeling.PdfInstruction(gcs_file_uri=instruction_gcs_uri)

    instruction = datalabeling.Instruction(
        display_name="YOUR_INSTRUCTION_DISPLAY_NAME",
        description="YOUR_DESCRIPTION",
        data_type=data_type,
        pdf_instruction=pdf_instruction,
    )

    operation = client.create_instruction(
        request={"parent": project_path, "instruction": instruction}
    )

    result = operation.result()

    # The format of the resource name:
    # projects/{project_id}/instructions/{instruction_id}
    print("The instruction resource name: {}".format(result.name))
    print("Display name: {}".format(result.display_name))
    print("Description: {}".format(result.description))
    print("Create time:")
    print("\tseconds: {}".format(result.create_time.timestamp_pb().seconds))
    print("\tnanos: {}".format(result.create_time.timestamp_pb().nanos))
    print("Data type: {}".format(datalabeling.DataType(result.data_type).name))
    print("Pdf instruction:")
    print("\tGcs file uri: {}\n".format(result.pdf_instruction.gcs_file_uri))

    return result


# [END datalabeling_create_instruction_beta]


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument("--project-id", help="Project ID. Required.", required=True)

    parser.add_argument(
        "--data-type",
        help="Data type. Only IMAGE, VIDEO, TEXT and AUDIO are supported. Required.",
        required=True,
    )

    parser.add_argument(
        "--instruction-gcs-uri",
        help="The Google Cloud Storage URI of the instruction PDF. Required.",
        required=True,
    )

    args = parser.parse_args()

    create_instruction(args.project_id, args.data_type, args.instruction_gcs_uri)
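
The `--data-type` help text lists the supported values, but the parser above accepts any string. One way to reject bad input before any API call is to let `argparse` validate it. The following is a minimal sketch, not part of the commit: `DATA_TYPES`, `gcs_uri`, and `build_parser` are hypothetical names, the allowed values are taken from the help text, and the `gs://bucket/object` check is an assumption about what a usable URI looks like.

```python
import argparse

# Values taken from the --data-type help text in the sample.
DATA_TYPES = ("IMAGE", "VIDEO", "TEXT", "AUDIO")


def gcs_uri(value):
    """argparse type: accept only URIs shaped like gs://bucket/object."""
    bucket, _, obj = value[len("gs://"):].partition("/")
    if not value.startswith("gs://") or not bucket or not obj:
        raise argparse.ArgumentTypeError(
            f"expected gs://bucket/object, got {value!r}"
        )
    return value


def build_parser():
    """Build a parser that validates arguments up front (hypothetical)."""
    parser = argparse.ArgumentParser(description="Create a PDF instruction.")
    parser.add_argument("--project-id", required=True, help="Project ID.")
    parser.add_argument(
        "--data-type",
        required=True,
        type=str.upper,  # normalize case before the choices check
        choices=DATA_TYPES,
        help="One of IMAGE, VIDEO, TEXT, AUDIO.",
    )
    parser.add_argument(
        "--instruction-gcs-uri",
        required=True,
        type=gcs_uri,
        help="Cloud Storage URI of the instruction PDF.",
    )
    return parser
```

With this parser, an unsupported data type or a malformed URI makes `parse_args` exit with a usage error instead of reaching the service.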