Add samples for Data Catalog lookup_entry #2148

Merged
121 changes: 121 additions & 0 deletions datacatalog/cloud-client/README.rst
@@ -0,0 +1,121 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Data Catalog Python Samples
===============================================================================

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=datacatalog/cloud-client/README.rst


This directory contains samples for Google Cloud Data Catalog. `Google Cloud Data Catalog`_ is a fully managed and scalable metadata management service that empowers organizations to quickly discover, manage, and understand all their data in Google Cloud.




.. _Google Cloud Data Catalog: https://cloud.google.com/data-catalog/docs

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

This sample requires you to have authentication set up. Refer to the
`Authentication Getting Started Guide`_ for instructions on setting up
credentials for applications.

.. _Authentication Getting Started Guide:
    https://cloud.google.com/docs/authentication/getting-started

Install Dependencies
++++++++++++++++++++

#. Clone python-docs-samples and change directory to the sample directory you want to use.

   .. code-block:: bash

      $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions.

   .. _Python Development Environment Setup Guide:
       https://cloud.google.com/python/setup

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

      $ virtualenv env
      $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

      $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Lookup entry
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. image:: https://gstatic.com/cloudssh/images/open-btn.png
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=datacatalog/cloud-client/lookup_entry.py,datacatalog/cloud-client/README.rst




To run this sample:

.. code-block:: bash

    $ python lookup_entry.py

    usage: lookup_entry.py [-h]
                           project_id
                           {bigquery-dataset,bigquery-table,pubsub-topic} ...

    This application demonstrates how to perform basic operations on entries
    with the Cloud Data Catalog API.

    For more information, see the README.md under /datacatalog and the
    documentation at https://cloud.google.com/data-catalog/docs.

    positional arguments:
      project_id            Your Google Cloud project ID
      {bigquery-dataset,bigquery-table,pubsub-topic}
        bigquery-dataset    Retrieves Data Catalog entry for the given BigQuery
                            Dataset.
        bigquery-table      Retrieves Data Catalog entry for the given BigQuery
                            Table.
        pubsub-topic        Retrieves Data Catalog entry for the given Pub/Sub
                            Topic.

    optional arguments:
      -h, --help            show this help message and exit





The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
23 changes: 23 additions & 0 deletions datacatalog/cloud-client/README.rst.in
@@ -0,0 +1,23 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Data Catalog
  short_name: Data Catalog
  url: https://cloud.google.com/data-catalog/docs
  description: >
    `Google Cloud Data Catalog`_ is a fully managed and scalable metadata
    management service that empowers organizations to quickly discover, manage,
    and understand all their data in Google Cloud.

setup:
- auth
- install_deps

samples:
- name: Lookup entry
  file: lookup_entry.py
  show_help: true

cloud_client_library: true

folder: datacatalog/cloud-client
150 changes: 150 additions & 0 deletions datacatalog/cloud-client/lookup_entry.py
@@ -0,0 +1,150 @@
#!/usr/bin/env python

# Copyright 2019 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""This application demonstrates how to perform basic operations on entries
with the Cloud Data Catalog API.

For more information, see the README.md under /datacatalog and the
documentation at https://cloud.google.com/data-catalog/docs.
"""

import argparse


def lookup_bigquery_dataset(project_id, dataset_id):
    """Retrieves Data Catalog entry for the given BigQuery Dataset."""
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    resource_name = '//bigquery.googleapis.com/projects/{}/datasets/{}'\
        .format(project_id, dataset_id)

    return datacatalog.lookup_entry(linked_resource=resource_name)


def lookup_bigquery_dataset_sql_resource(project_id, dataset_id):
    """Retrieves Data Catalog entry for the given BigQuery Dataset by
    sql_resource.
    """
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    sql_resource = 'bigquery.dataset.`{}`.`{}`'.format(project_id, dataset_id)

    return datacatalog.lookup_entry(sql_resource=sql_resource)


def lookup_bigquery_table(project_id, dataset_id, table_id):
    """Retrieves Data Catalog entry for the given BigQuery Table."""
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    resource_name = '//bigquery.googleapis.com/projects/{}/datasets/{}' \
                    '/tables/{}'\
        .format(project_id, dataset_id, table_id)

    return datacatalog.lookup_entry(linked_resource=resource_name)


def lookup_bigquery_table_sql_resource(project_id, dataset_id, table_id):
    """Retrieves Data Catalog entry for the given BigQuery Table by
    sql_resource.
    """
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    sql_resource = 'bigquery.table.`{}`.`{}`.`{}`'.format(
        project_id, dataset_id, table_id)

    return datacatalog.lookup_entry(sql_resource=sql_resource)


def lookup_pubsub_topic(project_id, topic_id):
    """Retrieves Data Catalog entry for the given Pub/Sub Topic."""
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    resource_name = '//pubsub.googleapis.com/projects/{}/topics/{}'\
        .format(project_id, topic_id)

    return datacatalog.lookup_entry(linked_resource=resource_name)


def lookup_pubsub_topic_sql_resource(project_id, topic_id):
    """Retrieves Data Catalog entry for the given Pub/Sub Topic by
    sql_resource.
    """
    from google.cloud import datacatalog_v1beta1

    datacatalog = datacatalog_v1beta1.DataCatalogClient()

    sql_resource = 'pubsub.topic.`{}`.`{}`'.format(project_id, topic_id)

    return datacatalog.lookup_entry(sql_resource=sql_resource)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument('project_id', help='Your Google Cloud project ID')

    subparsers = parser.add_subparsers(dest='command')
    # On Python 3 a subcommand is optional by default. Require one so that a
    # lookup always runs and `entry` is never None when it is printed below.
    subparsers.required = True

    bigquery_dataset_parser = subparsers.add_parser(
        'bigquery-dataset', help=lookup_bigquery_dataset.__doc__)
    bigquery_dataset_parser.add_argument('dataset_id')
    bigquery_dataset_parser.add_argument('--sql-resource', action='store_true',
                                         help='Perform lookup by SQL Resource')

    bigquery_table_parser = subparsers.add_parser(
        'bigquery-table', help=lookup_bigquery_table.__doc__)
    bigquery_table_parser.add_argument('dataset_id')
    bigquery_table_parser.add_argument('table_id')
    bigquery_table_parser.add_argument('--sql-resource', action='store_true',
                                       help='Perform lookup by SQL Resource')

    pubsub_topic_parser = subparsers.add_parser(
        'pubsub-topic', help=lookup_pubsub_topic.__doc__)
    pubsub_topic_parser.add_argument('topic_id')
    pubsub_topic_parser.add_argument('--sql-resource', action='store_true',
                                     help='Perform lookup by SQL Resource')

    args = parser.parse_args()

    entry = None

    # Pick the linked_resource or sql_resource variant of each lookup.
    if args.command == 'bigquery-dataset':
        lookup_method = lookup_bigquery_dataset_sql_resource \
            if args.sql_resource else lookup_bigquery_dataset
        entry = lookup_method(args.project_id, args.dataset_id)
    elif args.command == 'bigquery-table':
        lookup_method = lookup_bigquery_table_sql_resource \
            if args.sql_resource else lookup_bigquery_table
        entry = lookup_method(args.project_id, args.dataset_id, args.table_id)
    elif args.command == 'pubsub-topic':
        lookup_method = lookup_pubsub_topic_sql_resource \
            if args.sql_resource else lookup_pubsub_topic
        entry = lookup_method(args.project_id, args.topic_id)

    print(entry.name)
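
The lookup functions in `lookup_entry.py` address a resource in one of two ways: a `linked_resource` name (a full resource path beginning with `//bigquery.googleapis.com/` or `//pubsub.googleapis.com/`) or a `sql_resource` name (a dotted identifier with backtick-quoted parts). As a minimal sketch, the two formats can be isolated into pure helpers and inspected without calling the API. The helper names below are hypothetical and are not part of the sample or the client library; only the format strings are taken from the sample.

```python
# Hypothetical helpers (not part of lookup_entry.py). They only build the
# name strings that the sample passes to DataCatalogClient.lookup_entry().


def bigquery_table_linked_resource(project_id, dataset_id, table_id):
    """Full resource name, as the sample passes via linked_resource=..."""
    return ('//bigquery.googleapis.com/projects/{}/datasets/{}'
            '/tables/{}').format(project_id, dataset_id, table_id)


def bigquery_table_sql_resource(project_id, dataset_id, table_id):
    """SQL-style name, as the sample passes via sql_resource=..."""
    return 'bigquery.table.`{}`.`{}`.`{}`'.format(
        project_id, dataset_id, table_id)


def pubsub_topic_linked_resource(project_id, topic_id):
    """Full resource name for a Pub/Sub topic."""
    return '//pubsub.googleapis.com/projects/{}/topics/{}'.format(
        project_id, topic_id)


def pubsub_topic_sql_resource(project_id, topic_id):
    """SQL-style name for a Pub/Sub topic."""
    return 'pubsub.topic.`{}`.`{}`'.format(project_id, topic_id)


if __name__ == '__main__':
    # Print both forms for the public table used by the sample's tests.
    args = ('bigquery-public-data', 'new_york_taxi_trips', 'taxi_zone_geom')
    print(bigquery_table_linked_resource(*args))
    print(bigquery_table_sql_resource(*args))
```

Keeping the string construction separate from the API call makes it possible to check the name format offline; the sample itself builds the string and calls `lookup_entry` in the same function.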
53 changes: 53 additions & 0 deletions datacatalog/cloud-client/lookup_entry_test.py
@@ -0,0 +1,53 @@
#!/usr/bin/env python

# Copyright 2019 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import lookup_entry

BIGQUERY_PROJECT = 'bigquery-public-data'
BIGQUERY_DATASET = 'new_york_taxi_trips'
BIGQUERY_TABLE = 'taxi_zone_geom'

PUBSUB_PROJECT = 'pubsub-public-data'
PUBSUB_TOPIC = 'taxirides-realtime'


def test_lookup_bigquery_dataset():
    assert lookup_entry.lookup_bigquery_dataset(
        BIGQUERY_PROJECT, BIGQUERY_DATASET)


def test_lookup_bigquery_dataset_sql_resource():
    assert lookup_entry.lookup_bigquery_dataset_sql_resource(
        BIGQUERY_PROJECT, BIGQUERY_DATASET)


def test_lookup_bigquery_table():
    assert lookup_entry.lookup_bigquery_table(
        BIGQUERY_PROJECT, BIGQUERY_DATASET, BIGQUERY_TABLE)


def test_lookup_bigquery_table_sql_resource():
    assert lookup_entry.lookup_bigquery_table_sql_resource(
        BIGQUERY_PROJECT, BIGQUERY_DATASET, BIGQUERY_TABLE)


def test_lookup_pubsub_topic():
    assert lookup_entry.lookup_pubsub_topic(PUBSUB_PROJECT, PUBSUB_TOPIC)


def test_lookup_pubsub_topic_sql_resource():
    assert lookup_entry.lookup_pubsub_topic_sql_resource(
        PUBSUB_PROJECT, PUBSUB_TOPIC)
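
The tests in `lookup_entry_test.py` call the live Data Catalog API against public datasets, so they need credentials and network access. As a hedged sketch of an offline complement, the lookup logic could accept the client as a parameter and be exercised with `unittest.mock.Mock` (Python 3 only). The function `lookup_pubsub_topic_with` and its `client` parameter are assumptions introduced for illustration; the sample constructs `DataCatalogClient` inside each function and has no such variant.

```python
from unittest import mock


def lookup_pubsub_topic_with(client, project_id, topic_id):
    """Hypothetical variant of lookup_pubsub_topic that takes the client."""
    resource_name = '//pubsub.googleapis.com/projects/{}/topics/{}'.format(
        project_id, topic_id)
    return client.lookup_entry(linked_resource=resource_name)


def test_lookup_pubsub_topic_offline():
    # A Mock stands in for DataCatalogClient, so no request is sent.
    fake_client = mock.Mock()
    fake_client.lookup_entry.return_value = mock.sentinel.entry

    entry = lookup_pubsub_topic_with(
        fake_client, 'pubsub-public-data', 'taxirides-realtime')

    # The stubbed entry is returned unchanged, and the client was called
    # exactly once with the expected linked_resource name.
    assert entry is mock.sentinel.entry
    fake_client.lookup_entry.assert_called_once_with(
        linked_resource='//pubsub.googleapis.com/projects/'
                        'pubsub-public-data/topics/taxirides-realtime')
```

A test like this only checks how the resource name is built and passed; it does not replace the live tests, which confirm that the entries actually exist in Data Catalog.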
1 change: 1 addition & 0 deletions datacatalog/cloud-client/requirements.txt
@@ -0,0 +1 @@
google-cloud-datacatalog==0.1.0