dpguthrie/dbtc

An unaffiliated Python interface for dbt Cloud APIs


Documentation: https://dbtc.dpguthrie.com

Interactive Demo: https://dbtc-python.streamlit.app/

Source Code: https://github.com/dpguthrie/dbtc

V2 Docs: https://docs.getdbt.com/dbt-cloud/api-v2

V3 Docs: https://docs.getdbt.com/dbt-cloud/api-v3


Overview

dbtc is an unaffiliated Python interface to various dbt Cloud API endpoints.

This library acts as a convenient interface to two different APIs that dbt Cloud offers:

  • Cloud API: A REST API whose endpoints let users programmatically create, read, update, and delete resources within their dbt Cloud account.
  • Metadata API (now called the Discovery API): A GraphQL API that exposes metadata generated from job runs within dbt Cloud.

Requirements

Python 3.8+

  • Requests - The elegant and simple HTTP library for Python, built for human beings.
  • sgqlc - Simple GraphQL Client
  • Typer - Library for building CLI applications

Installation

pip install dbtc

Or better yet, use uv

uv pip install dbtc

Basic Usage

Python

The interfaces to both APIs are located in the dbtCloudClient class.

The example below uses the cloud property on an instance of the dbtCloudClient class to call trigger_job_from_failure, a method that restarts a job from its last point of failure.

from dbtc import dbtCloudClient

# Assumes that DBT_CLOUD_SERVICE_TOKEN env var is set
client = dbtCloudClient()

account_id = 1
job_id = 1
payload = {'cause': 'Restarting from failure'}

run = client.cloud.trigger_job_from_failure(
    account_id,
    job_id,
    payload,
    should_poll=False,
)

# This returns a dictionary containing two keys
run['data']
run['status']
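
The two keys can be checked before anything downstream uses the run. The sketch below is not part of dbtc: extract_run_id is a hypothetical helper, and it assumes that run['status'] is a dictionary carrying an is_success flag and that run['data'] holds the run record with an id key. Both assumptions should be verified against the responses your account actually returns.

```python
from typing import Any, Dict, Optional


def extract_run_id(run: Dict[str, Any]) -> Optional[int]:
    """Return the run ID from a response dict, or None if it is unavailable.

    Hypothetical helper. Assumes run['status'] is a dict with an 'is_success'
    flag and run['data'] is the run record with an 'id' key.
    """
    status = run.get("status") or {}
    data = run.get("data") or {}
    if not status.get("is_success", False):
        return None
    return data.get("id")


# Hand-written sample shaped like the assumed response, not real API output
sample = {"status": {"code": 200, "is_success": True}, "data": {"id": 42}}
print(extract_run_id(sample))
```

Returning None rather than raising keeps the helper usable in scripts that want to decide for themselves how to handle a failed trigger.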

Similarly, use the metadata property to retrieve information from the Discovery API. Here's how you could retrieve all of the metrics for your project.

from dbtc import dbtCloudClient

client = dbtCloudClient()
query = '''
query ($environmentId: BigInt!, $first: Int!) {
  environment(id: $environmentId) {
    definition {
      metrics(first: $first) {
        edges {
          node {
            name
            description
            type
            formula
            filter
            tags
            parents {
              name
              resourceType
            }
          }
        }
      }
    }
  }
}
'''
variables = {'environmentId': 1, 'first': 500}
data = client.metadata.query(query, variables)

# Results are under the edges key; each edge wraps its record in a node key
edges = data['data']['environment']['definition']['metrics']['edges']
for edge in edges:
    # edge['node'] is a dictionary of the requested fields
    node = edge['node']
    node_name = node['name']
    ...
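
Because the response mirrors the shape of the query, the edges and node nesting can be flattened once and reused. The following is a minimal sketch rather than anything dbtc provides: metric_nodes is a hypothetical helper, and the sample response is hand-written to match the query above.

```python
from typing import Any, Dict, List


def metric_nodes(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the edges/node nesting of the metrics query into a list of nodes.

    Hypothetical helper. Missing or null levels yield an empty list instead of
    raising a KeyError or TypeError.
    """
    environment = (response.get("data") or {}).get("environment") or {}
    definition = environment.get("definition") or {}
    metrics = definition.get("metrics") or {}
    edges = metrics.get("edges") or []
    return [edge["node"] for edge in edges if edge.get("node")]


# Hand-written sample shaped like the metrics query, not real API output
sample = {
    "data": {
        "environment": {
            "definition": {
                "metrics": {
                    "edges": [
                        {"node": {"name": "revenue", "tags": []}},
                        {"node": {"name": "orders", "tags": ["core"]}},
                    ]
                }
            }
        }
    }
}

names = [node["name"] for node in metric_nodes(sample)]
print(names)
```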

If you're unfamiliar with the schema or with how to write a GraphQL query, I highly recommend going to the dbt Cloud Discovery API playground. You'll be able to interactively explore the schema while watching it write a GraphQL query for you!

CLI

The CLI example below maps to the Python cloud example above:

dbtc trigger-job-from-failure \
    --account-id 1 \
    --job-id 1 \
    --payload '{"cause": "Restarting from failure"}' \
    --no-should-poll

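Similarly, for the metadata example above (assuming that you've stored the query and variables arguments in shell variables named query and variables). Quote both expansions so that the shell passes each value as a single argument:

dbtc query --query "$query" --variables "$variables"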

If your service token is not set as an environment variable, pass it with --token:

dbtc --token this_is_my_token query --query "$query" --variables "$variables"
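
When the command is assembled from a script, quoting a multi-line query by hand is error-prone. The sketch below builds the same dbtc query invocation in Python and lets shlex handle the quoting. It uses only the flags shown above, and it assumes that --variables accepts a JSON string, as the --payload example suggests. The query is a shortened version of the metrics query from the Python section.

```python
import json
import shlex

# Shortened version of the metrics query shown in the Python section
query = """
query ($environmentId: BigInt!, $first: Int!) {
  environment(id: $environmentId) {
    definition {
      metrics(first: $first) {
        edges { node { name } }
      }
    }
  }
}
"""

# Assumption: the CLI parses --variables as a JSON string
variables = json.dumps({"environmentId": 1, "first": 500})

# Each list item is exactly one argument, whatever whitespace it contains
argv = ["dbtc", "query", "--query", query, "--variables", variables]

# shlex.join (Python 3.8+) produces a shell-safe string for logging or pasting
command = shlex.join(argv)
print(command)
```

Passing argv directly to subprocess.run avoids the shell entirely, so no quoting is needed in that case.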

License

This project is licensed under the terms of the MIT license.