Welcome to your new dbt project!

How to run this project

Prerequisites

We will build a project using dbt and a BigQuery database, but any other database of your choice could be used. By this stage of the course you should already have:

  • A running warehouse (BigQuery)
  • A set of running pipelines ingesting the project dataset: Taxi Rides NY dataset

You will need to create a dbt Cloud account using this link and connect to your warehouse following these instructions.

I used a single database, 'production', with one schema, 'dbt_victoria_mola', for local development and another schema, 'master', for production deployment.

Optional: If you feel more comfortable developing locally, you can use a local installation of dbt instead. You can follow the official dbt documentation or use a Docker image from the official dbt repo.

About the project

This project is based on the dbt starter project (generated by running dbt init). Try running the following commands:

  • dbt run
  • dbt test

A project includes the following files:

  • dbt_project.yml: file used to configure the dbt project. If you are using dbt locally, make sure the profile here matches the one set up during installation in ~/.dbt/profiles.yml
  • *.yml files under the folders models, data, macros: documentation files
  • CSV files in the data folder: these will be our sources (the files described above)
  • Files inside the models folder: the SQL files contain the scripts to run our models, covering staging, core, and datamart models. At the end, these models will follow this structure:

image
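For a local installation, the profile check mentioned for dbt_project.yml can be sketched in shell. This is only an illustrative helper, not part of dbt: the function name check_profile is hypothetical, and it assumes the profile is written on a single top-level `profile:` line with no trailing comment and that profiles are unindented keys in profiles.yml.

```shell
# Hypothetical helper (not a dbt command): check that the profile named in
# dbt_project.yml appears as a top-level key in profiles.yml.
# Assumes a single-line `profile:` entry without a trailing comment.
check_profile() {
  local project_file="${1:-dbt_project.yml}"
  local profiles_file="${2:-$HOME/.dbt/profiles.yml}"
  local profile

  # Take the value of the top-level `profile:` key and strip quotes and spaces.
  profile="$(sed -n 's/^profile:[[:space:]]*//p' "$project_file" | tr -d "\"' \r")"
  if [ -z "$profile" ]; then
    echo "no profile key in $project_file" >&2
    return 1
  fi

  # A profile is an unindented key in profiles.yml.
  if grep -q "^${profile}:" "$profiles_file"; then
    echo "profile '$profile' found"
  else
    echo "profile '$profile' missing from $profiles_file" >&2
    return 1
  fi
}
```

Called with no arguments from the project directory, it reads dbt_project.yml and ~/.dbt/profiles.yml. For a fuller check that also tests the warehouse connection, dbt ships the `dbt debug` command.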

Workflow

image

Execution

After installing the required tools and cloning this repo, execute the following commands:

  1. Change into the project's directory from the command line: $ cd [..]/taxi_rides_ny
  2. Load the CSVs into the database. This materializes the CSVs as tables in your target schema: $ dbt seed
  3. Run the models: $ dbt run
  4. Test your data: $ dbt test
     Alternative: use $ dbt build to execute the 3 steps above (seed, run and test) with one command
  5. Generate documentation for the project: $ dbt docs generate
  6. View the documentation for the project. This step starts a local web server and should open the documentation page in your browser; it can also be accessed at http://localhost:8080 : $ dbt docs serve
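The non-interactive steps above (2 to 5) can be chained in a small shell function that stops at the first failure. This is a minimal sketch, not part of the project: the function name run_pipeline and the DBT override variable are assumptions introduced here for illustration, and the function expects to be called from inside the project directory.

```shell
# Sketch: run the dbt steps in order, stopping at the first failing command.
# DBT is a hypothetical override variable (not a dbt feature) that lets you
# swap the executable, e.g. DBT=echo for a dry run that only prints the steps.
run_pipeline() {
  local dbt="${DBT:-dbt}"
  "$dbt" seed &&          # load the CSVs into the target schema
  "$dbt" run &&           # build the models
  "$dbt" test &&          # test the data
  "$dbt" docs generate    # generate the project documentation
}
```

After defining the function, `run_pipeline` executes the steps with the installed dbt, while `DBT=echo run_pipeline` only prints them. `dbt docs serve` is left out on purpose because it keeps running until it is stopped.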

dbt resources:

  • Learn more about dbt in the docs
  • Check out Discourse for commonly asked questions and answers
  • Join the chat on Slack for live discussions and support
  • Find dbt events near you
  • Check out the blog for the latest news on dbt's development and best practices