Add optional where clause to get_column_values #583

Merged (10 commits) on May 12, 2022

@epapineau epapineau commented May 11, 2022

Closes #511 closes #582

This is a:

  • documentation update
  • bug fix with no breaking changes
  • new functionality
  • a breaking change

All pull requests from community contributors should target the main branch (default).

Description & motivation

Closes #511 and the redundant #582. Adds an optional where parameter to the get_column_values macro so that the returned column values can be filtered.

Checklist

  • I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • BigQuery
    • Postgres
    • Redshift
    • Snowflake
  • I followed guidelines to ensure that my changes will work on "non-core" adapters by:
    • dispatching any new macro(s) so non-core adapters can also use them (e.g. the star() source)
    • using the limit_zero() macro in place of the literal string: limit 0
    • using dbt_utils.type_* macros instead of explicit datatypes (e.g. dbt_utils.type_timestamp() instead of TIMESTAMP)
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@epapineau epapineau requested review from joellabes and dbeatty10 and removed request for joellabes May 11, 2022 19:40

@dbeatty10 dbeatty10 left a comment

Thank you for this contribution, @epapineau !

Left a few small comments. I'd love to do two things:

  1. Move the where parameter to the end to avoid any possibility of breaking changes for anyone
  2. Add an integration test to verify that it indeed filters things as expected

Let me know if I can help with either!
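
To illustrate the first point, here is a minimal Python sketch of why appending a new optional parameter is safer than inserting it ahead of existing ones. The two functions are hypothetical stand-ins, not the macro itself; `order_by` is an assumed pre-existing optional parameter chosen for illustration. The same positional-argument reasoning applies to Jinja macros.

```python
# Hypothetical stand-ins for a macro signature. Only `where` and the idea of
# moving it to the end come from the review; `order_by` is an assumed
# pre-existing optional parameter used for illustration.


def lookup_where_in_middle(table, column, where=None, order_by="count(*) desc"):
    """New parameter inserted ahead of an existing optional one."""
    return {"where": where, "order_by": order_by}


def lookup_where_at_end(table, column, order_by="count(*) desc", where=None):
    """New parameter appended after every existing one."""
    return {"where": where, "order_by": order_by}


# A call written before `where` existed, passing the ordering positionally.
legacy_args = ("my_table", "my_column", "my_column asc")

broken = lookup_where_in_middle(*legacy_args)
safe = lookup_where_at_end(*legacy_args)

print(broken)  # the ordering expression is silently treated as the filter
print(safe)    # the ordering expression still reaches `order_by`
```

Callers that always pass keyword arguments are unaffected either way; the risk is limited to existing positional calls.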

@epapineau

@dbeatty10 I made both requested updates. Thanks again for your help!

@epapineau epapineau requested a review from dbeatty10 May 12, 2022 00:10

@dbeatty10 dbeatty10 left a comment

The integration test looks great! 🏅

Two small things to adjust in the changelog. Other than that, this looks good to go.


@dbeatty10 dbeatty10 left a comment

Nice work, Elise! ⭐

@dbeatty10 dbeatty10 merged commit e09fa7d into main May 12, 2022
@edmuthiah

Hi @epapineau @dbeatty10

Thanks for your contribution to this issue! An awesome addition.

I was just wondering how I could use the where parameter. I'm looking to get values from column A given column B is equal to a particular value.

Here's my current macro based on your merge:

{% macro get_green_columns(input_table) %}
{% set input_table=input_table %}
{% set green_columns=dbt_utils.get_column_values(table=ref('ref_all_green_columns'), column='column_name', where="table_name = '"~input_table~"'") %}
{{ return(green_columns) }}
{% endmacro %}

Thanks for the help 😀

@dbeatty10

Hi @ed-muthiah 👋 Did the example you posted not work for you? It appeared to work as-is for me; see below.

I created a seed named seeds/ref_all_green_columns.csv with this content:

id,table_name,column_name
1,green,Q
2,green,R
3,green,S
4,blue,S
5,blue,T
6,blue,U

And copied and pasted your macro into macros/green.sql:

{% macro get_green_columns(input_table) %}
{% set input_table=input_table %}
{% set green_columns=dbt_utils.get_column_values(table=ref('ref_all_green_columns'), column='column_name', where="table_name = '"~input_table~"'") %}
{{ return(green_columns) }}
{% endmacro %}

And then I created a file named analyses/test_green.sql:

{{ get_green_columns("green") }}

Seeding and compiling using a terminal:

dbt seed
dbt compile

This yielded the following text in target/compiled/YOUR_PROJECT_NAME_HERE/analyses/test_green.sql:

['Q', 'R', 'S']
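
For readers without a dbt project at hand, the filtering behaviour can be approximated with a minimal Python sketch using the standard-library `sqlite3` module. This is a simulation under stated assumptions, not the macro's actual implementation: the query shape (select, where, group by) is assumed, the Python function `get_green_columns` is a hypothetical stand-in for the Jinja macro, and results are ordered alphabetically here only to make the output deterministic.

```python
import sqlite3

# Same rows as seeds/ref_all_green_columns.csv above.
SEED_ROWS = [
    (1, "green", "Q"),
    (2, "green", "R"),
    (3, "green", "S"),
    (4, "blue", "S"),
    (5, "blue", "T"),
    (6, "blue", "U"),
]


def get_green_columns(conn, input_table):
    """Hypothetical Python stand-in for the Jinja macro of the same name."""
    # Mirrors the macro's string concatenation: "table_name = '" ~ input_table ~ "'"
    where = "table_name = '" + input_table + "'"
    # Assumed query shape; the where text is spliced in as raw SQL.
    sql = (
        "select column_name as value "
        "from ref_all_green_columns "
        f"where {where} "
        "group by column_name "
        "order by column_name"  # alphabetical only to keep the output stable
    )
    return [row[0] for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ref_all_green_columns "
    "(id integer, table_name text, column_name text)"
)
conn.executemany("insert into ref_all_green_columns values (?, ?, ?)", SEED_ROWS)

print(get_green_columns(conn, "green"))  # ['Q', 'R', 'S']
print(get_green_columns(conn, "blue"))   # ['S', 'T', 'U']
```

Because the filter is built by string concatenation, this sketch is only suitable for trusted, hard-coded values such as the table names shown here.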

@edmuthiah

Thanks, my issue was just a naming issue! Can confirm that your example above works. It would be great to add it to the readme as an example.

@dbeatty10

@ed-muthiah I created this issue to add it to the readme as an example.

Development

Successfully merging this pull request may close these issues.

Add filter to get_column_values
Add condition parameter to get_column_values
3 participants