Equality exclude columns #829

Closed
wants to merge 6 commits into from
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,10 @@
--->

# Unreleased

## Enhancements
- Added an optional `exclude_columns` argument to the `equality` test. ([#828](https://github.com/dbt-labs/dbt-utils/issues/828))

## Fixes
- deduplicate macro for Databricks now uses the QUALIFY clause, which fixes NULL columns issues from the default natural join logic
- deduplicate macro for Redshift now uses the QUALIFY clause, which fixes NULL columns issues from the default natural join logic
33 changes: 30 additions & 3 deletions README.md
@@ -114,15 +114,20 @@ This test supports the `group_by_columns` parameter; see [Grouping in tests](#gr

### equality ([source](macros/generic_tests/equality.sql))

-Asserts the equality of two relations. Optionally specify a subset of columns to compare.
+Asserts the equality of two relations. Optionally specify a subset of columns to compare or to exclude.

**Usage:**

```yaml
version: 2

models:
  - name: model_name
    tests:
      - dbt_utils.equality:
          compare_model: ref('other_table_name')
```

**With `compare_columns` (optional):**
```yaml
tests:
- dbt_utils.equality:
compare_model: ref('other_table_name')
@@ -131,6 +136,28 @@ models:
- second_column
```

*Note:* The compare columns are case-insensitive (input uppercase or lowercase and it will work!).
If your adapter is Snowflake and the columns of your model are quoted (bad idea!), you should quote the compare columns like so:
```yaml
compare_columns:
  - '"first_column"'
  - '"second_column"'
```
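
The quoting note above follows from how Snowflake resolves identifiers: unquoted names are folded to uppercase, while quoted names keep their exact case. Below is a minimal sketch of that rule. The function `snowflake_resolve` is hypothetical and is only a simplified model of the database's behavior, not part of dbt or dbt-utils.

```python
def snowflake_resolve(identifier: str) -> str:
    """Return the name Snowflake would look up for an identifier written in SQL.

    Simplified model (an assumption for illustration, not a dbt API).
    """
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        # Quoted identifier: case is preserved; a doubled quote is an escaped quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted identifier: folded to uppercase.
    return identifier.upper()


# A column that was created with quotes is stored exactly as written.
stored_name = "first_column"

unquoted_hit = snowflake_resolve("first_column") == stored_name
quoted_hit = snowflake_resolve('"first_column"') == stored_name

print(unquoted_hit)  # False: the unquoted reference resolves to FIRST_COLUMN
print(quoted_hit)    # True: the quoted reference keeps the lowercase name
```

This is why a lowercase quoted column has to be listed as `'"first_column"'` in `compare_columns`.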


**With `exclude_columns` (optional):**
```yaml
tests:
  - dbt_utils.equality:
      compare_model: ref('other_table_name')
      exclude_columns:
        - third_column
        - fourth_column
```

*Note:* The exclude columns are case-insensitive (input uppercase or lowercase and it will work!).


### expression_is_true ([source](macros/generic_tests/expression_is_true.sql))

Asserts that a valid SQL expression is true for all records. This is useful when checking integrity across columns.
8 changes: 8 additions & 0 deletions integration_tests/models/generic_tests/schema.yml
@@ -192,6 +192,14 @@ models:
- last_name
- email

  - name: test_equal_column_exclude
    tests:
      - dbt_utils.equality:
          compare_model: ref('data_people')
          exclude_columns:
            - first_name
            - last_name

  - name: test_fewer_rows_than
    tests:
      - dbt_utils.fewer_rows_than:
@@ -0,0 +1,13 @@
{{ config(materialized='table') }}

select

id,
'incorrect_name' as first_name,
last_name,
email,
ip_address,
created_at,
is_active

from {{ ref('data_people') }}
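
The fixture model above (its file name is not shown in this diff view) differs from `data_people` only in `first_name`, which the new schema test excludes. The following sketch shows why the test should pass with the exclusion and fail without it. The sample rows and the helpers `project` and `relations_equal` are hypothetical. Comparing rows as multisets is a simplification, because the macro's comparison SQL is not fully shown here.

```python
from collections import Counter


def project(rows, exclude_columns):
    """Drop excluded columns (case-insensitively) and return rows as hashable tuples."""
    excluded = {name.lower() for name in exclude_columns}
    return [
        tuple(sorted((col, val) for col, val in row.items() if col.lower() not in excluded))
        for row in rows
    ]


def relations_equal(rows_a, rows_b, exclude_columns=()):
    """Compare two relations row by row, ignoring order and excluded columns."""
    return Counter(project(rows_a, exclude_columns)) == Counter(project(rows_b, exclude_columns))


# Hypothetical sample rows; the real seed has more columns and rows.
data_people = [
    {"id": 1, "first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"},
    {"id": 2, "first_name": "Alan", "last_name": "Turing", "email": "alan@example.com"},
]

# Mirrors the fixture: every first_name is replaced by a constant.
model_rows = [{**row, "first_name": "incorrect_name"} for row in data_people]

full_match = relations_equal(model_rows, data_people)
excluded_match = relations_equal(model_rows, data_people, ["first_name", "last_name"])

print(full_match)      # False: first_name differs in every row
print(excluded_match)  # True: the differing column is excluded
```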
31 changes: 25 additions & 6 deletions macros/generic_tests/equality.sql
@@ -1,8 +1,8 @@
-{% test equality(model, compare_model, compare_columns=None) %}
-  {{ return(adapter.dispatch('test_equality', 'dbt_utils')(model, compare_model, compare_columns)) }}
+{% test equality(model, compare_model, compare_columns=None, exclude_columns=None) %}
+  {{ return(adapter.dispatch('test_equality', 'dbt_utils')(model, compare_model, compare_columns, exclude_columns)) }}
 {% endtest %}

-{% macro default__test_equality(model, compare_model, compare_columns=None) %}
+{% macro default__test_equality(model, compare_model, compare_columns=None, exclude_columns=None) %}

{% set set_diff %}
count(*) + coalesce(abs(
@@ -27,12 +27,31 @@ If the compare_cols arg is provided, we can run this test without querying the
information schema — this allows the model to be an ephemeral model
-#}

-{%- if not compare_columns -%}
+{%- if compare_columns -%}
+    {%- set should_quote = False -%}
+{%- else -%}
     {%- do dbt_utils._is_ephemeral(model, 'test_equality') -%}
-    {%- set compare_columns = adapter.get_columns_in_relation(model) | map(attribute='quoted') -%}
+    {%- set compare_columns = adapter.get_columns_in_relation(model) | map(attribute='name') | list -%}
+    {%- set should_quote = True -%}
 {%- endif -%}

-{% set compare_cols_csv = compare_columns | join(', ') %}
+{%- if exclude_columns -%}
+    {%- set exclude_columns_lower = exclude_columns | map('lower') | list -%}
+    {%- set compare_columns_final = [] -%}
+    {%- for column_name in compare_columns -%}
+        {%- if column_name | lower not in exclude_columns_lower -%}
+            {%- do compare_columns_final.append(column_name) -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {%- set compare_columns = compare_columns_final -%}
+{%- endif -%}
+
+{% if should_quote %}
+    {% set compare_cols_csv = get_quoted_csv(compare_columns) %}
+{% else %}
+    {% set compare_cols_csv = compare_columns | join(', ') %}
+{% endif %}
+

with a as (

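
The column-resolution logic in the macro hunk above can be sketched in plain Python to make the branching easier to follow. This is an illustration under stated assumptions, not the macro itself. The functions `resolve_compare_columns` and `to_csv` are hypothetical, `relation_columns` stands in for the names returned by `adapter.get_columns_in_relation(model)`, and wrapping names in double quotes is only an assumed stand-in for adapter-specific quoting.

```python
def resolve_compare_columns(compare_columns, relation_columns, exclude_columns=None):
    """Return (columns, should_quote), following the branching in the diff."""
    if compare_columns:
        # User-supplied names are used as written and are not quoted later.
        columns, should_quote = list(compare_columns), False
    else:
        # Names introspected from the relation are quoted later.
        columns, should_quote = list(relation_columns), True

    if exclude_columns:
        excluded = [name.lower() for name in exclude_columns]
        columns = [name for name in columns if name.lower() not in excluded]

    return columns, should_quote


def to_csv(columns, should_quote):
    """Join column names; double quotes are an assumed quoting style."""
    if should_quote:
        return ", ".join(f'"{name}"' for name in columns)
    return ", ".join(columns)


relation_columns = ["id", "first_name", "last_name", "email"]

# No compare_columns: introspect, exclude case-insensitively, then quote.
introspected_csv = to_csv(
    *resolve_compare_columns(None, relation_columns, ["FIRST_NAME", "last_name"])
)

# Explicit compare_columns: names pass through unquoted.
explicit_csv = to_csv(
    *resolve_compare_columns(['"first_column"', "second_column"], relation_columns, ["first_column"])
)

print(introspected_csv)  # "id", "email"
print(explicit_csv)      # "first_column", second_column
```

In this sketch, an exclude entry of `first_column` does not match a user-supplied `'"first_column"'`, because the comparison is on the lowercased string including its quotes. Whether that matters in practice depends on how callers combine the two arguments.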