Add option to ignore columns in equality test #737

Closed
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -10,11 +10,12 @@

 # Unreleased
 ## New features
-- XXX ([#XXX](https://github.com/dbt-labs/dbt-utils/issues/XXX), [#XXX](https://github.com/dbt-labs/dbt-utils/pull/XXX))
+- Add option to ignore columns in equality test ([#734](https://github.com/dbt-labs/dbt-utils/issues/734), [#737](https://github.com/dbt-labs/dbt-utils/pull/737))
 ## Fixes
 ## Quality of life
 ## Under the hood
 ## Contributors:
+- [@brunocostalopes](https://github.com/brunocostalopes)

# dbt utils v1.1.0
## What's Changed
@@ -44,7 +45,6 @@
* @dchess made their first contribution in https://github.com/dbt-labs/dbt-utils/pull/748
* @Harmuth94 made their first contribution in https://github.com/dbt-labs/dbt-utils/pull/769


# dbt utils v1.0

## Migration Guide
17 changes: 16 additions & 1 deletion README.md
@@ -114,21 +114,36 @@ This test supports the `group_by_columns` parameter; see [Grouping in tests](#gr

### equality ([source](macros/generic_tests/equality.sql))

-Asserts the equality of two relations. Optionally specify a subset of columns to compare.
+Asserts the equality of two relations. Optionally specify a subset of columns to compare or ignore.

**Usage:**

```yaml
version: 2

models:
  # compare the entire table
  - name: model_name
    tests:
      - dbt_utils.equality:
          compare_model: ref('other_table_name')

  # only compare some of the columns
  - name: model_name_compare_columns
    tests:
      - dbt_utils.equality:
          compare_model: ref('other_table_name')
          compare_columns:
            - first_column
            - second_column

  # compare all columns except the ones on the ignore list
  - name: model_name_ignore_columns
    tests:
      - dbt_utils.equality:
          compare_model: ref('other_table_name')
          ignore_columns:
            - third_column
```
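
The way the two options select columns can be sketched outside of Jinja. The Python below is a minimal illustration, not the macro itself: `resolve_columns` is a hypothetical helper that assumes the model's column list is already known, rejects a call that passes both lists, and matches ignored names case-insensitively.

```python
def resolve_columns(table_columns, compare_columns=None, ignore_columns=None):
    """Return the columns an equality check would compare (hypothetical helper)."""
    if compare_columns and ignore_columns:
        raise ValueError("Provide compare_columns or ignore_columns, not both")

    # Fall back to every column on the model when no explicit list is given.
    columns = list(compare_columns) if compare_columns else list(table_columns)

    if ignore_columns:
        ignored = {name.lower() for name in ignore_columns}
        # Case-insensitive match; original order and casing are preserved.
        columns = [c for c in columns if c.lower() not in ignored]

    return columns


print(resolve_columns(["col_a", "col_b", "col_c"], ignore_columns=["COL_C"]))
# ['col_a', 'col_b']
```

Raising on the ambiguous case keeps the behaviour explicit: a caller has to pick either an allow list or a deny list.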

### expression_is_true ([source](macros/generic_tests/expression_is_true.sql))
4 changes: 4 additions & 0 deletions integration_tests/data/schema_tests/data_test_equality_a.csv
@@ -0,0 +1,4 @@
col_a,col_b,col_c
1,1,3
1,2,1
2,3,3
4 changes: 4 additions & 0 deletions integration_tests/data/schema_tests/data_test_equality_b.csv
@@ -0,0 +1,4 @@
col_a,col_b,col_c
1,1,2
1,2,2
2,3,2
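
The two seeds above differ only in `col_c`, which is what the new tests rely on. As a rough check, the Python sketch below loads the same rows and counts rows found in one seed but not in the other. The helper names and the set-based comparison are assumptions made for illustration; this is not the SQL the macro generates.

```python
import csv
import io

SEED_A = "col_a,col_b,col_c\n1,1,3\n1,2,1\n2,3,3\n"
SEED_B = "col_a,col_b,col_c\n1,1,2\n1,2,2\n2,3,2\n"


def load(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def differing_rows(a, b, columns):
    """Count distinct rows present in one relation but not the other."""
    left = {tuple(row[c] for c in columns) for row in a}
    right = {tuple(row[c] for c in columns) for row in b}
    return len(left - right) + len(right - left)


a, b = load(SEED_A), load(SEED_B)
print(differing_rows(a, b, ["col_a", "col_b", "col_c"]))  # 6
print(differing_rows(a, b, ["col_a", "col_b"]))           # 0
```

With all three columns every row differs, so a plain equality test between the seeds is expected to return failing rows; restricting the comparison to `col_a` and `col_b` leaves nothing to report.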
18 changes: 18 additions & 0 deletions integration_tests/models/generic_tests/schema.yml
@@ -142,6 +142,24 @@ seeds:
          - dbt_utils.not_null_proportion:
              at_least: 0.9

  - name: data_test_equality_a
    tests:
      - dbt_utils.equality:
          compare_model: ref('data_test_equality_a')
      - dbt_utils.equality:
          compare_model: ref('data_test_equality_b')
          error_if: "<1" #sneaky way to ensure that the test is returning failing rows
          warn_if: "<0"
      - dbt_utils.equality:
          compare_model: ref('data_test_equality_b')
          compare_columns:
            - col_a
            - col_b
      - dbt_utils.equality:
          compare_model: ref('data_test_equality_b')
          ignore_columns:
            - col_c

models:
  - name: recency_time_included
    tests:
29 changes: 24 additions & 5 deletions macros/generic_tests/equality.sql
@@ -1,10 +1,14 @@
-{% test equality(model, compare_model, compare_columns=None) %}
-  {{ return(adapter.dispatch('test_equality', 'dbt_utils')(model, compare_model, compare_columns)) }}
+{% test equality(model, compare_model, compare_columns=None, ignore_columns=None) %}
+  {{ return(adapter.dispatch('test_equality', 'dbt_utils')(model, compare_model, compare_columns, ignore_columns)) }}
 {% endtest %}

-{% macro default__test_equality(model, compare_model, compare_columns=None) %}
+{% macro default__test_equality(model, compare_model, compare_columns=None, ignore_columns=None) %}

-{% set set_diff %}
+{%- if compare_columns and ignore_columns -%}
+    {{ exceptions.raise_compiler_error("Both a compare and an ignore list were provided to the `equality` macro. Only one is allowed") }}
+{%- endif -%}
+
+{% set set_diff %}
     count(*) + coalesce(abs(
         sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
         sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
@@ -29,7 +33,22 @@ information schema — this allows the model to be an ephemeral model

 {%- if not compare_columns -%}
     {%- do dbt_utils._is_ephemeral(model, 'test_equality') -%}
-    {%- set compare_columns = adapter.get_columns_in_relation(model) | map(attribute='quoted') -%}
+    {%- set compare_columns = adapter.get_columns_in_relation(model) | map(attribute='name') -%}
 {%- endif -%}
+
+{%- if ignore_columns -%}
+    {#-- Lower case ignore columns for easier comparison --#}
+    {%- set ignore_columns = ignore_columns | map("lower") | list %}
+
+    {%- set include_columns = [] %}
+    {%- for column in compare_columns -%}
+        {%- if column | lower not in ignore_columns -%}
+            {% do include_columns.append(column) %}
+        {%- endif %}
+    {%- endfor %}
+
+    {%- set compare_columns = include_columns %}
+
+{%- endif -%}

 {% set compare_cols_csv = compare_columns | join(', ') %}