dbt test --store-failures #3316

Merged: 2 commits into develop on May 27, 2021

Conversation

jtcohen6 (Contributor) commented on May 4, 2021:

Closes #517
Closes #903
Closes #2593

I also played around with #3265 (print the top 5 failures to stdout, and include in run_results.json). This work here is definitely the prerequisite for that, but we should tackle it separately if we decide it's something we want to do.

Biggest choices

  • This is all turned on via a flag, --store-failures
  • Hard-coded default schema for tests: dbt_test__audit. Normally, this means the schema is my_target_schema_dbt_test__audit; if someone is using generate_schema_name_for_env, it will be just dbt_test__audit.
    • Note: Because of the work in Feature: test config parity #3257, users can override this value by setting a tests: +schema config in dbt_project.yml (see the sketch after this list). So I feel less bad about hard-coding it.
  • The original proposal in Storing test results in the database #2593 (and my first cut of this) would have dbt totally blow away the test-audit schema each time. As I said here previously: "This statelessness is actually desirable, but we should be immensely careful whenever dbt auto-drops anything." Well, that auto-dropping scared me a bit; now, dbt will drop-if-exists and recreate each test audit table, but not the whole schema. It's not an atomic operation, since that's not the point; it's not quite as dramatic as clobbering the whole schema, but it accomplishes our goal.
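
For illustration, a minimal dbt_project.yml sketch of that override (the project name and the test_failures suffix are made-up examples, not anything defined in this PR):

# dbt_project.yml (sketch): replace the hard-coded dbt_test__audit suffix
tests:
  my_project:
    +schema: test_failures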

Other comments

  • I introduced a new node property, relational(), which is the same as refable() but includes tests IFF flags.STORE_FAILURES is turned on. dbt now uses relational() to decide which nodes inform the schemas it will create and when it will write the relation_name property to the manifest.
  • I rewrote the queries defining a few of dbt's built-in generic tests, so that the failing records are more helpful/interesting (see the sketch after this list).
  • A table is not dropped if its test passes, i.e. if it contains zero rows. We could add that pretty easily, if it's something we want. Worth considering:
    • What if a test has a pass/fail condition other than count(*) = 0? It's possible that a passing test may still have rows in the database that a user would want to inspect.
    • Potential bloat to database metadata or block storage. (This isn't additive across invocations, since the test table is dropped and recreated on a subsequent store-failure test, but old/deleted tests would still hang around.)
  • There's no centralized "results" table, as proposed in Storing test results in the database #2593, that contains one row for every test, its status info, and the location of its results table. The question is, do we need this? These days, all of that information is available (and more reliably accessed) from the run_results.json artifact.
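
As an example of the kind of rewrite meant here, a unique test can be expressed so that each failing record is a duplicated value along with its count, rather than a bare row per duplicate (a sketch of the idea, not necessarily the exact macro shipped in this PR; {{ model }} and {{ column_name }} are the standard generic-test arguments):

select
    {{ column_name }} as unique_field,
    count(*) as n_records
from {{ model }}
where {{ column_name }} is not null
group by {{ column_name }}
having count(*) > 1

The example run below shows how stored failures then surface in the log output.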
Running with dbt=0.20.0-b1
Found 1 model, 3 tests, 0 snapshots, 0 analyses, 162 macros, 0 operations, 0 seed files, 1 source, 0 exposures

22:23:09 | Concurrency: 8 threads (target='bigquery')
22:23:09 |
22:23:09 | 1 of 3 START test my_custom_bespoke.................................. [RUN]
22:23:09 | 2 of 3 START test not_null_my_model_fun.............................. [RUN]
22:23:09 | 3 of 3 START test unique_my_model_fun................................ [RUN]
22:23:14 | 1 of 3 WARN 1 my_custom_bespoke...................................... [WARN 1 in 4.69s]
22:23:14 | 3 of 3 FAIL 1 unique_my_model_fun.................................... [FAIL 1 in 4.91s]
22:23:14 | 2 of 3 PASS not_null_my_model_fun.................................... [PASS in 4.98s]
22:23:14 |
22:23:14 | Finished running 3 tests in 10.98s.

Completed with 1 error and 1 warning:

Failure in test unique_my_model_fun (models/schema.yml)
  Got 1 result, expected 0.

  compiled SQL at target/run/my_project/models/schema.yml/schema_test/unique_my_model_fun.sql

  See test failures:
  --------------------------------------------------------------------------------
  select * from `dbt-test-env`.`dbt_jcohen___dbt_test_audit`.`unique_my_model_fun`
  --------------------------------------------------------------------------------

Warning in test my_custom_bespoke (tests/my_custom_bespoke.sql)
  Got 1 result, expected 0.

  compiled SQL at target/run/my_project/tests/my_custom_bespoke.sql

  See test failures:
  ------------------------------------------------------------------------------
  select * from `dbt-test-env`.`dbt_jcohen___dbt_test_audit`.`my_custom_bespoke`
  ------------------------------------------------------------------------------

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

cla-bot added the cla:yes label on May 4, 2021
jtcohen6 changed the title from "POC: dbt test --store-failures" to "dbt test --store-failures" on May 8, 2021
jtcohen6 marked this pull request as ready for review on May 8, 2021
jtcohen6 requested a review from drewbanin on May 10, 2021
drewbanin (Contributor) left a review comment:

Some comments dropped inline! Nice work @jtcohen6!

@@ -64,11 +65,12 @@ def reset():
PARTIAL_PARSE = False
MP_CONTEXT = _get_context()
USE_COLORS = True
STORE_FAILURES = False
Contributor:
this definitely feels like the most expedient approach, but by making this a global flag, we lose the ability to specify something like this as a model-level config. Do you think there's a path to implementing this more like the full_refresh config?

jtcohen6 (author):
definitely!
{% endif %}

{% call statement(auto_begin=True) %}
  {{ create_table_as(False, target_relation, sql) }}
Contributor:
this is 🔥

@@ -49,6 +50,10 @@ def documentable(cls) -> List['NodeType']:
            cls.Exposure
        ]

    @classmethod
    def relational(cls) -> List['NodeType']:
        return cls.refable() + ([cls.Test] if flags.STORE_FAILURES else [])
Contributor:
this is... not ideal... I think. Is there a way that we can push this logic around flags.STORE_FAILURES into the caller? Something about changing the behavior of this class method based on a global flag feels pretty fraught.

jtcohen6 (author) replied May 16, 2021:
Good call. I removed relational as a class method and instead implemented this logic as a should_store_failures boolean property of parsed nodes.

Unfortunately, it duplicates logic in the should_store_failures() Jinja macro, which is what the test materialization uses, and which takes after the should_full_refresh() macro (per your comment above).

I don't know a good way around this:

  • I see merit in making the logic really explicit to users, and allowing them to override it if they'd like
  • We need this property accessible in python, for deciding which schemas need to be created as part of the task
  • The places in the codebase where we call Jinja macros from python are... not ideal...

So, if a user overrides should_store_failures() (Jinja) with custom logic, such as adding some target-specific logic, they will successfully alter the behavior of the materialization / the mutative queries that dbt runs. But dbt may create an extra schema that it doesn't strictly have to. It's a drawback I'm willing to live with.
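
For reference, a should_store_failures() macro patterned on should_full_refresh() would look roughly like this (a sketch based on the behavior described above, not necessarily the exact macro in this PR):

{% macro should_store_failures() %}
  {# per-test config wins; otherwise fall back to the global --store-failures flag #}
  {% set config_store_failures = config.get('store_failures') %}
  {% if config_store_failures is none %}
    {% set config_store_failures = flags.STORE_FAILURES %}
  {% endif %}
  {% do return(config_store_failures) %}
{% endmacro %}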

cutoff = 30
test_full_name = '{}_{}'.format(test_type, test_name)
if len(test_full_name) <= cutoff:
    test_prenom = test_full_name
Contributor:
let's uh... use English... for variable names :)

jtcohen6 (author):
heh. I decided to reimplement this logic anyway, since it was getting a little confusing.

@@ -231,6 +242,10 @@ def __init__(
        self.compiled_name: str = compiled_name
        self.fqn_name: str = fqn_name

        # use hashed name as alias if too long
        if compiled_name != fqn_name:
            self.modifiers['alias'] = compiled_name
Contributor:
really clever & really cool!

@@ -524,8 +524,6 @@ def create_schema(relation: BaseRelation) -> None:

db_schema = (db_lower, schema.lower())
if db_schema not in existing_schemas_lowered:
    existing_schemas_lowered.add(db_schema)
Contributor:
is there a reason why this disappeared?

jtcohen6 (author):
nope :) thanks for catching it!

jtcohen6 (author) commented May 16, 2021:

Updated:

  • Tests accept a store_failures config which can be set in all the usual places. This works like full_refresh: if set, the config takes precedence over the --store-failures flag, so a test either always or never stores its failures. If the config is not set, the flag takes effect (see the sketch after this list).
  • Cleaned up the test aliasing logic
  • Raise an error if two tests have the same database representation (database.schema.alias)
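
For illustration, a sketch of how that precedence plays out in practice (project name taken from the example output above; not verbatim from the PR):

# dbt_project.yml (sketch): pin the behavior for every test in the project,
# regardless of what is passed on the command line
tests:
  my_project:
    +store_failures: true

# with no config set, the CLI flag decides:
#   dbt test --store-failures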

Known limitations of the current approach:

  • Duplicated logic between the should_store_failures node property and the should_store_failures() Jinja macro (see comment above).
  • Generic test nodes with too-long names (64+ characters) don't use custom alias names or respect the generate_alias_name() macro. Instead, they use an alias guaranteed to be both short enough (<64 characters) and unique (containing a hash); see the sketch after this list.
  • Now that database representation (sometimes) matters for tests, configs like database/schema/alias are relevant. Users can specify those configs for tests from dbt_project.yml, but not in the .yml properties that specify a generic test instance. Should we start accepting these as MODIFIER_ARGS? How to handle dbt-bigquery aliases (project, dataset)? I'm willing to punt on this for now
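
A rough illustration of that hashed-alias scheme (a hypothetical helper; the cutoff and hash choice here are only illustrative, not necessarily what the PR implements):

import hashlib


def hashed_test_alias(test_type: str, test_name: str, cutoff: int = 30) -> str:
    # Short names are used as-is; long names are truncated and suffixed with a
    # short hash so the alias stays unique and under the identifier length limit.
    full_name = '{}_{}'.format(test_type, test_name)
    if len(full_name) <= cutoff:
        return full_name
    digest = hashlib.md5(full_name.encode('utf-8')).hexdigest()[:10]
    return '{}_{}'.format(full_name[:cutoff], digest)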

jtcohen6 requested a review from kwigley on May 16, 2021
kwigley mentioned this pull request on May 26, 2021
kwigley (Contributor) left a review comment:

❤️ this is awesome! looks good to me

jtcohen6 merged commit c0d757a into develop on May 27, 2021
jtcohen6 deleted the poc/store-test-failures branch on May 27, 2021
sphinks commented on Jun 4, 2021:
Great feature! Any idea when it will be released?

jtcohen6 (author) commented on Jun 4, 2021:
@sphinks This will be in v0.20.0, for which we'll have a first release candidate very shortly

jranks123:
Love this, thanks everyone <3

jranks123:
@jtcohen6 This seems to imply it only stores failures - are there any plans to also store non-failures, i.e. passes?

jtcohen6 (author):
@jranks123 The test table will store whatever the test query returns. In general, test queries pass when they return zero rows—so the table will still be created, it will just have nothing in it.

In v0.20.0, tests can also have a few new configs—fail_calc and error_if—such that it's possible to return rows from a test query and still pass, so long as the fail_calc does not meet the error_if criterion. In that case, --store-failures would still cause the test to store the results of its query in a table, even though the test does not ultimately raise an error.
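
For example, a hypothetical project-level configuration combining these configs (names per v0.20; the threshold is made up):

# dbt_project.yml (sketch)
tests:
  my_project:
    +fail_calc: "count(*)"    # value computed from the test query results
    +error_if: ">100"         # only raise an error above this threshold
    +store_failures: true     # still write the returned rows to a table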

jtcohen6 mentioned this pull request on Aug 10, 2021
iknox-fa pushed a commit that referenced this pull request on Feb 8, 2022 (automatic commit by git-black; original commit: c0d757a)