dbt test --store-failures #3316
b0ee77d to 463f059
Some comments dropped inline! Nice work @jtcohen6!
@@ -64,11 +65,12 @@ def reset():
    PARTIAL_PARSE = False
    MP_CONTEXT = _get_context()
    USE_COLORS = True
    STORE_FAILURES = False
this definitely feels like the most expedient approach, but by making this a global flag, we lose the ability to specify something like this as a model-level config. Do you think there's a path to implementing this more like the full_refresh config?
definitely!
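A minimal sketch of the node-level approach raised in this thread, assuming a hypothetical `resolve_store_failures` helper and plain values rather than dbt's real config objects. The idea is that a per-node setting wins when it is present, and the global flag only acts as the fallback. None of these names are dbt's API.

```python
from typing import Optional


def resolve_store_failures(node_config: Optional[bool], global_flag: bool) -> bool:
    """Return the effective setting for one test node.

    Hypothetical helper: `node_config` stands in for a per-test config value
    (None when the user did not set it), and `global_flag` stands in for a
    command-line flag such as --store-failures.
    """
    if node_config is not None:
        # An explicit node-level value always takes precedence.
        return node_config
    # Otherwise fall back to the global default.
    return global_flag
```

With this shape, a single test could opt out while the flag is on, or opt in while the flag is off.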
{% endif %}

{% call statement(auto_begin=True) %}
    {{ create_table_as(False, target_relation, sql) }}
this is 🔥
core/dbt/node_types.py
Outdated
@@ -49,6 +50,10 @@ def documentable(cls) -> List['NodeType']:
            cls.Exposure
        ]

    @classmethod
    def relational(cls) -> List['NodeType']:
        return cls.refable() + ([cls.Test] if flags.STORE_FAILURES else [])
this is.... not ideal... i think. Is there a way that we can push this logic around flags.STORE_FAILURES into the caller? Something about changing the behavior of this class method based on a global flag feels pretty fraught
Good call. I removed relational as a class method and instead implemented this logic as a should_store_failures boolean property of parsed nodes.
Unfortunately, it's duplicative with logic in the should_store_failures() Jinja macro, which is what's used in the test materialization, and takes after the should_full_refresh() macro (per your comment above).
I don't know a good way around this:
- I see merit in making the logic really explicit to users, and allowing them to override it if they'd like
- We need this property accessible in python, for deciding which schemas need to be created as part of the task
- The places in the codebase where we call Jinja macros from python are... not ideal...
So, if a user overrides should_store_failures() (Jinja) with custom logic, such as adding some target-specific logic, they will successfully alter the behavior of the materialization / the mutative queries that dbt runs. But, dbt may create an extra schema that it doesn't strictly have to. It's a drawback I'm willing to live with.
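A rough sketch of the Python side of this trade-off, under stated assumptions: `Node`, its fields, and `schemas_to_create` are hypothetical stand-ins, not dbt's real classes, and the check is written as a method taking the flag instead of a property. It shows how a per-node check can decide which schemas the task creates up front, independently of whatever the Jinja macro later decides.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass
class Node:
    # Hypothetical stand-in for a parsed node; not dbt's real class.
    resource_type: str                     # e.g. "model", "seed", "test"
    database: str
    schema: str
    store_failures: Optional[bool] = None  # per-node config, None if unset

    def should_store_failures(self, global_flag: bool) -> bool:
        # Only tests can store failures; node config falls back to the flag.
        if self.resource_type != "test":
            return False
        if self.store_failures is not None:
            return self.store_failures
        return global_flag


def schemas_to_create(nodes: Iterable[Node], global_flag: bool) -> Set[Tuple[str, str]]:
    """Collect (database, schema) pairs for nodes that will be written as relations."""
    refable = {"model", "seed", "snapshot"}  # assumed set of refable types
    return {
        (n.database, n.schema)
        for n in nodes
        if n.resource_type in refable or n.should_store_failures(global_flag)
    }
```

Because this check runs in Python before any macro executes, a user-overridden macro that decides not to store failures would still leave the audit schema created, which is the extra schema mentioned above.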
cutoff = 30
test_full_name = '{}_{}'.format(test_type, test_name)
if len(test_full_name) <= cutoff:
    test_prenom = test_full_name
let's uh... use english.... for variable names :)
heh. I decided to reimplement this logic anyway, since it was getting a little confused
@@ -231,6 +242,10 @@ def __init__(
        self.compiled_name: str = compiled_name
        self.fqn_name: str = fqn_name

        # use hashed name as alias if too long
        if compiled_name != fqn_name:
            self.modifiers['alias'] = compiled_name
really clever & really cool!
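For illustration only, here is one way the "hashed name as alias if too long" idea could look. The `shorten_test_name` helper, the choice of MD5, and the 8-character suffix are assumptions made for this sketch; the diff above only shows the 30-character cutoff and the rule that the alias is set when the two names differ.

```python
import hashlib
from typing import Optional


def shorten_test_name(full_name: str, cutoff: int = 30) -> str:
    """Return `full_name` unchanged if short enough, else a truncated, hashed form."""
    if len(full_name) <= cutoff:
        return full_name
    # Hash the full name so distinct long names stay distinct after truncation.
    digest = hashlib.md5(full_name.encode("utf-8")).hexdigest()
    suffix = "_" + digest[:8]
    return full_name[: cutoff - len(suffix)] + suffix


def alias_for(fqn_name: str, cutoff: int = 30) -> Optional[str]:
    """Return an alias only when the shortened name differs from the original."""
    compiled_name = shorten_test_name(fqn_name, cutoff)
    return compiled_name if compiled_name != fqn_name else None
```

Short names pass through untouched and get no alias, so only tests whose names exceed the cutoff end up with a hashed relation name.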
@@ -524,8 +524,6 @@ def create_schema(relation: BaseRelation) -> None:

    db_schema = (db_lower, schema.lower())
    if db_schema not in existing_schemas_lowered:
        existing_schemas_lowered.add(db_schema)
is there a reason why this disappeared?
nope :) thanks for catching it!
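To show why the `existing_schemas_lowered.add(db_schema)` line matters, here is a small standalone sketch. The function name and signature are hypothetical and simplified from the hunk above. Without the `add`, two nodes that map to the same schema with different casing would each schedule a create.

```python
from typing import Iterable, List, Set, Tuple


def schemas_needing_creation(
    required: Iterable[Tuple[str, str]],
    existing_lowered: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return the (database, schema) pairs that still have to be created."""
    to_create = []
    for database, schema in required:
        key = (database.lower(), schema.lower())
        if key not in existing_lowered:
            # Record the schema as soon as it is scheduled, so a later
            # node with the same schema does not schedule it again.
            existing_lowered.add(key)
            to_create.append((database, schema))
    return to_create
```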
Updated:
Known limitations of the current approach:
dd2e647 to b1fd8b6
❤️ this is awesome! looks good to me
…store-test-failures
Great feature! Any idea when it will be released?
@sphinks This will be in v0.20.0, for which we'll have a first release candidate very shortly
Love this, thanks everyone <3
@jtcohen6 This seems to imply it only stores failures - are there any plans to also store non-failures, i.e. passes?
@jranks123 The test table will store whatever the test query returns. In general, test queries pass when they return zero rows, so the table will still be created, it will just have nothing in it. In v0.20.0, tests can also have a few new configs.
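A self-contained sketch of that behavior using Python's built-in `sqlite3` module instead of dbt. The table and column names are made up for illustration. The test query selects the rows that violate the expectation, the result is stored as a table, and the test passes only when that table is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer, customer_id integer)")
conn.executemany("insert into orders values (?, ?)", [(1, 10), (2, None), (3, 30)])

# A not-null style test: the query returns the rows that break the rule.
test_sql = "select * from orders where customer_id is null"

# Store whatever the test query returns in its own table.
conn.execute(f"create table not_null_orders_customer_id as {test_sql}")

# The table exists either way; it is simply empty when the test passes.
failures = conn.execute("select count(*) from not_null_orders_customer_id").fetchone()[0]
status = "pass" if failures == 0 else "fail"
```

Here one row has a null `customer_id`, so the stored table holds that row and the test fails.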
automatic commit by git-black, original commits: c0d757a
Closes #517
Closes #903
Closes #2593
I also played around with #3265 (print the top 5 failures to stdout, and include in run_results.json). This work here is definitely the prerequisite for that, but we should tackle it separately if we decide it's something we want to do.
Biggest choices
--store-results
dbt_test__audit. Normally, this means the schema is my_target_schema_dbt_test__audit; if someone is using generate_schema_name_for_env, it will be just dbt_test__audit.
tests: +schema config in dbt_project.yml. So I feel less bad about hard coding it.
Other comments
relational(), which is the same as refable() but includes tests IFF flags.STORE_RESULTS is turned on. dbt now uses relational() to decide which nodes inform the schemas it will create and when it will write the relation_name property to the manifest.
count(*) = 0? It's possible that a passing test may still have rows in the database that a user would want to inspect
run_results.json artifact.
Checklist
CHANGELOG.md and added information about my change to the "dbt next" section.