I think that makes sense 90% of the time in the general case, but we want to give users the ability to customize the "failure calculation" if they'd like it to be something other than count(*). This is important for backward compatibility, since schema tests could previously calculate and return whatever numeric value they wanted. In the wild, this could be as simple as sum(column) instead of count(*), or it could be as complex as the dbt_utils.equality test:
(select count(*) from unioned) +
(select abs(
    (select count(*) from a_minus_b) -
    (select count(*) from b_minus_a)
))
I'm hopeful this is quite straightforward to implement—it's just a matter of pulling in the fail_calc and templating it into the materialization.
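A minimal sketch of the idea, assuming the materialization simply wraps the test's SQL in an outer query and substitutes fail_calc where count(*) is hard-coded today. The helper names (build_test_query, run_test), the orders table, and the use of SQLite are all hypothetical and for illustration only; the real materialization is a Jinja macro, not Python.

```python
import sqlite3


def build_test_query(test_sql: str, fail_calc: str = "count(*)") -> str:
    """Wrap a test's SQL so that fail_calc decides the reported failure number.

    Hypothetical helper: it only mimics templating fail_calc into a wrapper query.
    """
    return f"select {fail_calc} as failures from (\n{test_sql}\n) as test_results"


def run_test(conn: sqlite3.Connection, test_sql: str, fail_calc: str = "count(*)") -> int:
    """Execute the wrapped query and return the single failure number."""
    return conn.execute(build_test_query(test_sql, fail_calc)).fetchone()[0]


# Illustrative data: two of the three orders have a negative amount.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer, amount integer)")
conn.executemany("insert into orders values (?, ?)", [(1, 10), (2, -3), (3, -4)])

negative_orders = "select id, amount from orders where amount < 0"

# Default behavior: failures = number of rows the test returns.
default_failures = run_test(conn, negative_orders)

# Custom calculation: failures = total magnitude of the negative amounts.
custom_failures = run_test(conn, negative_orders, "sum(abs(amount))")
```

With the default, the test reports the number of offending rows; with the custom expression, the same test SQL reports an aggregate over those rows instead.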
Questions
Should fail_calc be a test config or a test property? I lean toward property, since I think this is an essential component of the test definition and less like something that wants to be set for many different types of tests at once, e.g. from dbt_project.yml. (Once #2401, "Set configs in schema.yml files", is resolved, this is hopefully a less meaningful distinction!)
Could this have potentially strange interactions with % values of warn_if / error_if (Net-new test configs #3258)? Yes! I don't think we need to solve for every edge case there now.
Update: We're going to make this a test config, following the pattern sketched out in #3258.
I don't think it makes a ton of sense for users to override the default configs set by the generic test definition, but hey, they'll be able to if they want to. I could even see this being compelling—let's say you want the unique test to calculate failure as the number of original rows containing a duplicate, rather than the length of the set of duplicate values.
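To make the unique example concrete, here is a rough sketch under a few assumptions: that the unique test's query returns one row per duplicated value together with a count column (called n_records here, a hypothetical name), and that fail_calc is templated into an outer select. SQLite and the users table are used purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer)")
# Illustrative data: id 1 appears three times, id 2 twice, id 3 once.
conn.executemany("insert into users values (?)", [(1,), (1,), (1,), (2,), (2,), (3,)])

# Assumed shape of a uniqueness test: one row per duplicated value.
unique_test_sql = """
select id, count(*) as n_records
from users
group by id
having count(*) > 1
"""


def failures(fail_calc: str) -> int:
    """Template fail_calc into a wrapper query (hypothetical) and run it."""
    query = f"select {fail_calc} from ({unique_test_sql}) as test_results"
    return conn.execute(query).fetchone()[0]


# Default: the size of the set of duplicated values.
duplicate_values = failures("count(*)")

# Override: the number of original rows that contain a duplicate.
duplicate_rows = failures("sum(n_records)")
```

Here the default counts two duplicated values, while the override counts the five original rows involved in those duplicates.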
Describe the feature
Today, the 'test' materialization hard-codes count(*) as the way to calculate failures from a test: https://github.com/fishtown-analytics/dbt/blob/26fb58bd1b08218781b182918f5ed4dec3f735d9/core/dbt/include/global_project/macros/materializations/test.sql#L4-L7