-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relationships tests with nulls (#887) #921
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me @beckjake!
Sorry for sending this over after you did all the work, but I wanted to bring these two PRs to your attention:
It seems that using a left outer join instead of a where not in (...)
is way more performant on most databases.
There's also a change in #924 that's relevant here. Would you rather merge this PR, then revisit those other topics? Or just do them all in a single PR?
|
||
# all of these constraints will fail | ||
- name: table_failure_null_relation | ||
description: "A table with a null value where it should be a foreign key" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
table_failure_null_relation
doesn't actually contain a null value though, does it? It looks like it just contains one row:
105 | count(*)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's correct. The target relation (table_failure_copy
) has a null
in that column, and there is no id 105
in table_failure_copy
, so before this change the test incorrectly passed. Now the test fails as it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, perfect!
@drewbanin I can do them all here I guess, I didn't want to get this PR too busy but I guess that's better than 3 PRs. |
and {{ from }} not in (select {{ column_name }} | ||
from {{ to }}) | ||
|
||
model.{{ column_name }} as id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to avoid namespacing these fields if possible. If an expression is provided for the field
or from
arguments, then this code won't work as expected. I do think we'll need to account for the fact that field
and from
could be ambiguously named though.
One way of doing that could be:
select count(*)
from (
select {{ column_name }} as id from {{ model }}
) as child
left join (
select {{ field }} as id from {{ to }}
) as parent on parent.id = child.id
where child.id is not null
and parent.id is null
I just gave this a quick check against our snowplow datasets and the explain plans for the two query structures are equivalent!
In general, I don't think that many people are using expressions in their relationships
tests, but I'd like to either be really forward about not supporting that pattern anymore, or just continuing to support it since it's a tiny tweak
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds great to me. I will certainly defer to your SQL expertise on all this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine to me. i defer to @drewbanin on the correct SQL to use for this test macro
…source so we fail next time this gets messed up
Co-authored-by: Michael Dunn <[email protected]> Co-authored-by: Martin Lue <[email protected]>
c071588
to
6222829
Compare
Fix #887 by adding the suggested 'where' clause, and add some tests for it in both v1 and v2 schemas.