-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add filter field to source table definitions (#1495) #1776
Add filter field to source table definitions (#1495) #1776
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This technically look very good to me - the mypy additions here are great, and I like the approach you implemented for making empty source freshness results appear as "stale".
My one holdup is that a top-level filter:
config on the source probably isn't the right way to specify this field. I'd expect a field like that to work globally, but it really only contributes to source freshness queries at the moment. In #1495, there's also a comment about how schema tests don't work properly for tables that require filters. I don't think we should try to address that problem right now - that's going to involve changing schema test logic, and it can be worked around in user-space.
Can we change the filter
field to live inside of the freshness:
block? I think that will assuage my concerns around the placement of the filter:
field and will better clarify to users what it actually does. I'll follow up in #1494 with some approaches (outside the scope of this PR) for adding filters in schema tests on BigQuery.
{% macro default__collect_freshness(source, loaded_at_field) %} | ||
{% call statement('check_schema_exists', fetch_result=True, auto_begin=False) -%} | ||
{% macro default__collect_freshness(source, loaded_at_field, filter) %} | ||
{% call statement('collect_freshness', fetch_result=True, auto_begin=False) -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Added some type annotations Clean up some mypy issues around the "available" decorators
489f631
to
55a6b9a
Compare
55a6b9a
to
0522535
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great - ship it!
Fixes #1495
Added a
filter
parameter to source table definitions. It defaults toNone
. If provided, awhere {{ filter }}
clause is added to the source freshness calculation select statement.I also added some tests, did some mypy-related cleanup. While writing the tests, I needed to define some sort of reasonable behavior for when the source freshness query returns no data - I decided that we'll treat your data as if the latest update was on January 1, year 1 at 00:00:00 UTC. Hopefully that's outside your source freshness range. We could alternatively raise an exception about there being no data, but I thought the idea of your data being considered very out of date was more coherent.