Add DatasetDagRunQueue to all the consuming DAGs of a dataset alias #41264

@Lee-W Lee-W commented Aug 5, 2024

Why

Since #40693, a DAG can be scheduled on a DatasetAlias. However, when a dataset alias is resolved in a producer DAG for the first time, a consumer DAG that depends on that alias has to wait for the next round of DAG parsing before it registers its dependency on the resolved datasets. As a result, the consumer DAG is not triggered until the producer DAG runs a second time.

What

This PR creates DatasetDagRunQueue (DDRQ) records for the DAGs that consume a dataset alias as well. Once DAG parsing has updated the consumer DAG, a DDRQ record already exists for it and may trigger it.
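
To illustrate the idea, here is a minimal toy model in plain Python. It is not Airflow's actual code: `ToyScheduler`, its attributes, and `register_dataset_event` are hypothetical names made up for this sketch, and the real change works on database models rather than in-memory sets. It only shows the behaviour described above: when a dataset event is emitted through an alias, a queue entry is created for every DAG that consumes that alias, not only for DAGs already scheduled on the dataset itself.

```python
from collections import defaultdict


class ToyScheduler:
    """Hypothetical in-memory stand-in for the scheduling tables (assumption)."""

    def __init__(self):
        # dataset URI -> ids of DAGs scheduled directly on that dataset
        self.dataset_consumers = defaultdict(set)
        # alias name -> ids of DAGs scheduled on that dataset alias
        self.alias_consumers = defaultdict(set)
        # (dataset URI, DAG id) pairs, standing in for DDRQ records
        self.queue = set()

    def register_dataset_event(self, uri, source_aliases=()):
        """Queue every DAG that consumes the dataset or one of its source aliases."""
        targets = set(self.dataset_consumers.get(uri, ()))
        for alias in source_aliases:
            # The behaviour this PR describes: alias consumers get a queue entry too.
            targets |= self.alias_consumers.get(alias, set())
        for dag_id in targets:
            self.queue.add((uri, dag_id))


scheduler = ToyScheduler()
scheduler.alias_consumers["my-alias"].add("consumer_dag")

# First producer run: the alias resolves to a dataset the consumer has never seen.
scheduler.register_dataset_event("s3://bucket/file", source_aliases=["my-alias"])
```

In this sketch the consumer already has a queue entry after the producer's first run, so it no longer depends on a second producer run once DAG parsing has picked up the resolved dataset.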



@boring-cyborg boring-cyborg bot added the area:db-migrations PRs with DB migration label Aug 5, 2024
@Lee-W Lee-W force-pushed the shift-dataset-aliases-in-dag-schedule-resolution-before-dataset-event-is-created branch 3 times, most recently from c32fc46 to c70f670 Compare August 6, 2024 02:55
@Lee-W Lee-W changed the title feat: create dag_schedule_dataset_alias_reference Add DatasetDagRunQueue to all the consuming DAGs of a dataset alias Aug 6, 2024
@Lee-W Lee-W force-pushed the shift-dataset-aliases-in-dag-schedule-resolution-before-dataset-event-is-created branch 2 times, most recently from 87bf7d3 to 2aa49d9 Compare August 6, 2024 07:37
@Lee-W Lee-W marked this pull request as ready for review August 6, 2024 07:37
@Lee-W Lee-W force-pushed the shift-dataset-aliases-in-dag-schedule-resolution-before-dataset-event-is-created branch 3 times, most recently from 1a21af9 to f3375a2 Compare August 6, 2024 09:46
@phanikumv phanikumv added this to the Airflow 2.10.0 milestone Aug 6, 2024
@Lee-W Lee-W force-pushed the shift-dataset-aliases-in-dag-schedule-resolution-before-dataset-event-is-created branch from bc9b269 to 5e27790 Compare August 6, 2024 13:13
@phanikumv phanikumv merged commit c8bc42c into apache:main Aug 7, 2024
81 checks passed
@phanikumv phanikumv deleted the shift-dataset-aliases-in-dag-schedule-resolution-before-dataset-event-is-created branch August 7, 2024 02:15
@ephraimbuddy ephraimbuddy added the changelog:skip and type:new-feature labels and removed the changelog:skip label Aug 9, 2024
molcay pushed a commit to VladaZakharova/airflow that referenced this pull request Aug 19, 2024
Labels
area:db-migrations PRs with DB migration type:new-feature Changelog: New Features
4 participants