-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow ExternalTaskSensor to wait for taskgroup #14640
Conversation
airflow/sensors/external_task.py
Outdated
.scalar() | ||
) | ||
) / len(external_task_group_task_ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to keep the poking check behavior return count_allowed == len(dttm_filter)
@@ -134,20 +146,23 @@ def __init__( | |||
self.execution_delta = execution_delta | |||
self.execution_date_fn = execution_date_fn | |||
self.external_dag_id = external_dag_id | |||
self.external_task_group_id = external_task_group_id | |||
self.external_task_id = external_task_id | |||
self.check_existence = check_existence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.check_existence = check_existence
is False
by default, which maybe make sense for external_dag or external_task. But external_task_group has to check and get an existing dag in order to get the list of task_ids.
I wonder if we can change the default to True or even have check_existence
enabled required? This can give more useful errors if the external task/dag does not exist as well as having a consistent behavior as external_task_group. Also, what would be use case to have a Sensor waiting for an object that doesn't exist until it times out?
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
d620a7a
to
e2dcab4
Compare
e2dcab4
to
77c9eb6
Compare
airflow/sensors/external_task.py
Outdated
@@ -164,18 +184,23 @@ def poke(self, context, session=None): | |||
if self.failed_states: | |||
count_failed = self.get_count(dttm_filter, session, self.failed_states) | |||
|
|||
if count_failed == len(dttm_filter): | |||
if count_failed > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I am making the assumption that as long as there is at least one external task failure, then we will want to fail the sensor. Though this changes the original behavior, I think this will be a better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add this comment in code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary more than an entry in the UPDATING.md
. I think the only situation where you will have multiple counts is when the execution_date_fn
returns more than one execution date to wait for. However, the original behavior will get you into a weird state when only part of the TIs fail, i.e. one fail and one succeeds, resulting in time out. IMHO, I think this's more like a bug than intended behavior. WDYT?
def get_external_task_group_task_ids(self, session): | ||
"""Return task ids for the external TaskGroup""" | ||
refreshed_dag_info = DagBag(read_dags_from_db=True).get_dag(self.external_dag_id, session) | ||
task_group: Optional["TaskGroup"] = refreshed_dag_info.task_group_dict.get( | ||
self.external_task_group_id | ||
) | ||
if not task_group: | ||
raise AirflowException( | ||
f"The external task group {self.external_task_group_id} in " | ||
f"DAG {self.external_dag_id} does not exist." | ||
) | ||
task_ids = [task.task_id for task in task_group] | ||
return task_ids | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main piece that you retrieve a list of tasks for a TaskGroup. I believe that read_dags_from_db=True
is safe to use here because serialized dag is enabled by default in 2.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing task execution code is creating DagBag on its own instead of reading serialized dags from db. For example this line is creating a DagBag. I think we should do the same here. It's important for tasks to get the latest view of the dag during execution.
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
9d27356
to
914bf4e
Compare
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
4e72ea9
to
fce4940
Compare
Test fails on K8S image build job, and I think it's not relevant to this PR |
Just fixed the K8S problem in #15182 - can you please rebase. |
fce4940
to
1d0bf9e
Compare
2e92f65
to
21a2d07
Compare
Co-authored-by: Kaxil Naik <[email protected]>
fixup! Test external task group sensor fixup! fixup! Test external task group sensor fixup! fixup! fixup! Test external task group sensor fixup! fixup! fixup! fixup! Test external task group sensor
f1edc22
to
c3495aa
Compare
Co-authored-by: Tomek Urbaszek <[email protected]>
Can you fix the conflicts please @xinbinhuang |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
@xinbinhuang will you have time to complete it? |
Thanks for the nudge! Will try to wrap it up before the holidays hit. |
Great :) re-opening so it won't be missed |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
Darn, it's a shame this didn't get through - it is exactly what I was looking for! |
Feel free to open a PR and contribute it on your own. You willl just have to make sure to follow it up and implement it to the quality that we expect @russellpierce - Airflow is created by > 2000 contributors like you, so if you need something, implementing it yourself is the fastest way to get things done @russellpierce. |
closes: #14563
This PR enables ExternalTaskSensor to also wait for the external task_group.
The implementation is to retrieve the external DAG from the DagBag and then check if the TaskGroup exists. If so, query and wait for the states of all tasks within that TaskGroup during the poking cycle.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.