-
Notifications
You must be signed in to change notification settings - Fork 14.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix future DagRun rarely triggered by race conditions when max_active_runs reached its upper limit. #31414
Conversation
Static check failure seems to be unrelated (also failing on main). |
can we have a unit test for this change to avoid regression? |
In this PR I have changed the dag retrieval to be done with locking. I believe the regression test is covered by the existing tests. |
It has been a while since our last conversation, |
@doiken Just picking up the thread here, your PR still has some failed static tests. |
@nathadfield |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we don't merge changes without test, but that one would be rather difficult to write test for - LGTM
…_runs reached its upper limit. (#31414) * feat: select dag_model with row lock * fix: logging that scheduling was skipped * fix: remove unused get_dagmodel * fix: correct log message to more generic word --------- Co-authored-by: doiken <[email protected]> Co-authored-by: Tzu-ping Chung <[email protected]> Co-authored-by: eladkal <[email protected]> (cherry picked from commit b53e2ae)
Hello all, I seem to be running into a new deadlock issue due to this change. I've changed my dag name to DAGNAME in output below:
It seems that the scheduler fails to acquire the Dag Record lock, and then fails the entire dag as a result. Is someone able to explain the cases under which the scheduler would fail to acquire the Dag Record lock, and if there is some configuration I can use to mitigate? I was not seeing such failures before upgrading from 2.5.1 to 2.7.2 |
…x_active_runs reached its upper limit. (apache#31414)" This reverts commit b53e2ae.
…x_active_runs reached its upper limit. (apache#31414)" This reverts commit b53e2ae.
…x_active_runs reached its upper limit. (apache#31414)" (apache#37596) This reverts commit b53e2ae.
closes: #31407
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.