[AIRFLOW-7063] Fix dag.clear() slowness caused by count #7723
Thanks for the review. The test I added does not play well with sqlite. I'll fix it soon.
The error is really interesting 🤔
The test failed in sqlite because of a hard limit on the level of nesting in a query. The test is specifically designed to produce many levels of nesting in the generated SQL, in order to reproduce the SQLAlchemy slowness. Stack Overflow suggests there are compile-time options to raise this hard limit in sqlite, but I doubt anyone uses sqlite in production with a DAG this complicated. I think skipping the test on the sqlite backend is the right thing to do in this case, so I did that.
Indeed. Interesting one :)
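For readers unfamiliar with backend-specific skips, here is a minimal sketch of the general idea using only the standard-library `unittest` module. It is not the mechanism used in this PR (that detail is not shown in the thread); the `SQL_ALCHEMY_CONN` value, the `backend_name` helper, and the test class are all hypothetical names introduced for illustration.

```python
import unittest

# Hypothetical setting; a real project would read this from its own configuration.
SQL_ALCHEMY_CONN = "sqlite:///:memory:"


def backend_name(conn_url):
    """Return the backend part of a connection URL, e.g. 'sqlite' or 'postgresql'."""
    scheme = conn_url.split(":", 1)[0]   # "postgresql+psycopg2"
    return scheme.split("+", 1)[0]       # drop an optional "+driver" suffix


class TestClearManyTasks(unittest.TestCase):
    @unittest.skipIf(
        backend_name(SQL_ALCHEMY_CONN) == "sqlite",
        "generated query is nested too deeply for sqlite",
    )
    def test_clear_deeply_nested_dag(self):
        # Placeholder body; the real test would build a large DAG and clear it.
        self.assertTrue(True)


# Run the test case programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClearManyTasks)
result = unittest.TestResult()
suite.run(result)
```

With the hypothetical sqlite URL above, the test is reported as skipped rather than failed, so the suite stays green on sqlite while still running on other backends.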
Codecov Report
@@            Coverage Diff             @@
##           master    #7723      +/-   ##
==========================================
- Coverage   86.92%   86.15%   -0.77%
==========================================
  Files         915      915
  Lines       44152    44152
==========================================
- Hits        38377    38041     -336
- Misses       5775     6111     +336
Thanks @yuqian90!
(cherry picked from commit 6fc5148)
Calling `tis.count()` makes `dag.clear()` much slower than just retrieving all the `tis` and calling `len()`, because of how SQLAlchemy generates the SQL for `count()` when the query has many UNION statements. See the comments on the JIRA for detailed performance timing.

Issue link: AIRFLOW-7063
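The two ways of counting can be sketched as follows. This is an illustration only: it uses the standard-library `sqlite3` module rather than SQLAlchemy or Airflow's models, the `task_instance` table is a simplified, hypothetical schema, and it does not reproduce the slowness itself. It only shows that counting in the database (SQLAlchemy's `Query.count()` wraps the statement in a `SELECT count(*) FROM (...)` subquery) and fetching the rows and calling `len()` give the same number.

```python
import sqlite3

# In-memory database with a simplified, hypothetical task_instance table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (dag_id TEXT, task_id TEXT)")
rows = [(f"dag_{d}", f"task_{t}") for d in range(5) for t in range(10)]
conn.executemany("INSERT INTO task_instance VALUES (?, ?)", rows)

# A query made of several UNIONed SELECTs, one per DAG.
dag_ids = [f"dag_{d}" for d in range(5)]
union_sql = " UNION ".join(
    "SELECT dag_id, task_id FROM task_instance WHERE dag_id = ?" for _ in dag_ids
)

# Approach 1: count in the database by wrapping the UNION in a subquery.
count_via_sql = conn.execute(
    f"SELECT count(*) FROM ({union_sql})", dag_ids
).fetchone()[0]

# Approach 2: fetch the rows and count them in Python.
tis = conn.execute(union_sql, dag_ids).fetchall()
count_via_len = len(tis)

print(count_via_sql, count_via_len)
```

When the rows are needed afterwards anyway, the second approach also avoids issuing a separate count query.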
Make sure to mark the boxes below before creating PR: [x]
[AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
* For document-only changes, the commit message can start with [AIRFLOW-XXXX].
In case of a fundamental code change, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.