Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-5417] Fix DB disconnects during webserver startup #6023

Merged
merged 1 commit into from
Sep 10, 2019

Conversation

KevinYang21
Copy link
Member

@KevinYang21 KevinYang21 commented Sep 6, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    The current workflow of creating an app in views.py is:
  1. Create one DB session
  2. Assign it to flask-appbuilder, use the session to help initializing the app.
  3. Create DagBag
  4. Reuse the same session to add permission using flask-appbuilder etc.
    • Reason for us reusing the session is because we use the scoped_session w/o a scope function, thus made the session a thread-local scope session, which will be reused until it is removed. Unfortunately flask-appbuilder doesn't remove the session after a query, e.g. 1.

Problem with this is that if during step 3, there's a DB disconnect, step 4 will fail because the session is still attached with the disconnected DB connection and not rolled back. It becomes a bigger problem if step 3 regularly takes a long time, more than the DB connection expire time, step 4 will always fail, which is the case for us currently.

Tried to move the import of views.py to the outer scope but that seem to fail multiple tests. The approach here is to remove the session from the registry and the next session call will just create a new one automatically.

Of course the proper way is to fix the session usage in flask-appbuilder but that would be a longer term fix.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Not really changing the behavior.

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

@potiuk
Copy link
Member

potiuk commented Sep 8, 2019

Very interesting one. I read a bit about scoped session in this context.

Just a thought:

Maybe a better solution (still temporary but a bit less invasive) will be to run init_views in a separate thread and join it later (if we need to join it at all). This way the session will not be removed from the thread_local scope and in case it is used later on it might be still used?

@KevinYang21
Copy link
Member Author

@potiuk thanks for reviewing and the idea. Actually I think the cost is actually less if we just remove the session. We have to recreate the connection but either way we'll have to use two connections to finish all the init steps. Removing a session from a scoped session is very safe but I'm not that sure about the safeness of parallelly initializing the app.

@potiuk
Copy link
Member

potiuk commented Sep 9, 2019

Fair :). It's LGTM for me but maybe someone with more insight of Flask Can also take a look.

@KevinYang21
Copy link
Member Author

TY! I'll keep it open for two more days then.

@feluelle feluelle added area:MetaDB Meta Database related issues. area:webserver Webserver related Issues labels Sep 9, 2019
@KevinYang21 KevinYang21 merged commit 205db4b into apache:master Sep 10, 2019
cong-zhu pushed a commit to cong-zhu/airflow that referenced this pull request Feb 6, 2020
cong-zhu pushed a commit to cong-zhu/airflow that referenced this pull request Feb 6, 2020
[AIRFLOW-5417] Fix DB disconnects during webserver startup (apache#6023) (apache#306)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:MetaDB Meta Database related issues. area:webserver Webserver related Issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants