[AIRFLOW-5343] Remove legacy way of pessimistic disconnect handling #6034

Merged
merged 1 commit into from
Sep 17, 2019

Khrol
Contributor

@Khrol Khrol commented Sep 6, 2019

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-5343
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

The discussion in #5949 established that SQLAlchemy already provides pessimistic disconnect handling. The hand-written handler is therefore removed, and only SQLAlchemy's built-in mechanism (pool_pre_ping) is used.

setup.py already requires 'sqlalchemy~=1.3', and pool_pre_ping has been available since SQLAlchemy 1.2.
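
For readers unfamiliar with the option, here is a minimal sketch of enabling SQLAlchemy's built-in pessimistic disconnect handling. It is not the diff from this PR: the in-memory SQLite URL is an assumption standing in for Airflow's real connection string, and the variable names are illustrative only.

```python
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite URL stands in for the real metadata
# database URL that Airflow would read from its configuration.
engine = create_engine(
    "sqlite://",
    # Test each pooled connection on checkout and transparently replace
    # it if the test fails.
    pool_pre_ping=True,
)

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
```

With this flag set, the liveness check happens inside the pool, so application code does not need its own reconnect loop.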

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

This is an edge case that only shows up in an integrated environment, and I'm not sure how to test it.

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to an appropriate release


# How many seconds to retry re-establishing a DB connection after
# disconnects. Setting this to 0 disables retries.
sql_alchemy_reconnect_timeout = 300
Member

What happens now with disconnects/errors?

Contributor Author

If a connection from the pool is not healthy, it is removed from the pool and a new one is created (with three attempts). The other connections in the pool are invalidated as well.

sqlalchemy/sqlalchemy@f881dae#diff-31816cdb15e64b0af1b862f51abe1226R920 - the test visualizes this.

This commit from SQLAlchemy is quite a good explanation.
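
The behaviour described in this comment can be sketched with a toy pool built on the standard library. This is an illustration of the pattern only, not SQLAlchemy's implementation: the `PrePingPool` class, its method names, and the `max_attempts` parameter are hypothetical, and the three attempts simply mirror the figure quoted above.

```python
import sqlite3
from collections import deque


class PrePingPool:
    """Toy pool (hypothetical): ping a connection before handing it out."""

    def __init__(self, creator, max_attempts=3):
        self._creator = creator
        self._idle = deque()
        self._max_attempts = max_attempts

    def _is_alive(self, conn):
        # The "ping": a trivial statement that fails on a dead connection.
        try:
            conn.execute("SELECT 1")
            return True
        except sqlite3.Error:
            return False

    def checkout(self):
        if self._idle:
            conn = self._idle.popleft()
            if self._is_alive(conn):
                return conn
            # The connection is stale: discard the remaining idle
            # connections too, since they likely predate the same outage.
            for other in self._idle:
                other.close()
            self._idle.clear()
        last_error = None
        for _ in range(self._max_attempts):
            try:
                return self._creator()
            except sqlite3.Error as exc:
                last_error = exc
        raise last_error

    def checkin(self, conn):
        self._idle.append(conn)


pool = PrePingPool(lambda: sqlite3.connect(":memory:"))
first = pool.checkout()
pool.checkin(first)
first.close()  # simulate the server dropping the connection
second = pool.checkout()  # ping fails, so a fresh connection is returned
```

The caller never sees the dead connection; the cost is one extra round trip per checkout, which is the trade-off the pessimistic approach accepts.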

Contributor

The pool_pre_ping argument should take care of it, and the reconnect code lives inside the SQLAlchemy library. With the newer SQLAlchemy version we don't have to handle this explicitly.

@Khrol Khrol marked this pull request as ready for review September 9, 2019 09:06
@codecov-io

Codecov Report

Merging #6034 into master will decrease coverage by 1.31%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #6034      +/-   ##
==========================================
- Coverage   80.03%   78.72%   -1.32%     
==========================================
  Files         594      594              
  Lines       34748    35248     +500     
==========================================
- Hits        27812    27749      -63     
- Misses       6936     7499     +563

| Impacted Files | Coverage Δ |
|---|---|
| airflow/settings.py | 88.32% <100%> (-0.09%) ⬇️ |
| airflow/utils/sqlalchemy.py | 86.44% <100%> (+7.37%) ⬆️ |
| airflow/executors/sequential_executor.py | 47.61% <0%> (-52.39%) ⬇️ |
| airflow/gcp/example_dags/example_bigquery.py | 60.49% <0%> (-39.51%) ⬇️ |
| airflow/contrib/operators/bigquery_operator.py | 68.5% <0%> (-25.23%) ⬇️ |
| airflow/contrib/hooks/bigquery_hook.py | 45.97% <0%> (-24.77%) ⬇️ |
| airflow/contrib/operators/bigquery_to_bigquery.py | 70% <0%> (-23.34%) ⬇️ |
| airflow/contrib/operators/bigquery_get_data.py | 60.78% <0%> (-23%) ⬇️ |
| airflow/contrib/operators/bigquery_to_gcs.py | 70.73% <0%> (-22.82%) ⬇️ |
| ...ontrib/operators/bigquery_table_delete_operator.py | 69.69% <0%> (-22.31%) ⬇️ |
| ... and 7 more | |


@Khrol
Contributor Author

Khrol commented Sep 9, 2019

The Codecov report is misleading about coverage here: this PR consists mainly of code deletions.

@feluelle feluelle added the area:MetaDB Meta Database related issues. label Sep 9, 2019
@ashb ashb merged commit bc82607 into apache:master Sep 17, 2019
ashb pushed a commit to ashb/airflow that referenced this pull request Oct 14, 2019
…pache#6034)

Based on discussions in apache#5949
it was figured out that there is already pessimistic disconnect
timeout handling. So instead of hand-written one only SQLAlchemy
embedded way should be used.

'sqlalchemy~=1.3' is in `setup.py` requirements and `pool_pre_ping`
appeared in SQLAlchemy 1.2.

(cherry picked from commit bc82607)
adityav pushed a commit to adityav/airflow that referenced this pull request Oct 14, 2019
(cherry picked from commit bc82607)
@Khrol Khrol deleted the AIRFLOW-5343_pool_pre_ping2 branch September 1, 2020 06:48