[AIRFLOW-6014] - handle pods which are preempted and deleted by kubernetes but not restarted #6606

Merged: 6 commits, Mar 18, 2020

Conversation

atrbgithub
Contributor

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title.

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    This PR addresses the case where a pod is preempted during the creation phase. Because pods have restartPolicy: Never in their spec, the pod is never restarted, and the task remains queued in Airflow until the scheduler is restarted.
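
    The behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the function name `classify_pod_event` and the `FAILED`/`SUCCESS` constants are hypothetical stand-ins, and only the Kubernetes watch event type `DELETED` and the pod phases `Pending`, `Running`, `Succeeded` and `Failed` are taken from Kubernetes itself.

```python
from typing import Optional

# Hypothetical stand-ins for the task states an executor would report.
FAILED = "failed"
SUCCESS = "success"


def classify_pod_event(event_type: str, phase: str) -> Optional[str]:
    """Map a pod watch event to a task outcome, or None to keep waiting.

    event_type is the Kubernetes watch event type (e.g. "ADDED",
    "MODIFIED", "DELETED"); phase is the pod's status.phase.
    """
    if phase == "Pending":
        # A pod deleted while still Pending (for example after preemption)
        # is not recreated when restartPolicy is Never, so waiting for it
        # would leave the task queued indefinitely. Report a failure instead.
        if event_type == "DELETED":
            return FAILED
        return None
    if phase == "Failed":
        return FAILED
    if phase == "Succeeded":
        return SUCCESS
    # Running or Unknown: nothing to report yet.
    return None
```

    With a rule like this, a `DELETED` event for a `Pending` pod is surfaced as a failure so the scheduler can handle the task, rather than the event being ignored.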

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Unsure if it is possible to simulate this scenario.

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do
    • If you implement backwards incompatible changes, please leave a note in Updating.md so we can assign it to an appropriate release

@mik-laj mik-laj added the k8s label Nov 25, 2019
@dimberman dimberman self-requested a review December 12, 2019 00:02
@stale

stale bot commented Jan 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Jan 26, 2020
@stale stale bot closed this Feb 2, 2020
@atrbgithub
Contributor Author

@dimberman would it be possible to take a look at this?

@kaxil kaxil reopened this Mar 1, 2020
@stale stale bot removed the stale label (Stale PRs per the .github/workflows/stale.yml policy file) Mar 1, 2020
@kaxil kaxil added the pinned label (Protect from Stalebot auto closing) Mar 1, 2020
@kaxil
Member

kaxil commented Mar 1, 2020

cc @dimberman

Review comment on airflow/executors/kubernetes_executor.py (outdated, resolved)
@boring-cyborg boring-cyborg bot added the area:Scheduler label (including HA (high availability) scheduler) Mar 16, 2020

@inytar inytar (Contributor) left a comment

We are running a fork of Airflow with a similar fix, really hoping this gets merged on upstream!

@dimberman
Contributor

LGTM. I put in a fix for a new namespace awareness, and if tests pass I will merge. Apologies for the delay; this one fell through the cracks :/

@codecov-io

Codecov Report

Merging #6606 into master will decrease coverage by 0.25%.
The diff coverage is 89.85%.

@@            Coverage Diff             @@
##           master    #6606      +/-   ##
==========================================
- Coverage   86.96%   86.71%   -0.26%     
==========================================
  Files         915      915              
  Lines       44188    44201      +13     
==========================================
- Hits        38429    38328     -101     
- Misses       5759     5873     +114     

Impacted Files                                         Coverage Δ
airflow/executors/kubernetes_executor.py               56.63% <0.00%> (-0.36%) ⬇️
airflow/models/taskinstance.py                         94.70% <0.00%> (-0.28%) ⬇️
airflow/providers/google/cloud/hooks/base.py           96.27% <97.29%> (+0.37%) ⬆️
...rflow/providers/google/cloud/operators/bigquery.py  91.37% <100.00%> (-0.12%) ⬇️
...roviders/google/cloud/operators/gcs_to_bigquery.py  92.95% <100.00%> (+22.36%) ⬆️
...viders/google/cloud/operators/kubernetes_engine.py  97.18% <100.00%> (-0.08%) ⬇️
...oviders/google/cloud/utils/credentials_provider.py  91.83% <100.00%> (-1.39%) ⬇️
airflow/providers/qubole/operators/qubole.py           87.69% <100.00%> (-0.88%) ⬇️
airflow/utils/process_utils.py                         76.53% <100.00%> (+3.27%) ⬆️
airflow/kubernetes/volume_mount.py                     44.44% <0.00%> (-55.56%) ⬇️
... and 6 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2a54512...4151877.

@potiuk potiuk merged commit 4e626be into apache:master Mar 18, 2020
@boring-cyborg

boring-cyborg bot commented Mar 18, 2020

Awesome work, congrats on your first merged pull request!

kaxil pushed a commit that referenced this pull request Mar 19, 2020
kaxil pushed a commit that referenced this pull request Mar 19, 2020
kaxil pushed a commit that referenced this pull request Mar 19, 2020
kaxil pushed a commit that referenced this pull request Mar 22, 2020
kaxil pushed a commit that referenced this pull request Mar 22, 2020
Labels
area:Scheduler (including HA (high availability) scheduler)
pinned (Protect from Stalebot auto closing)
7 participants