[AIRFLOW-3885] ~2.5x speed-up for backfill tests #4731

Merged


astahlman
Contributor

Jira

  • [ X ] My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • [ X ] Here are some details about my PR, including screenshots of any UI changes:

The BackfillJobTest suite now takes 57 seconds vs. the baseline of 147
seconds on my laptop.

A couple of optimizations:

  • Don't sleep() if we are running unit tests
  • Don't backfill more DagRuns than needed (reduced from 5 to 2, since we
    only need 2 DagRuns to verify that we can run backwards)
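A minimal sketch of the two optimizations above, for readers who want to see the shape of the change. This is not the code from the diff: `throttle`, `run_dates`, and the `unit_test_mode` and `run_backwards` parameters are hypothetical names chosen for illustration. The first function skips the pause when a test flag is set; the second shows why two runs are enough to tell forward order from backward order.

```python
import time
from datetime import date, timedelta


def throttle(delay_seconds, unit_test_mode, sleep=time.sleep):
    """Pause between polling iterations, except when running unit tests.

    Returns True if a sleep actually happened. Hypothetical helper,
    not part of any real library.
    """
    if unit_test_mode:
        return False
    sleep(delay_seconds)
    return True


def run_dates(start, num_runs, run_backwards=False):
    """Return one date per daily run, newest first when run_backwards is set."""
    dates = [start + timedelta(days=i) for i in range(num_runs)]
    return sorted(dates, reverse=run_backwards)


if __name__ == "__main__":
    slept = []
    # Inject a fake sleep so we can observe whether it was called.
    throttle(5, unit_test_mode=True, sleep=slept.append)
    print(slept)  # empty list: the sleep was skipped

    # With only two runs, forward and backward orders already differ.
    print(run_dates(date(2019, 2, 18), 2))
    print(run_dates(date(2019, 2, 18), 2, run_backwards=True))
```

Passing the sleep function as a parameter is only one way to make the behavior observable in a test; reading a flag from configuration would work just as well.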

I've also made a few tests reentrant by clearing out the Pool, DagRun,
and TaskInstance tables between runs.
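To illustrate the idea of clearing tables so a test can be rerun, here is a self-contained sketch using the standard-library `sqlite3` module. The table names in `TABLES` and the helper functions are stand-ins invented for this example; they are not the project's schema or test utilities.

```python
import sqlite3

# Hypothetical stand-in table names, assumed for illustration only.
TABLES = ("pool", "dag_run", "task_instance")


def create_schema(conn):
    """Create one trivial table per name so the sketch is runnable."""
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(id INTEGER PRIMARY KEY, note TEXT)"
        )
    conn.commit()


def clear_tables(conn):
    """Delete all rows left behind by a previous test run."""
    for table in TABLES:
        conn.execute(f"DELETE FROM {table}")
    conn.commit()


def count_rows(conn):
    """Return the row count of each table, keyed by table name."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in TABLES
    }


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    # Simulate state left over from an earlier run.
    conn.execute("INSERT INTO dag_run (note) VALUES ('leftover')")
    clear_tables(conn)  # what a setUp() hook would call
    print(count_rows(conn))
```

Calling a helper like `clear_tables` from a test's `setUp` means each test starts from the same empty state, so the outcome no longer depends on which tests ran before it.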

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • [ X ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
    • All the public functions and the classes in the PR contain docstrings that explain what it does

Code Quality

  • [ X ] Passes flake8


codecov-io commented Feb 18, 2019

Codecov Report

Merging #4731 into master will increase coverage by <.01%.
The diff coverage is 28.57%.


@@            Coverage Diff             @@
##           master    #4731      +/-   ##
==========================================
+ Coverage   74.65%   74.66%   +<.01%     
==========================================
  Files         430      430              
  Lines       27991    27994       +3     
==========================================
+ Hits        20897    20901       +4     
+ Misses       7094     7093       -1

Impacted Files                               Coverage Δ
airflow/jobs.py                              77.58% <28.57%> (+0.06%) ⬆️
airflow/contrib/operators/ssh_operator.py    83.33% <0%> (+1.28%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

astahlman changed the title from "[WIP] [AIRFLOW-3885] ~2.5x speed-up for backfill tests" to "[AIRFLOW-3885] ~2.5x speed-up for backfill tests" on Feb 19, 2019
@astahlman
Contributor Author

cc @feng-tao

feng-tao merged commit 480eeff into apache:master on Feb 19, 2019
antonimaciej pushed a commit to PolideaInternal/airflow that referenced this pull request Feb 26, 2019
ashb pushed a commit to ashb/airflow that referenced this pull request Mar 6, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
eschachar deleted the astahlman/airflow-3885-speedup-backfill-tests branch on September 24, 2022 22:33