Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST request #6380

Merged
merged 5 commits into from
Nov 20, 2019

Conversation

acroos
Copy link
Contributor

@acroos acroos commented Oct 21, 2019

Make sure you have checked all steps below.

Jira

Description

Reads the value for replace_microseconds from the API's trigger_dag endpoint. This will allow the endpoint to trigger > 1 dag per second.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
  • TestApiExperimental.test_trigger_dag_for_date_with_replace_microseconds_false

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

@acroos acroos force-pushed the master branch 2 times, most recently from 666ab50 to 48e1602 Compare October 22, 2019 07:07
@codecov-io
Copy link

codecov-io commented Oct 22, 2019

Codecov Report

Merging #6380 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6380      +/-   ##
==========================================
- Coverage   83.83%   83.82%   -0.01%     
==========================================
  Files         651      651              
  Lines       37431    37437       +6     
==========================================
+ Hits        31379    31383       +4     
- Misses       6052     6054       +2
Impacted Files Coverage Δ
airflow/utils/strings.py 100% <100%> (ø) ⬆️
airflow/www/api/experimental/endpoints.py 88.48% <100%> (+0.24%) ⬆️
airflow/utils/dag_processing.py 58.15% <0%> (-0.33%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update baae140...1d6766e. Read the comment docs.

@acroos acroos force-pushed the master branch 3 times, most recently from 33354af to 608c57a Compare October 24, 2019 07:11
@acroos
Copy link
Contributor Author

acroos commented Oct 24, 2019

@ashb is it possible to re-run a step in Travis? the test that's failing on only one of the environments now doesn't seem to have anything to do with these changes

@ashb
Copy link
Member

ashb commented Oct 24, 2019

@acroos Re-run that stage. (It needs someone with write access to the repo to hit the button)

@@ -107,7 +107,7 @@ def trigger_dag(
:param run_id: ID of the dag_run
:param conf: configuration
:param execution_date: date of execution
:param replace_microseconds: whether microseconds should be zeroed
:param replace_microseconds: whether microseconds should be zeroed (ignored if execution_date is given)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it matters, but the view in airflow/www/api/experimental/endpoints.py doesn't pass through a replace_microseconds parameter, so there's no way of setting this to anything that the default True from the REST API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's true, maybe would be nice to add that as well.

@acroos
Copy link
Contributor Author

acroos commented Oct 25, 2019

@ashb any further changes I should make? or anything else I need to make this merge-able?

@acroos
Copy link
Contributor Author

acroos commented Oct 30, 2019

@ashb sorry to bother you, just wanted to check if there's anything else I should do so this can be merged?

@ashb
Copy link
Member

ashb commented Oct 30, 2019

Ack sorry, no this is just waiting on me (or another commiter!) to find time to review it again.

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code I think looks good, but we need a note in UPDATING.md about this change.

Is it possible to get the old behaviour back by explicitly passing replace_microsecond=True in the API call? If not could you please add a way to get the old behaviour back just in case someone is depending upon it.

@acroos
Copy link
Contributor Author

acroos commented Nov 4, 2019

@ashb can I clarify the exact behavior that you're expecting?

As far as I can tell there are 6 scenarios:

microseconds not replaced

  1. User gives execution_date, does not give replace_microseconds
    => given execution date with microseconds not replaced
  2. User gives execution_date, gives replace_microseconds=False
    => given execution date with microseconds not replaced
  3. User does not give execution_date, gives replace_microseconds=False
    => timezone.utcnow() with microseconds not replaced

microseconds replaced

  1. User does not give execution_date, does not give replace_microseconds
    => timezone.utcnow() with microseconds replaced
  2. User gives execution_date, gives replace_microseconds=True
    => given execution date with microseconds replaced
  3. User does not give execution_date, gives replace_microseconds=True
    => timezone.utcnow() with microseconds replaced

Are all of these scenarios correct?

The only non-obvious situation I guess is that not passing any value for replace_microseconds results in different outputs depending on if the execution_date was given or not

@acroos acroos changed the title [AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST reqeust [AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST request Nov 4, 2019
@deshraj
Copy link

deshraj commented Nov 13, 2019

@ashb @acroos is there a timeline for this fix so that it can be included in the next release? Would really appreciate if can be done sooner. :)

@acroos
Copy link
Contributor Author

acroos commented Nov 13, 2019

@deshraj yeah sorry I've been sick for a little while now, I'll try to fix it ASAP when I can

@deshraj
Copy link

deshraj commented Nov 13, 2019

No worries. Take care. :) Happy to help if needed.

UPDATING.md Outdated
The default behavior was to strip the microseconds (and milliseconds, etc) off of all dag runs triggered by
by the experimental REST API. The default behavior will change when an explicit execution_date is
passed in the request body. It will also now be possible to have the execution_date generated, but
keep the microseconds by sending `replace_microseconds: false` in the request body. The default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
keep the microseconds by sending `replace_microseconds: false` in the request body. The default
keep the microseconds by sending `replace_microseconds=false` in the request body. The default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 fixed

UPDATING.md Outdated
by the experimental REST API. The default behavior will change when an explicit execution_date is
passed in the request body. It will also now be possible to have the execution_date generated, but
keep the microseconds by sending `replace_microseconds: false` in the request body. The default
behavior can be overridden by sending `replace_microseconds: true` along with an explicit execution_date
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
behavior can be overridden by sending `replace_microseconds: true` along with an explicit execution_date
behavior can be overridden by sending `replace_microseconds=true` along with an explicit execution_date

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 fixed

@acroos
Copy link
Contributor Author

acroos commented Nov 18, 2019

oh no... that was not a successful rebase

Austin Roos added 2 commits November 18, 2019 15:53
Enable the caller of the trigger_dag API endpoint to specify a
boolean value for replace_microseconds.  When false, this will
store microseconds in the database for the execution_id, thus allowing
more than 1 request per second to be processed by the REST API
No need to require a user to pass in replace_microseconds to the
request body; instead we should use exactly the date that is given.
We will still replace the microseconds on execution_date if none is
passed in (and the param is True, which is the default)
A user can now choose to pass the execution_date in but still have
the microseconds replaced by also sending replace_microseconds=true
in the request body
@acroos acroos force-pushed the master branch 2 times, most recently from 1e8452d to 3b3d9fc Compare November 18, 2019 16:20
UPDATING.md Outdated
keep the microseconds by sending `replace_microseconds=false` in the request body. The default
behavior can be overridden by sending `replace_microseconds=true` along with an explicit execution_date

### Additional arguments passed to BaseOperator cause an exception
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you've got some other changes included here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm apparently? I guess when rebasing somehow picked up extra commits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

The only way that replace_microseconds should be accessible
is through the endpoints, so leave the testing there
@deshraj
Copy link

deshraj commented Nov 21, 2019

@ashb may I ask you about the release timeline for this feature?

@ashb
Copy link
Member

ashb commented Nov 22, 2019

@deshraj It will be in 1.10.7 which in ~1-2 months.

ashb pushed a commit to ashb/airflow that referenced this pull request Dec 18, 2019
… trigger_dag REST API (apache#6380)

No need to require a user to pass in replace_microseconds to the
request body; instead we should use exactly the date that is given.
We will still replace the microseconds on execution_date if none is
passed in (and the param is True, which is the default)

(cherry picked from commit a45a209)
ashb pushed a commit that referenced this pull request Dec 18, 2019
… trigger_dag REST API (#6380)

No need to require a user to pass in replace_microseconds to the
request body; instead we should use exactly the date that is given.
We will still replace the microseconds on execution_date if none is
passed in (and the param is True, which is the default)

(cherry picked from commit a45a209)
ashb pushed a commit that referenced this pull request Dec 19, 2019
… trigger_dag REST API (#6380)

No need to require a user to pass in replace_microseconds to the
request body; instead we should use exactly the date that is given.
We will still replace the microseconds on execution_date if none is
passed in (and the param is True, which is the default)

(cherry picked from commit a45a209)
kaxil pushed a commit that referenced this pull request Dec 19, 2019
… trigger_dag REST API (#6380)

No need to require a user to pass in replace_microseconds to the
request body; instead we should use exactly the date that is given.
We will still replace the microseconds on execution_date if none is
passed in (and the param is True, which is the default)

(cherry picked from commit a45a209)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants