Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not override default in spark hook with default in operator #8508

Merged
merged 1 commit into from
Apr 22, 2020
Merged

Do not override default in spark hook with default in operator #8508

merged 1 commit into from
Apr 22, 2020

Conversation

r-richmond
Copy link
Contributor

This re-implements the same change as #6680

My goal here is to get this change in the next 1.10.* release

Please let me know if there is a better way to do this


Make sure to mark the boxes below before creating PR: [x]


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@Spasnof
Copy link

Spasnof commented Apr 22, 2020

To add a bit of background @r-richmond and I ran into this issue today upgrading from 1.10.2 to 1.10.10 and relying on a spark2-submit that was set in our connection to set the behavior for 150+ operator calls in our environment.

The "normal" order of operations for most operators (pun not intended) appears to be:

  1. Operator arg values (local to dag)
  2. Connection arg values (local to environment)
  3. Global parameter values (fallback)

In this case we have the connection_id and spark_binary args competing for the same value. And the operator will always win if set and it is set by default. So we are faced with a couple unpleasant options:

  • Thinly wrapping the implementation in our own operator
  • Setting the operator value in the 150+ occurrences.
  • Monkey patching and linking back to this pr.

I hope this offers some background into the dilemma of having connection vars overwritten by operator defaults.

@r-richmond
Copy link
Contributor Author

Failure looks like a dockerfile lint failure and isn't related to this change

@potiuk
Copy link
Member

potiuk commented Apr 22, 2020

Please rebase to latest master. As announced on the devlist/slack we fixed the problem with docker lint (resulted from release of hadolint with new tests). It's been fixed and pinned so that it won't happen again, but you need to rebase to latest master to get it fixed.

@r-richmond
Copy link
Contributor Author

This change is already present in master; it just wasn’t present on this branch or in the latest release. (Note this file has moved in master as well)

So I don’t think it makes sense to rebase on master; right?

@kaxil kaxil changed the base branch from v1-10-stable to v1-10-test April 22, 2020 11:40
@kaxil kaxil merged commit c9c475f into apache:v1-10-test Apr 22, 2020
@kaxil kaxil added this to the Airflow 1.10.11 milestone Apr 22, 2020
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Mar 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants