Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-5136] Fix Bug with Incorrect template_fields in DataProc{*} … #5751

Merged
merged 1 commit into from
Aug 8, 2019

Conversation

kaxil
Copy link
Member

@kaxil kaxil commented Aug 7, 2019

…Operators

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Looks like a bad cherry-pick: a87310b

Traceback (most recent call last):
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/dagbag.py", line 389, in collect_dags safe_mode=safe_mode)
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/dagbag.py", line 253, in process_file self.bag_dag(dag, parent_dag=dag, root_dag=dag)
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/dagbag.py", line 339, in bag_dag self.bag_dag(subdag, parent_dag=dag, root_dag=root_dag)
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/dagbag.py", line 326, in bag_dag dag.resolve_template_files()
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/dag.py", line 706, in resolve_template_files t.resolve_template_files()
File "/usr/local/virtualenv/airflow/local/lib/python2.7/site-packages/airflow/models/baseoperator.py", line 699, in resolve_template_files content = getattr(self, attr)
AttributeError: 'DataProcPySparkOperator' object has no attribute 'dataproc_pyspark_jars'

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

@kaxil kaxil requested a review from potiuk August 7, 2019 16:11
@kaxil
Copy link
Member Author

kaxil commented Aug 7, 2019

Targeted towards 1.10-test branch

@OmerJog
Copy link
Contributor

OmerJog commented Aug 7, 2019

@kaxil I think we are missing a test like this:
https://github.com/apache/airflow/blob/master/tests/contrib/operators/test_gcp_compute_operator.py#L71-L93
to confirm that templated fields are working and actual templating is being done
Such test would have prevented the bad cherry picked doesn't it?

@potiuk
Copy link
Member

potiuk commented Aug 7, 2019

Yeah. @OmerJog is right. We will add tests for that for all GCP operators. We are in the process of unifying all GCP operators so I added https://issues.apache.org/jira/browse/AIRFLOW-5137 to cover that.

@ryanyuan
Copy link
Contributor

ryanyuan commented Aug 7, 2019

@kaxil @potiuk @ashb Any ideas on why the changes in dataproc_operator.py in my GCP DLP commit?

@potiuk
Copy link
Member

potiuk commented Aug 8, 2019

@ryanyuan - what do you mean ?

@ryanyuan
Copy link
Contributor

ryanyuan commented Aug 8, 2019

@ryanyuan - what do you mean ?

@potiuk Forgot to paste a reference. Please see the following code.

https://github.com/apache/airflow/blame/222c6ac45de54ed0398645c1d456a592e194325b/airflow/contrib/operators/dataproc_operator.py#L20

A lot of changes to that file were put into my DLP commit.

@potiuk
Copy link
Member

potiuk commented Aug 8, 2019

@ryanyuan From what I see it's quite OK. 222c6ac is a cherry-pick from master to v1-10-test branch. It contains the same changes as the original commit 6ef0e37 but applied to v1-10-test branch (some conflicts were resolved). In case you are not aware - the v1-10-test branch is completely separated from master - the current process we have with it that we select commits that we want to merge to 1.10.* and cherry-pick them to that branch. Those two branches are hundreds of commits different (but similar enough that cherry-picks were usually easy so far). What you likely see is the result of diff of your "master" with the v1-10-test branch. I hope it explains it :)

@potiuk
Copy link
Member

potiuk commented Aug 8, 2019

And this commit here is against v1-10-test branch directly to fix the problems with bad merge - where templated field names changed :(

@kaxil kaxil merged commit 4be73e0 into apache:v1-10-test Aug 8, 2019
@kaxil
Copy link
Member Author

kaxil commented Aug 8, 2019

@OmerJog - Yes you are right. I am merging this one as it is as now we have an issue open for adding test to all operators for templated fields which would be target to Master + Cherrypicked to v1-10-test

@ryanyuan - Jarek explained it well. It is just a case of bad cherry-pick that happens as both branches are diverging. We should add tests that cover it like Omer mentioned.

@kaxil kaxil deleted the fix-dataproc-issue branch August 8, 2019 13:17
@ryanyuan
Copy link
Contributor

ryanyuan commented Aug 9, 2019

@potiuk That makes sense. Thanks for the explanation!

@ghost
Copy link

ghost commented Aug 27, 2019

@kaxil , how soon will this fix make it to a 1.10.x release? Will it have to wait until 1.10.5 is released? Currently dataproc integration is broken in 1.10.4.

@kaxil
Copy link
Member Author

kaxil commented Aug 27, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants