Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-4763] Allow list in DockerOperator.command #5408

Merged
merged 2 commits into from
Jul 19, 2019
Merged

Conversation

akki
Copy link
Contributor

@akki akki commented Jun 12, 2019

https://issues.apache.org/jira/browse/AIRFLOW-4763 - in case the issue seems valid to the maintainers, here is the fix.

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-XXX
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

@codecov-io
Copy link

codecov-io commented Jun 12, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@979f452). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #5408   +/-   ##
=========================================
  Coverage          ?   79.13%           
=========================================
  Files             ?      488           
  Lines             ?    30608           
  Branches          ?        0           
=========================================
  Hits              ?    24222           
  Misses            ?     6386           
  Partials          ?        0
Impacted Files Coverage Δ
airflow/operators/docker_operator.py 96.59% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 979f452...ba0a1c2. Read the comment docs.

Copy link
Member

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of..

def get_command(self):
        if self.command is not None and isinstance(self.command, str) and self.command.strip().find('[') == 0:
            commands = ast.literal_eval(self.command)
        else:
            commands = self.command

Also see this why I used isinstance instead of type.

@akki
Copy link
Contributor Author

akki commented Jun 23, 2019

@feluelle Ahh, good point - just realised the redundancy. But I also see that the self.command is not None part is redundant as well in the newly suggested implemented self.command is not None and isinstance(self.command, str) and self.command.strip().find('[') == 0, so I am finally changing this to:

def get_command(self):
        if isinstance(self.command, str) and self.command.strip().find('[') == 0:
            commands = ast.literal_eval(self.command)
        else:
            commands = self.command

Please have a look and let me know if you see any concerns. Thanks for the review. :)

Copy link
Member

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@akki
Copy link
Contributor Author

akki commented Jun 26, 2019

@feluelle Thanks for the approval. I just wanted to confirm that this PR is not waiting for any action from my side.
This is my first commit to Airflow so can you please let me know if I have missed some step - sorry for the delay.

@feluelle
Copy link
Member

Hey @akki you didn't miss any steps. It just needs to be reviewed by a committer. Maybe @kaxil or @mik-laj has some time to review it.

@@ -247,7 +247,7 @@ def execute(self, context):
if self.xcom_all else line.encode('utf-8')

def get_command(self):
if self.command is not None and self.command.strip().find('[') == 0:
if isinstance(self.command, str) and self.command.strip().find('[') == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this and self.command.strip().find('[') == 0?

Just trying to understand the logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When Jinja returns the "[" sign, an array is created. Please look at related PR: https://github.com//pull/4900

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jinja would convert a list to list. self.command.strip().find('[') == 0 is checking a string, right? and checking if the first character is [.

Sorry, maybe I am missing something. @mik-laj Feel free to merge this PR :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a string is encountered, but the string looks like it contains arrays (starts with the character "[") it is processed by literal_parse, which creates the real table from the string.

>>> import ast
>>> ast.literal_eval('["A", "B"]')
['A', 'B']
>>>

Currently, Jinja always accepts a string at the input and returns a string. However, the mentioned PR discussed an alternative solution to the one used in this PR.

I can not do it at the moment. I just wanted to answer your question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaxil @mik-laj So would you like to see any changes in this PR or does it have your approval?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @mik-laj for the explanation. I understand it now.

@akki
Copy link
Contributor Author

akki commented Jul 9, 2019

Hi, does anyone have any suggestion for this PR?
Else, would any of the committers approve/merge it?

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kaxil kaxil merged commit 2b94600 into apache:master Jul 19, 2019
@kaxil
Copy link
Member

kaxil commented Jul 19, 2019

Thank you @akki for your contribution and apologies for having to wait this long.

wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
@chbrown
Copy link

chbrown commented Aug 20, 2019

Ran into this bug on my very first custom DAG after going through the tutorials 😣 thanks @akki 👏

FWIW here's my monkeypatch workaround (until the better fix in this PR is released... 1.10.5?):

from functools import partialmethod
from airflow.operators.docker_operator import DockerOperator
def get_command(operator):
    return operator.command  # assumes command is already a proper list
DockerOperator.get_command = partialmethod(get_command)

kaxil pushed a commit that referenced this pull request Aug 20, 2019
@akki akki deleted the patch-2 branch August 21, 2019 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants