Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-4363] Fix JSON encoding error #7628

Merged
merged 1 commit into from
Mar 25, 2020

Conversation

retornam
Copy link
Contributor

@retornam retornam commented Mar 5, 2020

From the docker-py code comments for APIClient pull,
the decode parameter should be set to True, when the
stream parameter is also set to True. This will allow
decoding JSON data returned from the docker registry
server into dicts

Signed-off-by: Raymond Etornam [email protected]


Issue link: AIRFLOW-4363

Make sure to mark the boxes below before creating PR: [x]

  • [x ] Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@retornam retornam force-pushed the retornam/AIRFLOW-4363 branch 5 times, most recently from c110a04 to e494cbf Compare March 6, 2020 17:55
From the docker-py code comments for APIClient pull,
the decode parameter should be set to True, when the
stream parameter is also set to True. This will allow
decoding JSON data returned from the docker registry
server into dicts

Signed-off-by: Raymond Etornam <[email protected]>
@dimberman dimberman merged commit 733d3d3 into apache:master Mar 25, 2020
kaxil pushed a commit that referenced this pull request Mar 25, 2020
From the docker-py code comments for APIClient pull,
the decode parameter should be set to True, when the
stream parameter is also set to True. This will allow
decoding JSON data returned from the docker registry
server into dicts

Signed-off-by: Raymond Etornam <[email protected]>
(cherry picked from commit 733d3d3)
kaxil pushed a commit to astronomer/airflow that referenced this pull request Mar 30, 2020
From the docker-py code comments for APIClient pull,
the decode parameter should be set to True, when the
stream parameter is also set to True. This will allow
decoding JSON data returned from the docker registry
server into dicts

Signed-off-by: Raymond Etornam <[email protected]>
(cherry picked from commit 733d3d3)
kaxil pushed a commit that referenced this pull request Mar 30, 2020
From the docker-py code comments for APIClient pull,
the decode parameter should be set to True, when the
stream parameter is also set to True. This will allow
decoding JSON data returned from the docker registry
server into dicts

Signed-off-by: Raymond Etornam <[email protected]>
(cherry picked from commit 733d3d3)
@brokenjacobs
Copy link
Contributor

brokenjacobs commented Apr 13, 2020

This breaks the pull functionality because the very next function in the loop tries to do a string decode on this dictionary.
Leading to:

    output = json.loads(l.decode('utf-8').strip())
AttributeError: 'dict' object has no attribute 'decode'```

Basically this entire PR breaks image pulls as written. 

@retornam retornam deleted the retornam/AIRFLOW-4363 branch April 13, 2020 23:48
@dimberman
Copy link
Contributor

Darn. @brokenjacobs do you have a suggestion for a fix for this?

@brokenjacobs
Copy link
Contributor

Testing a fix right now.

@retornam
Copy link
Contributor Author

#8287

@brokenjacobs
Copy link
Contributor

remove the entire json.loads() line... change the loop:
for output in self.cli.pull(self.image, stream=True, decode=True):
That fixes it.

@hagope
Copy link
Contributor

hagope commented Apr 20, 2020

After upgrading from 1.10.9 to 1.10.10, you may see the following error:

[2020-04-20 23:14:44,087] {{docker_operator.py:267}} INFO - Pulling docker image XXXX:latest
[2020-04-20 23:14:44,140] {{taskinstance.py:1145}} ERROR - 'dict' object has no attribute 'decode'
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/docker_operator.py", line 269, in execute
    output = json.loads(l.decode('utf-8').strip())
AttributeError: 'dict' object has no attribute 'decode'

You can patch this in your Dockerfile by adding the following (per @brokenjacobs above):

RUN sed -i '268s/l/output/' /usr/local/lib/python3.7/site-packages/airflow/operators/docker_operator.py
RUN sed -i '269s/output/#output/' /usr/local/lib/python3.7/site-packages/airflow/operators/docker_operator.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants