Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-4438] Add Gzip compression to S3_hook #7680

Merged
merged 1 commit into from
Mar 10, 2020
Merged

Conversation

OmairK
Copy link
Contributor

@OmairK OmairK commented Mar 10, 2020

  • Added bool parameter gzip to load_file to s3_hook
  • Tested the load_file with load_file_gzip unittest
  • Updated the load_file docstring to reflect the extra parameter

Issue link: AIRFLOW-4438

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Commit message/PR title starts with [AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
  • Unit tests coverage for changes (not needed for documentation changes)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

* For document-only changes commit message can start with [AIRFLOW-XXXX].


In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the provider:amazon-aws AWS/Amazon - related issues label Mar 10, 2020
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one nit-picking. If you correct it - we can merge it :)

@@ -442,6 +445,9 @@ def load_file(self,
:param encrypt: If True, the file will be encrypted on the server-side
by S3 and will be stored in an encrypted form while at rest in S3.
:type encrypt: bool
:param gzip: If True, the file will be compressed on the server-side
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:param gzip: If True, the file will be compressed on the server-side
:param gzip: If True, the file will be compressed locally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I will resolve this

 - Added bool parameter gzip to load_file to s3_hook
 - Tested the load_file with load_file_gzip unittest
 - Updated the load_file docstring to reflect the extra parameter
@codecov-io
Copy link

Codecov Report

Merging #7680 into master will decrease coverage by 22.52%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #7680       +/-   ##
===========================================
- Coverage   86.99%   64.47%   -22.53%     
===========================================
  Files         904      903        -1     
  Lines       43728    43723        -5     
===========================================
- Hits        38043    28189     -9854     
- Misses       5685    15534     +9849
Impacted Files Coverage Δ
airflow/providers/amazon/aws/hooks/s3.py 96.46% <100%> (+0.12%) ⬆️
...low/contrib/operators/wasb_delete_blob_operator.py 0% <0%> (-100%) ⬇️
airflow/contrib/hooks/vertica_hook.py 0% <0%> (-100%) ⬇️
airflow/contrib/sensors/__init__.py 0% <0%> (-100%) ⬇️
airflow/hooks/mssql_hook.py 0% <0%> (-100%) ⬇️
...viders/docker/example_dags/example_docker_swarm.py 0% <0%> (-100%) ⬇️
airflow/hooks/webhdfs_hook.py 0% <0%> (-100%) ⬇️
airflow/contrib/sensors/emr_base_sensor.py 0% <0%> (-100%) ⬇️
...irflow/contrib/operators/slack_webhook_operator.py 0% <0%> (-100%) ⬇️
...providers/google/cloud/example_dags/example_dlp.py 0% <0%> (-100%) ⬇️
... and 487 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6af02f...89b6207. Read the comment docs.

@potiuk potiuk merged commit b7cdda1 into apache:master Mar 10, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Mar 10, 2020

Awesome work, congrats on your first merged pull request!

@potiuk
Copy link
Member

potiuk commented Mar 10, 2020

Thanks @OmairK !

@OmairK
Copy link
Contributor Author

OmairK commented Mar 10, 2020

Your welcome @potiuk 😄 . Thanks a lot for guiding me through my first PR.

@thierryturpin
Copy link

Hello, @OmairK

I'm afraid this isn't working like it should. When you pass a string filename, how can filename.name be derived? https://github.com/apache/airflow/pull/7680/files#diff-411e5140735ea97206a32c7b390e2800R459

Or I'm misunderstanding how this should work.

thx for update.

potiuk pushed a commit that referenced this pull request Apr 27, 2020
Fixes a bug introduced in #7680 with passing filename as string
@OmairK
Copy link
Contributor Author

OmairK commented Apr 27, 2020

@thierryturpin Thanks for pointing it out, you are right about this I overlooked this silly mistake. I have made the changes and PR has been merged #8571

kaxil pushed a commit that referenced this pull request Nov 19, 2020
Fixes a bug introduced in #7680 with passing filename as string

(cherry picked from commit 74bc316)
kaxil pushed a commit that referenced this pull request Nov 19, 2020
Fixes a bug introduced in #7680 with passing filename as string

(cherry picked from commit 74bc316)
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Mar 5, 2021
Fixes a bug introduced in apache#7680 with passing filename as string

(cherry picked from commit 74bc316)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants