Reduce s3hook memory usage #37886

Merged
merged 3 commits into from
Mar 6, 2024

@ellisms (Contributor) commented Mar 4, 2024


Closes #35449
This PR caches the S3 resource as a property for the functions that still use Resources. I ran a simple loop of hook.get_bucket() over 5 buckets and captured memory data with memray: memory usage went from 34.5 MB without caching to 11.5 MB with the cached property. This PR doesn't remove the use of Resources, since that would be a breaking change for anyone calling hook.get_bucket() or hook.get_key() directly.
I also cleaned up some of the other typing code and imports.
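
For readers unfamiliar with the pattern, here is a minimal sketch of caching a resource as a property, assuming Python's standard functools.cached_property. It is not the code from this PR: FakeS3Resource and SketchS3Hook are hypothetical stand-ins so the example runs without boto3 or AWS credentials, and the real hook may be structured differently.

```python
from functools import cached_property


class FakeS3Resource:
    """Hypothetical stand-in for an S3 resource object (assumption, not boto3)."""

    instances_created = 0  # counts how many resource objects were built

    def __init__(self):
        FakeS3Resource.instances_created += 1

    def Bucket(self, name):
        # A real resource would return a Bucket object; a string is enough here.
        return f"Bucket({name})"


class SketchS3Hook:
    """Hypothetical hook used for illustration; not the actual S3Hook."""

    @cached_property
    def resource(self):
        # Built on first access, then stored on the instance and reused.
        return FakeS3Resource()

    def get_bucket(self, bucket_name):
        # Every call goes through the same cached resource object.
        return self.resource.Bucket(bucket_name)


hook = SketchS3Hook()
buckets = [hook.get_bucket(f"bucket-{i}") for i in range(5)]
```

In this sketch, the five get_bucket() calls construct a single FakeS3Resource. Without the cached property, each call would build a new one, which is the kind of repeated allocation the memory numbers above point at.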

@boring-cyborg boring-cyborg bot added area:providers provider:amazon-aws AWS/Amazon - related issues labels Mar 4, 2024
@eladkal (Contributor) commented Mar 4, 2024

> This PR doesn't address removing the use of Resources, since it would cause a breaking change for anyone directly calling hook.get_bucket() or hook.get_key().

If we think removing it is the right call, we can do so in a breaking-change release.

@ferruzzi (Contributor) left a comment

It's a good start. I'd love to see Resources replaced with Clients some day to bring it in line with the other hooks, but that caching is nice and I like the cleanup work.

@Taragolis (Contributor)

> It's a good start. I'd love to see Resources replaced with Clients

As far as I can see, this PR does not replace boto3.resource with boto3.client; in general, though, it should reduce memory usage a bit.

@vincbeck vincbeck merged commit e7214fd into apache:main Mar 6, 2024
55 checks passed
howardyoo pushed a commit to howardyoo/airflow that referenced this pull request Mar 18, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
Development

Successfully merging this pull request may close these issues.

Reduce memory usage in S3Hook
7 participants