Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching file/dir metadata for GSClient #275

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Gilthans
Copy link
Contributor

This is an attempt to resolve #176 .
It's a first attempt, I'd mainly like to know if this is what the maintainers had in mind. If it is, I can add some tests - which begs the question, what tests would we like to see?

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2022

Codecov Report

Merging #275 (10bb071) into master (b7f6010) will decrease coverage by 0.3%.
The diff coverage is 96.0%.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #275     +/-   ##
========================================
- Coverage    94.8%   94.4%   -0.4%     
========================================
  Files          20      20             
  Lines        1314    1347     +33     
========================================
+ Hits         1246    1272     +26     
- Misses         68      75      +7     
Impacted Files Coverage Δ
cloudpathlib/local/implementations/gs.py 96.9% <50.0%> (-3.1%) ⬇️
cloudpathlib/gs/gsclient.py 93.7% <97.9%> (-1.0%) ⬇️
cloudpathlib/s3/s3client.py 92.9% <0.0%> (-2.7%) ⬇️

@Gilthans
Copy link
Contributor Author

@pjbull friendly ping :)

@pjbull
Copy link
Member

pjbull commented Oct 20, 2022

Thanks @Gilthans, nice work on a first pass on this. Definitely helpful for thinking about the problem.

Caching is a big enough feature/change that we'd prefer to spec it out before thinking about implementation. For example, are we confident we can implement it the same way across backends? What settings/config/env vars do we provide to end users to control behavior? Is this better implemented in the *Client or *Path objects? What exactly goes in the cache? Is the cache limited in size? Do we provide APIs for clearing the cache?

I think all of these are answerable, but worth making a plan ahead of time to make sure they don't catch us out later and increase the maintenance burden.

@Gilthans
Copy link
Contributor Author

Sure, that does sound reasonable. How does such a discussion happen currently? Is it done strictly by the maintainer team, or is it an open discussion in an issue?
I'd love to get the ball rolling since this feature would be very helpful in our project

@pjbull
Copy link
Member

pjbull commented Nov 2, 2022

@Gilthans We can do this collaboratively in an issue that is opened specifically for metadata caching. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speed up is_dir on entries returned by iterdir and glob
3 participants