Use py-rattler to fetch repodata in proxy mode #677

Open: beenje wants to merge 7 commits into main

Contributor

@beenje beenje commented Dec 1, 2023

As mentioned in #660, downloading repodata from a remote server for proxy channels is quite slow right now.

This is a proposal to use py-rattler to download repodata, as it's very efficient.

I couldn't find a way to access the full JSON file from the SparseRepoData returned by fetch_repo_data,
so I look up the JSON file on disk and return it. Maybe there is a better way.

I haven't added any tests yet. Waiting for some feedback first.

@wolfv, any opinion?

@beenje beenje marked this pull request as draft December 1, 2023 14:26
@beenje beenje changed the title from "Draft: Use py-rattler to fetch repodata in proxy mode" to "Use py-rattler to fetch repodata in proxy mode" Dec 1, 2023
@baszalmstra
Contributor

@Wackyator Maybe you could think about an API change that would let us get the entire JSON?

@@ -106,6 +114,32 @@ def json(self):
        return json.load(self.file)


def download_repodata(repository: RemoteRepository, channel: str, platform: str):
Collaborator

Since this function is basically a wrapper around fetch_repo_data, I'd keep it asynchronous and move the asyncio.run(...) call into its caller.

Contributor Author

Changed. I then used aiofiles to make the file operations async as well.

Collaborator

Awesome!
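
The structure agreed on in this review thread can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `fake_fetch_repo_data` is a hypothetical stand-in for py-rattler's `fetch_repo_data` (it only writes a small `repodata.json` into a cache directory), the `download_repodata` signature is simplified, and `asyncio.to_thread` replaces aiofiles for the file read so the sketch needs only the standard library.

```python
import asyncio
import json
import tempfile
from pathlib import Path


async def fake_fetch_repo_data(cache_dir: Path, channel: str, platform: str) -> Path:
    """Hypothetical stand-in for the real fetch: writes repodata.json to the cache."""
    await asyncio.sleep(0)  # pretend to do network I/O
    target = cache_dir / channel / platform / "repodata.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {"info": {"subdir": platform}, "packages": {}}
    target.write_text(json.dumps(payload))
    return target


async def download_repodata(cache_dir: Path, channel: str, platform: str) -> dict:
    """Stay async: await the fetch, then read the cached JSON off the event loop."""
    json_path = await fake_fetch_repo_data(cache_dir, channel, platform)
    text = await asyncio.to_thread(json_path.read_text)
    return json.loads(text)


def sync_caller(channel: str, platform: str) -> dict:
    """The synchronous caller owns the event loop via asyncio.run(...)."""
    with tempfile.TemporaryDirectory() as tmp:
        return asyncio.run(download_repodata(Path(tmp), channel, platform))


if __name__ == "__main__":
    print(sync_caller("conda-forge", "linux-64")["info"]["subdir"])
```

Keeping the wrapper a coroutine means it can be awaited directly from other async code, while purely synchronous call sites pay for `asyncio.run(...)` only once at the boundary.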

@codecov-commenter

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (0b49467) 83.61% compared to head (5d59e0c) 83.66%.

| Files | Patch % | Lines |
|---|---|---|
| quetz/tasks/mirror.py | 83.33% | 6 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #677      +/-   ##
==========================================
+ Coverage   83.61%   83.66%   +0.04%     
==========================================
  Files          79       79              
  Lines        6233     6264      +31     
==========================================
+ Hits         5212     5241      +29     
- Misses       1021     1023       +2     

@beenje beenje marked this pull request as ready for review December 3, 2023 20:23
@beenje
Contributor Author

beenje commented Dec 4, 2023

For testing purposes, I added the test-server directory from the rattler repo.

@beenje
Contributor Author

beenje commented Dec 7, 2023

@ivergara are you expecting anything more from me on this PR?

@ivergara
Collaborator

ivergara commented Dec 7, 2023

> @ivergara are you expecting anything more from me on this PR?

No, all looks good from my side. Hopefully, more people chime in before it gets approved and merged.

CC @janjagusch

@janjagusch janjagusch added the enhancement New feature or request label Dec 7, 2023
rattler is very efficient at downloading repodata
serve_repo_data fixture copied from rattler
dummy_remote_session_object wasn't cleaning up after itself
(using return instead of yield)
Test with migration failed with:
Error: The action 'Testing server' has timed out after 5 minutes.
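
The fixture fix mentioned in the commit messages above (`yield` instead of `return`) follows a general pattern: code placed after `yield` runs as teardown, whereas a fixture that returns has nowhere to clean up. A minimal sketch, assuming a hypothetical `active_sessions` registry rather than the actual `dummy_remote_session_object` fixture; `contextlib.contextmanager` stands in for `@pytest.fixture` here so the example runs without pytest:

```python
from contextlib import contextmanager

# Hypothetical registry standing in for whatever the real fixture creates or patches.
active_sessions = []


def leaky_session():
    """return-style: the resource is handed out, but nothing ever removes it."""
    session = {"name": "dummy"}
    active_sessions.append(session)
    return session


@contextmanager  # with pytest, this decorator would be @pytest.fixture
def clean_session():
    """yield-style: everything after the yield runs as teardown."""
    session = {"name": "dummy"}
    active_sessions.append(session)
    try:
        yield session
    finally:
        active_sessions.remove(session)


def run_demo():
    """Return how many sessions each style leaves behind."""
    active_sessions.clear()
    leaky_session()
    leaked = len(active_sessions)

    active_sessions.clear()
    with clean_session() as session:
        assert session in active_sessions  # available while the "test" runs
    cleaned = len(active_sessions)
    return leaked, cleaned
```

Calling `run_demo()` shows the difference: the return-style helper leaves its session registered, while the yield-style one removes it once the `with` block exits.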
@beenje
Contributor Author

beenje commented Jun 4, 2024

I rebased to fix conflicts.

I had to increase the CI timeout (the first run failed with "Error: The action 'Testing server' has timed out after 5 minutes.").

Hope this can be merged or closed.
I'd like to finish #675 (which is based on this PR) as repodata.json.zst is currently never updated.
