Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load Pyodide runtime through HTTP #2451

Merged
merged 9 commits into from
Aug 9, 2024
Merged

Conversation

garrettgu10
Copy link
Collaborator

@garrettgu10 garrettgu10 commented Jul 26, 2024

This loads the Pyodide bundle through HTTP. It does this in a synchronous way, by creating a new kj::Thread and immediately disowning it, causing the current thread to be blocked on the completion of the new thread.

When a disk cache directory is provided at the command line using pyodide-bundle-disk-cache, the directory is used to cache pyodide bundles.

Existing test cases have been edited to use the pyodide.capnp.bin file generated through bazel, so changes in src/pyodide will be reflected instantly if tested through bazel test .... Otherwise, the correct version according to compat date will be fetched from the internet.

Here's a plan for how to test the disk cache writing (reading is covered by bazel tests) functionality:

mkdir /tmp/pyodide
./bazel-bin/src/workerd/server/workerd serve samples/pyodide/config.capnp --experimental --pyodide-bundle-disk-cache-dir=/tmp/pyodide --verbose
# should print loading pyodide package from internet
./bazel-bin/src/workerd/server/workerd serve samples/pyodide/config.capnp --experimental --pyodide-bundle-disk-cache-dir=/tmp/pyodide --verbose
# should no longer print loading pyodide package from internet & just work

@garrettgu10 garrettgu10 requested review from a team as code owners July 26, 2024 20:21
@garrettgu10 garrettgu10 requested review from dom96 and fhanau July 26, 2024 20:21
@garrettgu10 garrettgu10 marked this pull request as draft July 29, 2024 14:23
@jasnell jasnell added the python Issues/PRs relating to Python Workers label Jul 31, 2024
@garrettgu10 garrettgu10 changed the title [do not merge] Load Pyodide runtime through HTTP Load Pyodide runtime through HTTP Aug 5, 2024
@garrettgu10 garrettgu10 marked this pull request as ready for review August 5, 2024 21:20
@garrettgu10
Copy link
Collaborator Author

One point of jank is that the tip-of-tree pyodide bundle is still named pyodide-2.capnp.bin since the version number is still hardcoded in workerd. Ideally we want this to be named pyodide-latest.capnp.bin or something similar but this would require us to finish the story on compat-date -> pyodide version mapping.

…k cache, using generated tip-of-tree pyodide to pre-populate disk cache
@garrettgu10
Copy link
Collaborator Author

garrettgu10 commented Aug 5, 2024

Another thing we need to figure out before merging -- how can we determine if a worker is a Python worker or not? We need to know this so that we don't fetch the Pyodide bundle when the worker doesn't need it.

@hoodmane
Copy link
Contributor

hoodmane commented Aug 6, 2024

Since it's blocking, why not call fetchPyodideBundle in workerd-api just before it's needed?

src/pyodide/BUILD.bazel Outdated Show resolved Hide resolved
@dom96
Copy link
Collaborator

dom96 commented Aug 6, 2024

finish the story on compat-date -> pyodide version mapping.

This story is complete with #2464. You should be able to use it to determine the bundle filename to download.

how can we determine if a worker is a Python worker or not?

By checking the WorkerBundle. We've got the hasPythonModule for this in EW which you should be able to recreate here.

@garrettgu10
Copy link
Collaborator Author

garrettgu10 commented Aug 6, 2024

@dom96 ah, makes sense. We should probably copy pyodide-2.capnp.bin to pyodide-0.26.0a2.capnp.bin on our bucket so we can use that. I'll do that

Copy link
Collaborator

@dom96 dom96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I think you should integrate PythonSnapshotRelease into here before merging. You probably want to move getPythonSnapshotRelease out of EW and into workerd, then use it here to get the right Python version to load.

src/workerd/server/tests/python/BUILD.bazel Show resolved Hide resolved
src/workerd/server/tests/python/BUILD.bazel Outdated Show resolved Hide resolved
src/workerd/server/tests/python/BUILD.bazel Outdated Show resolved Hide resolved
src/workerd/server/workerd-api.c++ Show resolved Hide resolved
@hoodmane
Copy link
Contributor

hoodmane commented Aug 7, 2024

Could you split "refactor bazel test setup" out into a separate PR?

@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch 3 times, most recently from e9d6b11 to c312b50 Compare August 7, 2024 19:14
@garrettgu10 garrettgu10 requested a review from dom96 August 7, 2024 20:25
src/workerd/server/tests/python/BUILD.bazel Outdated Show resolved Hide resolved
src/workerd/server/tests/python/BUILD.bazel Outdated Show resolved Hide resolved
src/workerd/server/workerd-api.c++ Outdated Show resolved Hide resolved
src/workerd/server/workerd-api.c++ Outdated Show resolved Hide resolved
src/workerd/server/workerd-api.c++ Outdated Show resolved Hide resolved
src/workerd/server/workerd-api.c++ Outdated Show resolved Hide resolved
Comment on lines +542 to +544
if (fetchPyodideBundle(impl->pythonConfig, "dev"_kj)) {
modules->addBuiltinBundle(getPyodideBundle("dev"_kj), kj::none);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to select between "dev" and the appropriate version by setting an appropriate compat flag so that tests can be a bit more declarative. I think it would be confusing if you try to test a specific version but the test uses latest because it exists. But we can deal with that in a followup.

}
}

bool fetchPyodideBundle(const api::pyodide::PythonConfig& pyConfig, kj::StringPtr version) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be tidier to have fetchPyodideBundle return a kj::Maybe<> directly, but we can tidy up that sort of thing in followup PRs.

Copy link
Contributor

@hoodmane hoodmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @garrettgu10! Let's merge this.

@garrettgu10 garrettgu10 merged commit 5c21d27 into main Aug 9, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Issues/PRs relating to Python Workers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants