udf: add dependency list in comment block? #237
Comments
Just recently, "PEP 722 Dependency specification for single-file scripts" was initiated; see https://discuss.python.org/t/pep-722-dependency-specification-for-single-file-scripts/29905 for the discussion. I haven't read much into the state of that proposal, but we might want to align with it for UDF dependency declarations. |
Just checked on this again, and apparently PEP 722 (Dependency specification for single-file scripts) was rejected.
However, PEP 723 (Inline script metadata) is accepted, so it is an interesting option to consider for this issue. It would look like this (comment at top of UDF file):
|
pip is now available in the image, so this would work:
|
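A minimal sketch of what a pip-based install could look like, assuming the packages are installed into a dedicated folder with `pip install --target` and that folder is then added to `sys.path`. The function names and the `udf-deps` folder are hypothetical; this is an illustration, not the backend's actual implementation.

```python
import subprocess
import sys
from pathlib import Path


def build_pip_install_command(dependencies, target):
    """Build a `pip install --target` command for the given requirement strings."""
    # Use the current interpreter's pip so the packages match its Python version.
    return [sys.executable, "-m", "pip", "install", "--target", str(target), *dependencies]


def install_udf_dependencies(dependencies, target):
    """Install the dependencies into `target` and make them importable (hypothetical helper)."""
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if pip exits with a non-zero status.
    subprocess.run(build_pip_install_command(dependencies, target), check=True)
    if str(target) not in sys.path:
        sys.path.insert(0, str(target))


if __name__ == "__main__":
    # Only print the command here; nothing is installed.
    print(build_pip_install_command(["duviz"], "udf-deps"))
```

Installing into a separate folder keeps the UDF's packages apart from the packages that ship with the image.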
Implementation proposal:
Advantage:
|
The basics are now in place. The UDF has a comment at the top which declares some expected packages like this (PEP 723):
# /// script
# dependencies = ["duviz"]
# ///
This duviz is just a random small package that has nothing to do with EO or GIS. It is automatically installed at the start of the batch job in a folder |
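For illustration, here is a rough sketch of how the dependency list could be read from such a comment block. The regular expression is modelled on the one in the PEP 723 reference implementation; the function name `extract_udf_dependencies` is hypothetical and this is not necessarily how the backend does it. It assumes Python 3.11+ for `tomllib` (with the `tomli` backport as fallback).

```python
import re

try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Python versions

# Matches a "# /// <type>" ... "# ///" comment block (modelled on PEP 723).
_BLOCK_REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"


def extract_udf_dependencies(udf_code: str) -> list:
    """Return the `dependencies` declared in the `script` metadata block, if any."""
    blocks = [m for m in re.finditer(_BLOCK_REGEX, udf_code) if m.group("type") == "script"]
    if not blocks:
        return []
    if len(blocks) > 1:
        raise ValueError("Multiple 'script' metadata blocks found")
    # Drop the leading "# " (or bare "#") of each line to recover plain TOML.
    content = "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in blocks[0].group("content").splitlines(keepends=True)
    )
    return tomllib.loads(content).get("dependencies", [])


udf_code = '''
# /// script
# dependencies = ["duviz"]
# ///
import xarray

def apply_datacube(cube: xarray.DataArray, context: dict) -> xarray.DataArray:
    return cube
'''
print(extract_udf_dependencies(udf_code))
```

Because the block is parsed as TOML, multi-line lists and `#` comments inside the list are handled too.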
next steps:
|
Some notes about
Current implementation does something like with this illustration (in
There is a variation with
However the
So |
Verified that it works with compiled packages and GitHub zip archives as well, e.g. used this UDF:
udf_code = """
# /// script
# dependencies = [
#     # A GitHub zip archive based dependency:
#     "duviz @ https://github.com/soxofaan/duviz/archive/refs/tags/v3.2.0.zip",
#     # Rust-based compiled package
#     "ruff",
# ]
# ///
import re
import xarray
import duviz
import ruff

def apply_datacube(cube: xarray.DataArray, context: dict) -> xarray.DataArray:
    # Get a step size based on the number of things in the imported deps
    step = sum("u" in x for x in dir(duviz)) + len(dir(ruff))
    # Zero out pixels every `step` along x axis
    cube[{"x": slice(None, None, step)}] = 0
    return cube
""" |
More in-depth documentation has been added to https://open-eo.github.io/openeo-python-client/udf.html#udf-dependency-management |
status update with some remaining subtasks:
|
This morning, I verified that it now also works on the Terrascope deployment (dev). |
Integration tests are now in place for both Terrascope and CDSE. I'm going to close this ticket. Additional tickets were created for the remaining work. |
what about:
https://datashim-io.github.io/datashim/Archive-based-Datasets/