Use poetry for dependency management #76
Merged
CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅
I have read the CLA Document and I hereby sign the CLA
recheck
Package metadata diff {
- "filename":"squirrel_datasets_core-0.2.0.tar.gz",
+ "filename":"squirrel_datasets_core-0.2.0.dev90727.tar.gz",
"metadata_version":"2.1",
- "name":"squirrel_datasets_core",
- "version":"0.2.0",
+ "name":"squirrel-datasets-core",
+ "version":"0.2.0.dev90727",
"summary":"Squirrel public datasets collection",
+ "home_page":"https://merantix-momentum.com/technology/squirrel/",
"author":"Merantix Momentum",
"license":"Apache 2.0",
"classifiers":[
"Development Status :: 5 - Production/Stable",
"License :: OSI Approved :: Apache Software License",
- "Programming Language :: Python :: 3.8",
+ "License :: Other/Proprietary License",
+ "Programming Language :: Python :: 3",
+ "Programming Language :: Python :: 3.9",
+ "Programming Language :: Python :: 3.10",
+ "Programming Language :: Python :: 3.11",
+ "Programming Language :: Python :: 3.9",
"Typing :: Typed"
],
+ "requires_python":">=3.9,<3.12",
+ "requires_dist":[
+ "datasets (>=2.9.0,<3.0.0) ; extra == \"huggingface\" or extra == \"all\"",
+ "deeplake (>=3.2.7,<4.0.0) ; extra == \"deeplake\" or extra == \"all\"",
+ "docutils (>=0.17.1,<0.18.0)",
+ "fire (>=0.5.0,<0.6.0)",
+ "hub (>=3.0.1,<4.0.0) ; extra == \"hub\" or extra == \"all\"",
+ "pillow (>=9.4.0,<10.0.0)",
+ "pyspark (>=3.3.2,<4.0.0) ; extra == \"preprocessing\" or extra == \"all\"",
+ "scipy (>=1.10.0,<2.0.0)",
+ "squirrel-core[gcp,zarr] (==0.18.4.dev776)",
+ "torchvision (>=0.14.1,<0.15.0) ; extra == \"torchvision\" or extra == \"all\""
+ ],
+ "project_urls":[
+ "Documentation, https://squirrel-datasets-core.readthedocs.io/en/latest/",
+ "Repository, https://github.com/merantix-momentum/squirrel-datasets-core"
+ ],
"provides_extras":[
- "dev",
- "preprocessing",
- "torchvision",
- "hub",
+ "all",
"deeplake",
+ "hub",
"huggingface",
- "all"
+ "preprocessing",
+ "torchvision"
],
"description_content_type":"text/markdown",
- "description":"<div align=\"center\">\n \n# <img src=\"docs/_static/logo.png\" width=\"150px\"> Squirrel Datasets Core\n \n[![Python](https://img.shields.io/pypi/pyversions/squirrel-datasets-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-datasets-core)\n[![PyPI](https://badge.fury.io/py/squirrel-datasets-core.svg)](https://badge.fury.io/py/squirrel-datasets-core)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-datasets-core)](https://anaconda.org/conda-forge/squirrel-datasets-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-datasets-core/badge/?version=latest)](https://squirrel-datasets-core.readthedocs.io)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-datasets-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-datasets-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-datasets-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6420214.svg)](https://doi.org/10.5281/zenodo.6420214)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n## What is Squirrel Datasets Core?\n\n`squirrel-datasets-core` is an extension of the [Squirrel](https://github.com/merantix-momentum/squirrel-core) library. `squirrel-datasets-core` is a hub where the user can \n1) explore existing public datasets registered in the data mesh and load them with the ease and speed of `squirrel`\n2) preprocess their datasets and share them with other users. 
\n\nFor preprocessing, we currently support Spark as the main tool to carry out the task.\n\nIf you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)!\n\n## Installation\nInstall `squirrel-core` and `squirrel-datasets-core` with pip. Note that you can install with different dependencies based on your requirements for squirrel drivers.\nFor using the Torchvision driver call:\n```shell\npip install \"squirrel-core[torch]\"\npip install \"squirrel-datasets-core[torchvision]\"\n```\nFor using the Huggingface or Deeplake driver call:\n```shell\npip install \"squirrel-datasets-core[huggingface]\"\npip install \"squirrel-datasets-core[deeplake]\"\n```\nFor using the Spark preprocessing pipelines call:\n```shell\npip install \"squirrel-datasets-core[preprocessing]\"\n```\nIf you would like to get Squirrel\\'s full functionality, install squirrel-core and squirrel-datasets-core with all their dependencies.\n```shell\npip install \"squirrel-core[all]\"\npip install \"squirrel-datasets-core[all]\"\n```\n\n## Huggingface, Deeplake, Hub and Torchvision Integration\n\nA great feature of squirrel-datasets-core is that you can easily load data from common databases such as Huggingface, Activeloop Deeplake, Hub and Torchvision with one line of code. And you get to enjoy all of Squirrel’s benefits for free! 
Check out the [documentation](https://squirrel-datasets-core.readthedocs.io/en/latest/driver_integration.html) on how to interface with these libraries.\n```python\nfrom squirrel_datasets_core.driver.huggingface import HuggingfaceDriver\n\nit = HuggingfaceDriver(\"cifar100\").get_iter(\"train\").filter(custom_filter).map(custom_augmentation)\n\n# your train loop\nfor item in it:\n out = model(item)\n # ...\n```\n\n## Documentation\n\nVisit our documentation on [Readthedocs](https://squirrel-datasets-core.readthedocs.io).\n\n## Contributing\n`squirrel-datasets-core` is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-datasets-core.readthedocs.io/en/latest/contribute.html) to learn how to get involved. \nPlease follow our recommendations for best practices and code style. \n\n## The Humans behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel Datasets in your research, please cite Squirrel using:\n```bibtex\n@article{2022squirrelcore,\n title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n author={Squirrel Developer Team},\n journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n year={2022}\n}\n```\n"
+ "description":"<div align=\"center\">\n \n# <img src=\"docs/_static/logo.png\" width=\"150px\"> Squirrel Datasets Core\n \n[![Python](https://img.shields.io/pypi/pyversions/squirrel-datasets-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-datasets-core)\n[![PyPI](https://badge.fury.io/py/squirrel-datasets-core.svg)](https://badge.fury.io/py/squirrel-datasets-core)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-datasets-core)](https://anaconda.org/conda-forge/squirrel-datasets-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-datasets-core/badge/?version=latest)](https://squirrel-datasets-core.readthedocs.io)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-datasets-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-datasets-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-datasets-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6420214.svg)](https://doi.org/10.5281/zenodo.6420214)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n## What is Squirrel Datasets Core?\n\n`squirrel-datasets-core` is an extension of the [Squirrel](https://github.com/merantix-momentum/squirrel-core) library. `squirrel-datasets-core` is a hub where the user can \n1) explore existing public datasets registered in the data mesh and load them with the ease and speed of `squirrel`\n2) preprocess their datasets and share them with other users. 
\n\nFor preprocessing, we currently support Spark as the main tool to carry out the task.\n\nIf you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)!\n\n## Installation\nInstall `squirrel-core` and `squirrel-datasets-core` with pip. Note that you can install with different dependencies based on your requirements for squirrel drivers.\nFor using the Torchvision driver call:\n```shell\npip install \"squirrel-core[torch]\"\npip install \"squirrel-datasets-core[torchvision]\"\n```\nFor using the Huggingface or Deeplake driver call:\n```shell\npip install \"squirrel-datasets-core[huggingface]\"\npip install \"squirrel-datasets-core[deeplake]\"\n```\nFor using the Spark preprocessing pipelines call:\n```shell\npip install \"squirrel-datasets-core[preprocessing]\"\n```\nIf you would like to get Squirrel\\'s full functionality, install squirrel-core and squirrel-datasets-core with all their dependencies.\n```shell\npip install \"squirrel-core[all]\"\npip install \"squirrel-datasets-core[all]\"\n```\n\n## Huggingface, Deeplake, Hub and Torchvision Integration\n\nA great feature of squirrel-datasets-core is that you can easily load data from common databases such as Huggingface, Activeloop Deeplake, Hub and Torchvision with one line of code. And you get to enjoy all of Squirrel’s benefits for free! 
Check out the [documentation](https://squirrel-datasets-core.readthedocs.io/en/latest/driver_integration.html) on how to interface with these libraries.\n```python\nfrom squirrel_datasets_core.driver.huggingface import HuggingfaceDriver\n\nit = HuggingfaceDriver(\"cifar100\").get_iter(\"train\").filter(custom_filter).map(custom_augmentation)\n\n# your train loop\nfor item in it:\n out = model(item)\n # ...\n```\n\n## Documentation\n\nVisit our documentation on [Readthedocs](https://squirrel-datasets-core.readthedocs.io).\n\n## Contributing\n`squirrel-datasets-core` is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-datasets-core.readthedocs.io/en/latest/contribute.html) to learn how to get involved. \nPlease follow our recommendations for best practices and code style. \n\n## The Humans behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel Datasets in your research, please cite Squirrel using:\n```bibtex\n@article{2022squirrelcore,\n title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n author={Squirrel Developer Team},\n journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n year={2022}\n}\n```\n\n"
}
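The `requires_dist` entries in the diff above attach each optional dependency to its own extra and to `all` through an environment marker. As a rough illustration only (the helper `group_by_extra` and the `(core)` label are hypothetical names, not part of the package), the following sketch groups the listed requirements by the extras named in their markers, using only the standard library:

```python
import re
from collections import defaultdict

# Requirement strings copied from the "requires_dist" list in the diff above.
REQUIRES_DIST = [
    'datasets (>=2.9.0,<3.0.0) ; extra == "huggingface" or extra == "all"',
    'deeplake (>=3.2.7,<4.0.0) ; extra == "deeplake" or extra == "all"',
    'docutils (>=0.17.1,<0.18.0)',
    'fire (>=0.5.0,<0.6.0)',
    'hub (>=3.0.1,<4.0.0) ; extra == "hub" or extra == "all"',
    'pillow (>=9.4.0,<10.0.0)',
    'pyspark (>=3.3.2,<4.0.0) ; extra == "preprocessing" or extra == "all"',
    'scipy (>=1.10.0,<2.0.0)',
    'squirrel-core[gcp,zarr] (==0.18.4.dev776)',
    'torchvision (>=0.14.1,<0.15.0) ; extra == "torchvision" or extra == "all"',
]

# Matches every `extra == "name"` clause in a marker.
EXTRA_RE = re.compile(r'extra\s*==\s*"([^"]+)"')


def group_by_extra(requirements):
    """Map each extra name to the packages it pulls in (hypothetical helper)."""
    groups = defaultdict(list)
    for req in requirements:
        spec, _, marker = req.partition(";")
        name = spec.split("(")[0].strip()  # drop the version constraint
        extras = EXTRA_RE.findall(marker)
        if not extras:
            # No marker: the dependency is always installed.
            groups["(core)"].append(name)
        for extra in extras:
            groups[extra].append(name)
    return dict(groups)


groups = group_by_extra(REQUIRES_DIST)
for extra in sorted(groups):
    print(f"{extra}: {', '.join(groups[extra])}")
```

The extras recovered this way match the new `provides_extras` list, which no longer contains `dev`.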
AlirezaSohofi requested review from winfried-ripken and removed request for AlirezaSohofi March 2, 2023 17:04
winfried-ripken approved these changes Mar 13, 2023
Thank you, LG and happy to try this out. And sorry for the late review
Description
This PR introduces poetry, see merantix-momentum/squirrel-core#111
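The version ranges in the metadata diff, such as `docutils (>=0.17.1,<0.18.0)` and `datasets (>=2.9.0,<3.0.0)`, have the shape that Poetry produces when it expands caret constraints like `^0.17.1` or `^2.9.0`. The `pyproject.toml` from this PR is not shown here, so the caret notation is an assumption. The hypothetical helper `caret_to_range` below sketches the expansion rule (bump the leftmost non-zero component, zero out the rest) for plain numeric versions:

```python
def caret_to_range(version: str) -> str:
    """Expand a caret constraint such as ^0.17.1 into an explicit range.

    Hypothetical helper for illustration; handles plain numeric versions only.
    """
    parts = [int(p) for p in version.split(".")]
    for i, part in enumerate(parts):
        if part != 0:
            # Bump the leftmost non-zero component and zero out what follows.
            upper = parts[:i] + [part + 1] + [0] * (len(parts) - i - 1)
            break
    else:
        # All components are zero: bump the last one.
        upper = parts[:-1] + [parts[-1] + 1]
    return f">={version},<{'.'.join(str(p) for p in upper)}"


for v in ("2.9.0", "0.17.1", "0.5.0", "9.4.0"):
    print(f"^{v} -> {caret_to_range(v)}")
```

Under this rule a `0.x` dependency is only allowed to move within its minor series, which is why `fire` and `torchvision` are capped at the next minor version while `scipy` and `pillow` are capped at the next major one.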
Type of change
Checklist:
TODO: