Use poetry for dependency management #76

Merged: maik-schmidt merged 19 commits into main from maik-poetry-setup on Mar 15, 2023

@maik-schmidt (Contributor) commented Feb 16, 2023

Description

This PR introduces Poetry for dependency management; see merantix-momentum/squirrel-core#111.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring including code style reformatting
  • Other (please describe):

Checklist:

  • I have read the contributing guideline doc (external contributors only)
  • Lint and unit tests pass locally with my changes
  • I have kept the PR small so that it can be easily reviewed
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • All dependency changes have been reflected in the pip requirement files.

TODO:

  • Bump version

@github-actions bot commented Feb 16, 2023

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

@maik-schmidt (Contributor, Author) commented:

I have read the CLA Document and I hereby sign the CLA

@maik-schmidt (Contributor, Author) commented:

recheck

@maik-schmidt (Contributor, Author) commented:

Package metadata diff

 {
-  "filename":"squirrel_datasets_core-0.2.0.tar.gz",
+  "filename":"squirrel_datasets_core-0.2.0.dev90727.tar.gz",
   "metadata_version":"2.1",
-  "name":"squirrel_datasets_core",
-  "version":"0.2.0",
+  "name":"squirrel-datasets-core",
+  "version":"0.2.0.dev90727",
   "summary":"Squirrel public datasets collection",
+  "home_page":"https://merantix-momentum.com/technology/squirrel/",
   "author":"Merantix Momentum",
   "license":"Apache 2.0",
   "classifiers":[
      "Development Status :: 5 - Production/Stable",
      "License :: OSI Approved :: Apache Software License",
-     "Programming Language :: Python :: 3.8",
+     "License :: Other/Proprietary License",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.9",
+     "Programming Language :: Python :: 3.10",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.9",
      "Typing :: Typed"
   ],
+  "requires_python":">=3.9,<3.12",
+  "requires_dist":[
+     "datasets (>=2.9.0,<3.0.0) ; extra == \"huggingface\" or extra == \"all\"",
+     "deeplake (>=3.2.7,<4.0.0) ; extra == \"deeplake\" or extra == \"all\"",
+     "docutils (>=0.17.1,<0.18.0)",
+     "fire (>=0.5.0,<0.6.0)",
+     "hub (>=3.0.1,<4.0.0) ; extra == \"hub\" or extra == \"all\"",
+     "pillow (>=9.4.0,<10.0.0)",
+     "pyspark (>=3.3.2,<4.0.0) ; extra == \"preprocessing\" or extra == \"all\"",
+     "scipy (>=1.10.0,<2.0.0)",
+     "squirrel-core[gcp,zarr] (==0.18.4.dev776)",
+     "torchvision (>=0.14.1,<0.15.0) ; extra == \"torchvision\" or extra == \"all\""
+  ],
+  "project_urls":[
+     "Documentation, https://squirrel-datasets-core.readthedocs.io/en/latest/",
+     "Repository, https://github.com/merantix-momentum/squirrel-datasets-core"
+  ],
   "provides_extras":[
-     "dev",
-     "preprocessing",
-     "torchvision",
-     "hub",
+     "all",
      "deeplake",
+     "hub",
      "huggingface",
-     "all"
+     "preprocessing",
+     "torchvision"
   ],
   "description_content_type":"text/markdown",
-  "description":"<div align=\"center\">\n  \n# <img src=\"docs/_static/logo.png\" width=\"150px\"> Squirrel Datasets Core\n  \n[![Python](https://img.shields.io/pypi/pyversions/squirrel-datasets-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-datasets-core)\n[![PyPI](https://badge.fury.io/py/squirrel-datasets-core.svg)](https://badge.fury.io/py/squirrel-datasets-core)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-datasets-core)](https://anaconda.org/conda-forge/squirrel-datasets-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-datasets-core/badge/?version=latest)](https://squirrel-datasets-core.readthedocs.io)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-datasets-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-datasets-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-datasets-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6420214.svg)](https://doi.org/10.5281/zenodo.6420214)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n## What is Squirrel Datasets Core?\n\n`squirrel-datasets-core` is an extension of the [Squirrel](https://github.com/merantix-momentum/squirrel-core) library. `squirrel-datasets-core` is a hub where the user can \n1) explore existing public datasets registered in the data mesh and load them with the ease and speed of `squirrel`\n2) preprocess their datasets and share them with other users. 
\n\nFor preprocessing, we currently support Spark as the main tool to carry out the task.\n\nIf you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)!\n\n## Installation\nInstall `squirrel-core` and `squirrel-datasets-core` with pip. Note that you can install with different dependencies based on your requirements for squirrel drivers.\nFor using the Torchvision driver call:\n```shell\npip install \"squirrel-core[torch]\"\npip install \"squirrel-datasets-core[torchvision]\"\n```\nFor using the Huggingface or Deeplake driver call:\n```shell\npip install \"squirrel-datasets-core[huggingface]\"\npip install \"squirrel-datasets-core[deeplake]\"\n```\nFor using the Spark preprocessing pipelines call:\n```shell\npip install \"squirrel-datasets-core[preprocessing]\"\n```\nIf you would like to get Squirrel\\'s full functionality, install squirrel-core and squirrel-datasets-core with all their dependencies.\n```shell\npip install \"squirrel-core[all]\"\npip install \"squirrel-datasets-core[all]\"\n```\n\n## Huggingface, Deeplake, Hub and Torchvision Integration\n\nA great feature of squirrel-datasets-core is that you can easily load data from common databases such as Huggingface, Activeloop Deeplake, Hub and Torchvision with one line of code. And you get to enjoy all of Squirrel’s benefits for free! 
Check out the [documentation](https://squirrel-datasets-core.readthedocs.io/en/latest/driver_integration.html) on how to interface with these libraries.\n```python\nfrom squirrel_datasets_core.driver.huggingface import HuggingfaceDriver\n\nit = HuggingfaceDriver(\"cifar100\").get_iter(\"train\").filter(custom_filter).map(custom_augmentation)\n\n# your train loop\nfor item in it:\n  out = model(item)\n  # ...\n```\n\n## Documentation\n\nVisit our documentation on [Readthedocs](https://squirrel-datasets-core.readthedocs.io).\n\n## Contributing\n`squirrel-datasets-core` is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-datasets-core.readthedocs.io/en/latest/contribute.html) to learn how to get involved. \nPlease follow our recommendations for best practices and code style. \n\n## The Humans behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel Datasets in your research, please cite Squirrel using:\n```bibtex\n@article{2022squirrelcore,\n  title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n  author={Squirrel Developer Team},\n  journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n  year={2022}\n}\n```\n"
+  "description":"<div align=\"center\">\n  \n# <img src=\"docs/_static/logo.png\" width=\"150px\"> Squirrel Datasets Core\n  \n[![Python](https://img.shields.io/pypi/pyversions/squirrel-datasets-core.svg?style=plastic)](https://badge.fury.io/py/squirrel-datasets-core)\n[![PyPI](https://badge.fury.io/py/squirrel-datasets-core.svg)](https://badge.fury.io/py/squirrel-datasets-core)\n[![Conda](https://img.shields.io/conda/vn/conda-forge/squirrel-datasets-core)](https://anaconda.org/conda-forge/squirrel-datasets-core)\n[![Documentation Status](https://readthedocs.org/projects/squirrel-datasets-core/badge/?version=latest)](https://squirrel-datasets-core.readthedocs.io)\n[![Downloads](https://static.pepy.tech/personalized-badge/squirrel-datasets-core?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads)](https://pepy.tech/project/squirrel-datasets-core)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://raw.githubusercontent.com/merantix-momentum/squirrel-datasets-core/main/LICENSE)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6420214.svg)](https://doi.org/10.5281/zenodo.6420214)\n[![Generic badge](https://img.shields.io/badge/Website-Merantix%20Momentum-blue)](https://merantix-momentum.com)\n[![Slack](https://img.shields.io/badge/slack-chat-green.svg?logo=slack)](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)\n\n</div>\n\n---\n## What is Squirrel Datasets Core?\n\n`squirrel-datasets-core` is an extension of the [Squirrel](https://github.com/merantix-momentum/squirrel-core) library. `squirrel-datasets-core` is a hub where the user can \n1) explore existing public datasets registered in the data mesh and load them with the ease and speed of `squirrel`\n2) preprocess their datasets and share them with other users. 
\n\nFor preprocessing, we currently support Spark as the main tool to carry out the task.\n\nIf you have any questions or would like to contribute, join our [Slack community](https://join.slack.com/t/squirrel-core/shared_invite/zt-14k6sk6sw-zQPHfqAI8Xq5WYd~UqgNFw)!\n\n## Installation\nInstall `squirrel-core` and `squirrel-datasets-core` with pip. Note that you can install with different dependencies based on your requirements for squirrel drivers.\nFor using the Torchvision driver call:\n```shell\npip install \"squirrel-core[torch]\"\npip install \"squirrel-datasets-core[torchvision]\"\n```\nFor using the Huggingface or Deeplake driver call:\n```shell\npip install \"squirrel-datasets-core[huggingface]\"\npip install \"squirrel-datasets-core[deeplake]\"\n```\nFor using the Spark preprocessing pipelines call:\n```shell\npip install \"squirrel-datasets-core[preprocessing]\"\n```\nIf you would like to get Squirrel\\'s full functionality, install squirrel-core and squirrel-datasets-core with all their dependencies.\n```shell\npip install \"squirrel-core[all]\"\npip install \"squirrel-datasets-core[all]\"\n```\n\n## Huggingface, Deeplake, Hub and Torchvision Integration\n\nA great feature of squirrel-datasets-core is that you can easily load data from common databases such as Huggingface, Activeloop Deeplake, Hub and Torchvision with one line of code. And you get to enjoy all of Squirrel’s benefits for free! 
Check out the [documentation](https://squirrel-datasets-core.readthedocs.io/en/latest/driver_integration.html) on how to interface with these libraries.\n```python\nfrom squirrel_datasets_core.driver.huggingface import HuggingfaceDriver\n\nit = HuggingfaceDriver(\"cifar100\").get_iter(\"train\").filter(custom_filter).map(custom_augmentation)\n\n# your train loop\nfor item in it:\n  out = model(item)\n  # ...\n```\n\n## Documentation\n\nVisit our documentation on [Readthedocs](https://squirrel-datasets-core.readthedocs.io).\n\n## Contributing\n`squirrel-datasets-core` is open source and community contributions are welcome!\n\nCheck out the [contribution guide](https://squirrel-datasets-core.readthedocs.io/en/latest/contribute.html) to learn how to get involved. \nPlease follow our recommendations for best practices and code style. \n\n## The Humans behind Squirrel\nWe are [Merantix Momentum](https://merantix-momentum.com/), a team of ~30 machine learning engineers, developing machine learning solutions for industry and research. Each project comes with its own challenges, data types and learnings, but one issue we always faced was scalable data loading, transforming and sharing. We were looking for a solution that would allow us to load the data in a fast and cost-efficient way, while keeping the flexibility to work with any possible dataset and integrate with any API. That\\'s why we build Squirrel – and we hope you\\'ll find it as useful as we do! By the way, [we are hiring](https://merantix-momentum.com/about#jobs)!\n\n\n## Citation\n\nIf you use Squirrel Datasets in your research, please cite Squirrel using:\n```bibtex\n@article{2022squirrelcore,\n  title={Squirrel: A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way.},\n  author={Squirrel Developer Team},\n  journal={GitHub. Note: https://github.com/merantix-momentum/squirrel-core},\n  year={2022}\n}\n```\n\n"
 }
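
A note for reading the `requires_dist` ranges above: bounds such as `>=2.9.0,<3.0.0` and `>=0.17.1,<0.18.0` are what Poetry's caret operator expands to, so they would be consistent with `pyproject.toml` entries like `^2.9.0` and `^0.17.1`. The `pyproject.toml` itself is not shown in this conversation, so that is an assumption. Likewise, the name change from `squirrel_datasets_core` to `squirrel-datasets-core` should not matter to installers, because both spellings normalize to the same name under PEP 503. The sketch below illustrates both points; `caret_to_range` and `normalize_name` are hypothetical helper names, not part of Poetry or this repository, and the caret helper only handles plain numeric versions.

```python
import re


def caret_to_range(constraint: str) -> str:
    """Expand a caret constraint such as "^2.9.0" into an explicit range.

    Hypothetical helper: the upper bound bumps the leftmost non-zero
    component (or the last given component if all are zero).
    """
    parts = [int(p) for p in constraint.lstrip("^").split(".")]
    # Pad to three components so "^1.2" is treated like "^1.2.0".
    padded = parts + [0] * (3 - len(parts))
    idx = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = padded[:idx] + [padded[idx] + 1] + [0] * (len(padded) - idx - 1)
    lower_s = ".".join(str(p) for p in padded)
    upper_s = ".".join(str(p) for p in upper)
    return f">={lower_s},<{upper_s}"


def normalize_name(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    # Ranges taken from the metadata diff; the caret forms are assumed.
    for caret in ("^2.9.0", "^0.17.1", "^1.10.0"):
        print(caret, "->", caret_to_range(caret))
    print(normalize_name("squirrel_datasets_core"))
```

With these inputs the helper reproduces the ranges listed for `datasets`, `docutils`, and `scipy` in the diff, which is why the caret reading seems plausible.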

@AlirezaSohofi requested review from winfried-ripken and removed request for AlirezaSohofi March 2, 2023 17:04

@winfried-ripken (Contributor) left a comment

Thank you, LG and happy to try this out. And sorry for the late review

@maik-schmidt merged commit cc6f891 into main Mar 15, 2023
@maik-schmidt deleted the maik-poetry-setup branch March 15, 2023 08:27
@github-actions bot locked and limited conversation to collaborators Mar 15, 2023