feat: added method to load pretrained models from huggingface #790

Merged 8 commits on May 20, 2024

Marsmaennchen221
Contributor

Summary of Changes

feat: added `NeuralNetworkClassifier.load_pretrained_model` and `NeuralNetworkRegressor.load_pretrained_model` to load pretrained models from Hugging Face. Currently, only image models are supported.
feat: added `ModelImageSize`, `ConstantImageSize`, and `VariableImageSize`
feat: added support for `NeuralNetworkRegressor` with images of variable size. With a `VariableImageSize`, the model supports any image whose height and/or width is a multiple of the `VariableImageSize`.
feat: added `NeuralNetworkClassifier.input_size` and `NeuralNetworkRegressor.input_size`
feat: changed `Column.get_distinct_values` to keep the order of values in the column
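
The variable-size rule in the summary can be illustrated with a small sketch. It assumes one reading of the rule: an image is accepted when both its width and its height are positive multiples of the base size (a factor of 1 included). `SizeSketch` and `fits_variable_size` are hypothetical names for illustration only, not the Safe-DS classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeSketch:
    """Hypothetical stand-in for an image size; not the Safe-DS class."""

    width: int
    height: int


def fits_variable_size(image: SizeSketch, base: SizeSketch) -> bool:
    """Return True if both image dimensions are positive multiples of the base size.

    Assumption: this is one interpretation of the rule stated in the summary.
    """
    return (
        image.width > 0
        and image.height > 0
        and image.width % base.width == 0
        and image.height % base.height == 0
    )


base = SizeSketch(width=4, height=4)
accepted = fits_variable_size(SizeSketch(width=8, height=12), base)
rejected = fits_variable_size(SizeSketch(width=6, height=8), base)
```

Here `accepted` is `True` because 8 and 12 are both multiples of 4, and `rejected` is `False` because 6 is not.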

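
As an illustration of what "keep order of values" means for distinct values, here is a minimal sketch in plain Python. It is not the Safe-DS implementation; `distinct_in_order` is a hypothetical helper that relies on the fact that `dict` preserves insertion order.

```python
from collections.abc import Hashable, Iterable


def distinct_in_order(values: Iterable[Hashable]) -> list[Hashable]:
    """Return each value once, in the order of its first occurrence."""
    # dict keys are unique and keep insertion order, so later duplicates are dropped.
    return list(dict.fromkeys(values))


result = distinct_in_order(["b", "a", "b", "c", "a"])
```

With this input, `result` is `["b", "a", "c"]`, whereas a plain `set` would give no ordering guarantee.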
@Marsmaennchen221 Marsmaennchen221 changed the title feat: added NeuralNetworkClassifier.load_pretrained_model and NeuralNetworkRegressor.load_pretrained_model to load pretrained models from huggingface feat: added load for pretrained models from huggingface May 19, 2024
@Marsmaennchen221 Marsmaennchen221 changed the title feat: added load for pretrained models from huggingface feat: added method to load pretrained models from huggingface May 19, 2024
Contributor

github-actions bot commented May 19, 2024

🦙 MegaLinter status: ✅ SUCCESS

| Descriptor | Linter | Files | Fixed | Errors | Elapsed time |
|---|---|---|---|---|---|
| ✅ PYTHON | black | 20 | 0 | 0 | 1.56s |
| ✅ PYTHON | mypy | 20 | | 0 | 3.23s |
| ✅ PYTHON | ruff | 20 | 0 | 0 | 0.32s |
| ✅ REPOSITORY | git_diff | yes | | no | 0.32s |

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff

MegaLinter is graciously provided by OX Security

@Marsmaennchen221
Contributor Author

@lars-reimann do you have an idea how to test the pretrained models? Currently, those methods are not being tested. Maybe the same approach as in Safe-DS/Datasets#164 could work, but I would appreciate it if we could do this at a later date.

codecov bot commented May 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.76%. Comparing base (4a17f76) to head (e3b5df0).
Report is 72 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #790      +/-   ##
==========================================
+ Coverage   97.74%   97.76%   +0.01%     
==========================================
  Files         109      111       +2     
  Lines        5641     5691      +50     
==========================================
+ Hits         5514     5564      +50     
  Misses        127      127              
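
A note on the arithmetic in the diff above: coverage is hits divided by lines, and the displayed figures match truncation (not rounding) to two decimals. That would also explain why the delta reads +0.01% although 97.76 minus 97.74 is 0.02. The sketch below reproduces the numbers under that assumption. The truncation rule is inferred from this report and is not confirmed Codecov behavior, and `truncated_percent` is a hypothetical helper.

```python
from fractions import Fraction
from math import floor


def truncated_percent(ratio: Fraction) -> str:
    """Format a non-negative ratio as a percentage truncated to two decimals."""
    hundredths = floor(ratio * 10000)  # percentage expressed in hundredths of a percent
    return f"{hundredths // 100}.{hundredths % 100:02d}"


base = Fraction(5514, 5641)  # hits / lines on the base commit
head = Fraction(5564, 5691)  # hits / lines on the head commit

base_text = truncated_percent(base)          # "97.74"
head_text = truncated_percent(head)          # "97.76"
delta_text = truncated_percent(head - base)  # "0.01"
```

The exact delta is about 0.0198 percentage points, so truncating the exact difference gives 0.01, while subtracting the two truncated figures would give 0.02.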


@lars-reimann
Member

lars-reimann commented May 19, 2024

> do you have an idea how to test the pretrained models?

Not right now, no.

> Currently, those methods are not being tested. Maybe the same approach as in Safe-DS/Datasets#164 could work

We've already used all our space for caches in this project (10GB), so that won't help us.

> but I would appreciate it if we could do this at a later date.

Sure, no problem. You can add a comment `# pragma: no cover` to the loading methods.
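
For reference, a minimal sketch of the suggestion above. coverage.py skips lines matching `# pragma: no cover` by default, and when the comment sits on a `def` line the whole function body is excluded from the report. The function name mirrors the PR, but the body and the `huggingface_repo` parameter are placeholders, not the Safe-DS implementation.

```python
def load_pretrained_model(huggingface_repo: str) -> dict[str, str]:  # pragma: no cover
    """Placeholder for a loader that would need network access in tests."""
    # Assumption: the real method downloads a model; this stub only echoes its input.
    return {"repo": huggingface_repo}
```

Because the exclusion is per function, the rest of the class stays subject to the normal coverage check.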

@Gerhardsa0
Contributor

> @lars-reimann do you have an idea how to test the pretrained models? Currently, those methods are not being tested. Maybe the same approach as in Safe-DS/Datasets#164 could work, but I would appreciate it if we could do this at a later date.

To test these models, it may be enough for now to check that the structure is correct.

As an example, we could load a model into Safe-DS and then check whether it was loaded correctly: the layers, the sizes of the layers, the layer types, etc.
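
A structural check of this kind could look roughly like the sketch below. It assumes the loaded model can be summarized as a list of layer descriptions; `LayerSpec` and `structure_mismatches` are hypothetical names, and how such a summary would be extracted from a Safe-DS model is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSpec:
    """Hypothetical summary of one layer: its type and its input/output sizes."""

    kind: str
    input_size: int
    output_size: int


def structure_mismatches(actual: list[LayerSpec], expected: list[LayerSpec]) -> list[str]:
    """Return human-readable differences between two layer lists (empty if identical)."""
    problems: list[str] = []
    if len(actual) != len(expected):
        problems.append(f"expected {len(expected)} layers, found {len(actual)}")
    # Compare layers pairwise; zip stops at the shorter list, the length check covers the rest.
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            problems.append(f"layer {index}: expected {want}, found {got}")
    return problems


expected = [LayerSpec("convolution", 3, 16), LayerSpec("linear", 16, 10)]
loaded = [LayerSpec("convolution", 3, 16), LayerSpec("linear", 16, 10)]
problems = structure_mismatches(loaded, expected)
```

A test would then assert that `problems` is empty. This checks only the architecture, not the weights, so it needs no reference predictions.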

Member

@lars-reimann lars-reimann left a comment

Looks great, thanks.

@lars-reimann lars-reimann merged commit dd8394b into main May 20, 2024
12 checks passed
@lars-reimann lars-reimann deleted the huggingface_interface branch May 20, 2024 07:39
lars-reimann pushed a commit that referenced this pull request May 29, 2024
## [0.26.0](v0.25.0...v0.26.0) (2024-05-29)

### Features

* `Table.count_row_if` ([#788](#788)) ([4137131](4137131)), closes [#786](#786)
* added method to load pretrained models from huggingface ([#790](#790)) ([dd8394b](dd8394b))
* infer input size of forward and LSTM layers ([#808](#808)) ([098a07f](098a07f))
* outline around dots of scatterplot ([#785](#785)) ([ee8acf7](ee8acf7))
* remove output conversions ([#792](#792)) ([46f2f5d](46f2f5d)), closes [#732](#732)
* shorten some excessively long names ([#787](#787)) ([1c3ea59](1c3ea59)), closes [#772](#772)
* specify column names in constructor of table transformers ([#795](#795)) ([69a780c](69a780c))
* store window size and forecast horizon in dataset ([#794](#794)) ([f07bc5a](f07bc5a))
* string operations on cells ([#791](#791)) ([4a17f76](4a17f76))

### Bug Fixes

* handling of boolean columns in column statistics ([#778](#778)) ([f61cceb](f61cceb))
* sort x values of line plot ([#782](#782)) ([74d8649](74d8649))
@lars-reimann
Member

🎉 This PR is included in version 0.26.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label May 29, 2024