Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor metadata #126

Merged
merged 20 commits into from
Dec 20, 2022
Merged

Refactor metadata #126

merged 20 commits into from
Dec 20, 2022

Conversation

isabelizimm
Copy link
Contributor

@isabelizimm isabelizimm commented Nov 14, 2022

  • Sets package name in each handler, which allows create_meta to live solely in BaseHandler, rather than in each handler. This also removed the need for describe in each handler.
  • On pinning, creates vetiver_meta inside the metadata, which is not loaded on VetiverModel creation, fixing the bug where user data is duplicated. On the R side, a list is pinned with metadata, rather than the model being pinned, and loading the metadata.
  • Records package versions of required_pkgs on VetiverModel creation
  • Adds more tests for metadata

@isabelizimm isabelizimm marked this pull request as ready for review November 18, 2022 19:31
@isabelizimm isabelizimm changed the title [WIP] rework metadata Refactor metadata Nov 18, 2022
Copy link
Contributor

@machow machow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some questions about the .pkg attribute. Let's spend some of our pairing time today going over how metadata is being created <-> stored!

vetiver/handlers/statsmodels.py Outdated Show resolved Hide resolved
vetiver/vetiver_model.py Outdated Show resolved Hide resolved
vetiver/vetiver_model.py Outdated Show resolved Hide resolved
vetiver/vetiver_model.py Show resolved Hide resolved
@machow
Copy link
Contributor

machow commented Nov 30, 2022

One last high-level thought: I just realized you could stash the model inside a dictionary (like R vetiver does). joblib should be able to serialize / deserialize a dictionary where one field is a model. (But also this way of handling the metadata seems totally reasonable!).

@isabelizimm
Copy link
Contributor Author

One last high-level thought: I just realized you could stash the model inside a dictionary (like R vetiver does). joblib should be able to serialize / deserialize a dictionary where one field is a model. (But also this way of handling the metadata seems totally reasonable!).

This would make sense if I wanted to mimic R and move ptype and required_pkgs into the dictionary as well (effectively removing the need for vetiver_meta inside a pin). Although, I could pin just a dictionary with just the model, nothing else (unclear what value that would add?). Downfalls of this is that you would have to load the entire joblib file to be able to see the ptype and required_pkgs, which might be useful to be able to see without loading this file.

rstudio/vetiver-r#116 is related to this.

@machow
Copy link
Contributor

machow commented Dec 1, 2022

Thanks -- that's helpful context around why you have done it this way!

Copy link
Contributor

@machow machow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is shaping up--thanks for being patient with all the metadata comments 😵‍💫. I left some note along two themes:

  • can we move any metadata logic (including dealing with metadata dictionaries) into VetiverMeta? Can VetiverModel ensure its metadata attr is a VetiverMeta (and add a type hint)?
  • the pkg logic right now is a bit tricky, since a parent class expects it, but a child class might define it as a class attribute. It might be helpful to move this piece into a separate PR to get the metadata stuff in.

vetiver/meta.py Outdated Show resolved Hide resolved
vetiver/meta.py Outdated Show resolved Hide resolved
vetiver/pin_read_write.py Outdated Show resolved Hide resolved
vetiver/handlers/base.py Outdated Show resolved Hide resolved
vetiver/server.py Outdated Show resolved Hide resolved

return meta
pip_name = self.pip_name if hasattr(self, "pip_name") else None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to handle the pkg weirdness, the pkg attribute was removed. if there is a pip name, it will be handled as expected, otherwise it will set to None.

@isabelizimm isabelizimm merged commit 17c3fd4 into main Dec 20, 2022
@isabelizimm isabelizimm deleted the metadata branch May 25, 2023 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants