
[FOR DISCUSSION] Automatically capture params on object construction #4413

Open
wants to merge 1 commit into main

Conversation

matt-gardner
Contributor

As I was writing parts of the guide, one thing that bothered me was that we don't have a good story around model saving and loading when you're not using config files. Also, I saw examples of huggingface code where you could just call .save_pretrained() on your in-python object and have things just work. For us to do that, we have to be able to save objects for which we don't have implementation code, but I think we can still do it. I got to thinking about this problem this morning, and was playing around with something that actually works, pretty simply.

Basic idea: use a single metaclass on FromParams so that all FromParams objects (including subclasses) get a _params field storing the config they would have been created with. The model can then implement a simple .save() method that saves _params as a json object and calls the archive code. We could probably even remove the standalone archive code altogether and have it live as methods on Model, since this change would make that much easier.
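
Something like this minimal, hypothetical sketch of the metaclass (the class bodies and the JSON-only save_config are illustrative only, and it ignores nested objects and the archive code):

```python
import inspect
import json


class _CaptureInitParams(type):
    """Metaclass that records the arguments each instance was constructed with."""

    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        # Bind positional args to parameter names so the captured dict looks
        # like the dict you would have written in a config file.
        bound = inspect.signature(cls.__init__).bind(instance, *args, **kwargs)
        bound.apply_defaults()
        params = dict(bound.arguments)
        params.pop("self", None)
        instance._params = params
        return instance


class FromParams(metaclass=_CaptureInitParams):
    pass


class Model(FromParams):
    def __init__(self, hidden_size: int = 128, dropout: float = 0.1):
        self.hidden_size = hidden_size
        self.dropout = dropout

    def save_config(self, path: str) -> None:
        # Only handles JSON-serializable arguments; the real version would
        # also recurse into nested FromParams objects and call the archive code.
        with open(path, "w") as f:
            json.dump(self._params, f, indent=2)


model = Model(hidden_size=256)
model.save_config("config.json")  # writes {"hidden_size": 256, "dropout": 0.1}
```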

I implemented a quick prototype that works for simple classes. I put it as a pull request instead of an issue so that the code could be more easily commented on. There are three major hurdles that I see to getting this to actually work, which may or may not be surmountable:

  1. We allow subclasses to take **kwargs, and inspect the superclass to grab those arguments. This one should be pretty straightforward - we can directly use the existing logic to get the parameter list, and that might just be enough by itself (there's a sketch of this after the list).

  2. Classes that use non-standard constructors, like from_partial_objects, or what we do in Vocabulary. We would have to know a priori which things are registered that way, and override them accordingly. This might be possible if we have this metaclass inspect the Registry somehow.

  3. We need to know which parameters don't need to get saved in the config file. Things like the Vocabulary for the Model object (because it's actually specified higher up in the config). We don't have a way of programmatically detecting this, and I'm not sure how to do it. The goal of this is just for use in saving models, so we could feasibly special-case a thing or two (we'd want to serialize the vocabulary separately anyway), but that's not ideal. This one needs some more thought.
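
For hurdle 1, here's a rough sketch of the kind of signature merging that would be needed, assuming (as the existing from-params logic does) that a **kwargs in a subclass constructor gets forwarded to the superclass constructor. The helper name is made up:

```python
import inspect
from typing import Dict


def constructor_params(cls: type) -> Dict[str, inspect.Parameter]:
    """Collect constructor parameters, following **kwargs up the MRO."""
    params: Dict[str, inspect.Parameter] = {}
    saw_kwargs = False
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if param.kind == inspect.Parameter.VAR_KEYWORD:
            saw_kwargs = True
            continue
        params[name] = param
    if saw_kwargs:
        # Assume **kwargs is forwarded to the next class in the MRO, and merge
        # in that class's constructor parameters too.
        for base in cls.__mro__[1:]:
            if base is object:
                continue
            for name, param in constructor_params(base).items():
                params.setdefault(name, param)
            break
    return params
```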

A possible side-benefit of this is being able to pickle and recover some objects for use in multi-processing using just the saved _params.

@epwalsh, @dirkgr, what do you think?

@epwalsh
Member

epwalsh commented Jun 29, 2020

> For us to do that, we have to be able to save objects for which we don't have implementation code ...

I'm not sure I follow this; can you give me an example where we wouldn't have implementation code? Is this why we can't just use pickle?

@matt-gardner
Contributor Author

> For us to do that, we have to be able to save objects for which we don't have implementation code ...

> I'm not sure I follow this; can you give me an example where we wouldn't have implementation code? Is this why we can't just use pickle?

This works in huggingface's code because everything relies on an opaque config object. We wanted to avoid that, using typed python objects directly in constructors instead of some untyped config. The tricky thing is that if you don't have a blanket config object, you need some other way of saying what state needs to be saved.

You're right that the way I said this the first time was wrong.

@epwalsh
Member

epwalsh commented Jun 29, 2020

Gotcha.

And what would happen with the vocab when you call Model.save()? Would you have to serialize the vocab separately? Then I imagine you would have to load the vocab separately and pass it when calling Model.load(vocab=vocab) or something, right?

@dirkgr
Member

dirkgr commented Jun 29, 2020

Why can't pickle do this? Or dill? I've been using dill to serialize models for a long time, including AllenNLP models. It does not save the implementation logic of the objects. It has ways to find which classes the objects belong to when it de-serializes. I don't know how it does this, but it works.

This also solves the issue that models would save the whole vocabulary, even if all they care about is the length to define an embedding size.
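
For reference, the round trip with dill is just the pickle-style dump/load (the model here is a stand-in for whatever you built in python):

```python
import dill
import torch

model = torch.nn.Linear(10, 5)  # stand-in for any in-memory (AllenNLP) model

# Serialize the object directly; no config file needed.
with open("model.dill", "wb") as f:
    dill.dump(model, f)

# Later, possibly in another process: get the same object back, weights and all.
with open("model.dill", "rb") as f:
    model = dill.load(f)
```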

@matt-gardner
Contributor Author

For vocab: Model.save() would have the same vocab saving logic that's currently spread throughout the trainer and the archival code. Model.load() would take an archive file that includes the vocabulary, as done in the archival code now.
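
Concretely, Model.save() / Model.load() could look something like the sketch below. This assumes the FromParams with captured _params from the sketch above, that the vocab is excluded from _params (hurdle 3), and the current archive layout (config.json, vocabulary/, weights.th); Vocabulary.save_to_files / Vocabulary.from_files are the existing vocab methods, everything else is illustrative:

```python
import json
import os

import torch

from allennlp.data import Vocabulary


class Model(torch.nn.Module, FromParams):
    # ... constructor takes `vocab` plus whatever else the model needs ...

    def save(self, serialization_dir: str) -> None:
        # Write the captured config, the vocabulary, and the weights side by
        # side, the way the trainer and archival code do today.
        os.makedirs(serialization_dir, exist_ok=True)
        with open(os.path.join(serialization_dir, "config.json"), "w") as f:
            json.dump(self._params, f, indent=2)
        self.vocab.save_to_files(os.path.join(serialization_dir, "vocabulary"))
        torch.save(self.state_dict(), os.path.join(serialization_dir, "weights.th"))

    @classmethod
    def load(cls, serialization_dir: str) -> "Model":
        with open(os.path.join(serialization_dir, "config.json")) as f:
            params = json.load(f)
        vocab = Vocabulary.from_files(os.path.join(serialization_dir, "vocabulary"))
        model = cls(vocab=vocab, **params)  # vocab passed separately, not from _params
        model.load_state_dict(torch.load(os.path.join(serialization_dir, "weights.th")))
        return model
```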

For Dirk's question: yes, dill could work. It's inconsistent, though, and makes it harder to, e.g., support demos of models that were trained using a python script and saved with dill, unless we change all of our archival logic to just use dill.

Hmm, a major problem, though, is that the model archive currently also has information about the dataset reader, which is required by the demo / predictor code. So you can't just call Model.save()...

@schmmd schmmd changed the base branch from master to main December 23, 2020 18:48