
[FOR DISCUSSION] Automatically capture params on object construction #4413

Open
wants to merge 1 commit into main

Conversation

matt-gardner
Contributor

As I was writing parts of the guide, one thing that bothered me was that we don't have a good story around model saving and loading when you're not using config files. Also, I saw examples of huggingface code where you could just call .save_pretrained() on your in-python object and have things just work. For us to do that, we have to be able to save objects for which we don't have implementation code, but I think we can still do it. I got to thinking about this problem this morning, and was playing around with something that actually works, pretty simply.

Basic idea: use a single metaclass on FromParams so that all FromParams objects (including subclasses) get a _params field storing the config they would have been created with. The model can then implement a simple .save() method that saves _params as a json object and calls the archive code. We could probably even remove the standalone archive code altogether and have it live as methods on Model, since this change would make that much easier.
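
Something like this minimal, hypothetical sketch of the metaclass (the class bodies and the JSON-only save_config are illustrative only, and it ignores nested objects and the archive code):

```python
import inspect
import json


class _CaptureInitParams(type):
    """Metaclass that records the arguments each instance was constructed with."""

    def __call__(cls, *args, **kwargs):
        instance = super().__call__(*args, **kwargs)
        # Bind positional args to parameter names so the captured dict looks
        # like the dict you would have written in a config file.
        bound = inspect.signature(cls.__init__).bind(instance, *args, **kwargs)
        bound.apply_defaults()
        params = dict(bound.arguments)
        params.pop("self", None)
        instance._params = params
        return instance


class FromParams(metaclass=_CaptureInitParams):
    pass


class Model(FromParams):
    def __init__(self, hidden_size: int = 128, dropout: float = 0.1):
        self.hidden_size = hidden_size
        self.dropout = dropout

    def save_config(self, path: str) -> None:
        # Only handles JSON-serializable arguments; the real version would
        # also recurse into nested FromParams objects and call the archive code.
        with open(path, "w") as f:
            json.dump(self._params, f, indent=2)


model = Model(hidden_size=256)
model.save_config("config.json")  # writes {"hidden_size": 256, "dropout": 0.1}
```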

I implemented a quick prototype that works for simple classes. I put it as a pull request instead of an issue so that the code could be more easily commented on. There are three major hurdles that I see to getting this to actually work, which may or may not be surmountable:

  1. We allow subclasses to take **kwargs, and inspect the superclass to grab those arguments. This one should be pretty straightforward - we can directly use the existing logic to get the parameter list, and that might just be enough by itself (there's a sketch of this after the list).

  2. Classes that use non-standard constructors, like from_partial_objects, or what we do in Vocabulary. We would have to know a priori which things are registered that way, and override them accordingly. This might be possible if we have this metaclass inspect the Registry somehow.

  3. We need to know which parameters don't need to get saved in the config file. Things like the Vocabulary for the Model object (because it's actually specified higher up in the config). We don't have a way of programmatically detecting this, and I'm not sure how to do it. The goal of this is just for use in saving models, so we could feasibly special-case a thing or two (we'd want to serialize the vocabulary separately anyway), but that's not ideal. This one needs some more thought.
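
For hurdle 1, here's a rough sketch of the kind of signature merging that would be needed, assuming (as the existing from-params logic does) that a **kwargs in a subclass constructor gets forwarded to the superclass constructor. The helper name is made up:

```python
import inspect
from typing import Dict


def constructor_params(cls: type) -> Dict[str, inspect.Parameter]:
    """Collect constructor parameters, following **kwargs up the MRO."""
    params: Dict[str, inspect.Parameter] = {}
    saw_kwargs = False
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        if param.kind == inspect.Parameter.VAR_KEYWORD:
            saw_kwargs = True
            continue
        params[name] = param
    if saw_kwargs:
        # Assume **kwargs is forwarded to the next class in the MRO, and merge
        # in that class's constructor parameters too.
        for base in cls.__mro__[1:]:
            if base is object:
                continue
            for name, param in constructor_params(base).items():
                params.setdefault(name, param)
            break
    return params
```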

A possible side-benefit of this is being able to pickle and recover some objects for use in multi-processing using just the saved _params.

@epwalsh, @dirkgr, what do you think?

@epwalsh
Member

epwalsh commented Jun 29, 2020

> For us to do that, we have to be able to save objects for which we don't have implementation code ...

I'm not sure I follow this; can you give me an example where we wouldn't have implementation code? Is this why we can't just use pickle?

@matt-gardner
Contributor Author

> For us to do that, we have to be able to save objects for which we don't have implementation code ...

> I'm not sure I follow this; can you give me an example where we wouldn't have implementation code? Is this why we can't just use pickle?

This works in huggingface's code because everything relies on an opaque config object. We wanted to avoid that, using typed python objects directly in constructors instead of some untyped config. The tricky thing is that if you don't have a blanket config object, you need some other way of saying what state needs to be saved.

You're right that the way I said this the first time was wrong.

@epwalsh
Member

epwalsh commented Jun 29, 2020

Gotcha.

And what would happen with the vocab when you call Model.save()? Would you have to serialize the vocab separately? Then I imagine you would have to load the vocab separately and pass it when calling Model.load(vocab=vocab) or something, right?

@dirkgr
Member

dirkgr commented Jun 29, 2020

Why can't pickle do this? Or dill? I've been using dill to serialize models for a long time, including AllenNLP models. It does not save the implementation logic of the objects. It has ways to find which classes the objects belong to when it de-serializes. I don't know how it does this, but it works.

This also solves the issue that models would save the whole vocabulary, even if all they care about is the length to define an embedding size.
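
For reference, the round trip with dill is just the pickle-style dump/load (the model here is a stand-in for whatever you built in python):

```python
import dill
import torch

model = torch.nn.Linear(10, 5)  # stand-in for any in-memory (AllenNLP) model

# Serialize the object directly; no config file needed.
with open("model.dill", "wb") as f:
    dill.dump(model, f)

# Later, possibly in another process: get the same object back, weights and all.
with open("model.dill", "rb") as f:
    model = dill.load(f)
```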

@matt-gardner
Contributor Author

For vocab: Model.save() would have the same vocab saving logic that's currently spread throughout the trainer and the archival code. Model.load() would take an archive file that includes the vocabulary, as done in the archival code now.
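
Concretely, Model.save() / Model.load() could look something like the sketch below. This assumes the FromParams with captured _params from the sketch above, that the vocab is excluded from _params (hurdle 3), and the current archive layout (config.json, vocabulary/, weights.th); Vocabulary.save_to_files / Vocabulary.from_files are the existing vocab methods, everything else is illustrative:

```python
import json
import os

import torch

from allennlp.data import Vocabulary


class Model(torch.nn.Module, FromParams):
    # ... constructor takes `vocab` plus whatever else the model needs ...

    def save(self, serialization_dir: str) -> None:
        # Write the captured config, the vocabulary, and the weights side by
        # side, the way the trainer and archival code do today.
        os.makedirs(serialization_dir, exist_ok=True)
        with open(os.path.join(serialization_dir, "config.json"), "w") as f:
            json.dump(self._params, f, indent=2)
        self.vocab.save_to_files(os.path.join(serialization_dir, "vocabulary"))
        torch.save(self.state_dict(), os.path.join(serialization_dir, "weights.th"))

    @classmethod
    def load(cls, serialization_dir: str) -> "Model":
        with open(os.path.join(serialization_dir, "config.json")) as f:
            params = json.load(f)
        vocab = Vocabulary.from_files(os.path.join(serialization_dir, "vocabulary"))
        model = cls(vocab=vocab, **params)  # vocab passed separately, not from _params
        model.load_state_dict(torch.load(os.path.join(serialization_dir, "weights.th")))
        return model
```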

For Dirk's question: yes, dill could work. It's inconsistent, though, and makes it harder to, e.g., support demos of models that were trained using a python script and saved with dill, unless we change all of our archival logic to just use dill.

Hmm, a major problem, though, is that the model archive currently also has information about the dataset reader, which is required by the demo / predictor code. So you can't just call Model.save()...

@schmmd schmmd changed the base branch from master to main December 23, 2020 18:48