This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Don't cache reinit_modules #5543

Open · wants to merge 6 commits into main

Conversation

@JohnGiorgi (Contributor) commented on Jan 16, 2022:

Fixes #5505 (comment)

Changes proposed in this pull request:

  • Don't cache transformers when `reinit_modules` is provided.
  • Remove `reinit_modules` from the transformer spec.
  • Always load a new model when `reinit_modules` is not `None` (sketched below).
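
A minimal sketch of the intended behaviour, assuming a heavily simplified `cached_transformers.get` (the real `TransformerSpec` fields, override-weights handling, and re-initialization logic are omitted here):

```python
import copy
from typing import Dict, NamedTuple, Optional, Tuple

import transformers


class TransformerSpec(NamedTuple):
    # Note: reinit_modules is deliberately not part of the cache key.
    model_name: str
    override_weights_file: Optional[str] = None


_model_cache: Dict[TransformerSpec, transformers.PreTrainedModel] = {}


def get(
    model_name: str,
    make_copy: bool = False,
    override_weights_file: Optional[str] = None,
    reinit_modules: Optional[Tuple[int, ...]] = None,
    **kwargs,
) -> transformers.PreTrainedModel:
    if reinit_modules is not None:
        # A re-initialized model must never be shared, so skip the cache entirely.
        model = transformers.AutoModel.from_pretrained(model_name, **kwargs)
        # ... re-initialize the requested layers/modules here ...
        return model

    spec = TransformerSpec(model_name, override_weights_file)
    if spec in _model_cache:
        transformer = _model_cache[spec]
    else:
        transformer = transformers.AutoModel.from_pretrained(model_name, **kwargs)
        _model_cache[spec] = transformer
    return copy.deepcopy(transformer) if make_copy else transformer
```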

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.
  • codecov/patch reports high test coverage (at least 90%).
    You can find this under the "Actions" tab of the pull request once the other checks have finished.

@JohnGiorgi changed the title from "Dont cache reinit_modules" to "Don't cache reinit_modules" on Jan 16, 2022.
@epwalsh (Member) left a comment:

I think one of @dirkgr's points was that we don't want to add a reinitialized transformer to the cache. So maybe only run _model_cache[spec] = transformer when reinit_modules is None.
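
A hedged sketch of that suggestion, shown as an excerpt that would replace the unconditional cache write in the simplified `get` above (hypothetical, not the exact library code):

```python
    # Instead of branching early on reinit_modules, load (and re-initialize) the
    # transformer through the usual code path, but only write it back to the cache
    # when its weights were left untouched.
    if reinit_modules is None:
        _model_cache[spec] = transformer
```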

CHANGELOG.md (Outdated)

@@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would be appear in the logs
   when a `ConfigurationError` for missing GPU was raised.
 - Load model on CPU post training to save GPU memory.
+- Don't cache models with `cached_transformers` when `reinit_modules` is not `None`.

Member:

Better to omit this actually since this feature hasn't been released yet.

Contributor (Author):

Removed!

@JohnGiorgi (Contributor, Author) commented:

> I think one of @dirkgr's points was that we don't want to add a reinitialized transformer to the cache. So maybe only run _model_cache[spec] = transformer when reinit_modules is None.

Oh right, good catch. I have fixed this.

@dirkgr (Member) commented on Jan 18, 2022:

The way reinit should work is that if reinit is specified, it loads the model first using the normal cached_transformers code path, with make_copy = True, and then re-inits the layers it needs. If you already have the weights loaded (which happens a lot in AllenNLP), then it'll be much faster the second time.

I would almost say that we should have an entirely separate function that reinits some layers from a given transformer model. It doesn't have to be part of cached_transformers at all. But I don't know how you're using this downstream, so maybe that's not practical.
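
A rough sketch of that idea as a separate helper next to `cached_transformers`; the function name `get_reinitialized`, the integer-index convention for `reinit_modules`, and the BERT-style layer access are all assumptions for illustration, not existing AllenNLP API:

```python
from typing import Tuple, Union

import transformers

from allennlp.common import cached_transformers


def get_reinitialized(  # hypothetical helper, not an existing AllenNLP function
    model_name: str,
    reinit_modules: Tuple[Union[int, str], ...],
    **kwargs,
) -> transformers.PreTrainedModel:
    """Load a transformer through the normal cached path, then re-init selected layers."""
    # make_copy=True gives us our own copy of the (possibly cached) weights,
    # so mutating them cannot affect other users of the cache.
    model = cached_transformers.get(model_name, make_copy=True, **kwargs)

    # Illustrative re-initialization: treat integers as indices of encoder layers
    # to reset to their randomly initialized state (assumes a BERT-style model).
    for module in reinit_modules:
        if isinstance(module, int):
            model.encoder.layer[module].apply(model._init_weights)
    return model
```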

@JohnGiorgi (Contributor, Author) commented:

@dirkgr I had originally added this functionality to PretrainedTransformerEmbedder, but then I figured it made sense to move it to cached_transformers, so any part of the library that loads a transformer could reinit some layers/modules. Is there somewhere else in the library it could live that makes more sense? If it were its own function, how would a user access it, e.g. in a training config?

@dirkgr (Member) commented on Feb 24, 2022:

I'd say for ease of use, just have it in the cached_transformers module as well. While strictly speaking it doesn't have to be used with cached transformers, I think in practice that's what's going to happen.
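
Downstream code (e.g. `PretrainedTransformerEmbedder`) could then call it roughly like this; `get_reinitialized` and the layer-index convention are the same hypothetical ones sketched above:

```python
from allennlp.common import cached_transformers

# Hypothetical call site: reuse the cached "bert-base-uncased" weights,
# then reset the top two encoder layers before training.
transformer = cached_transformers.get_reinitialized(
    "bert-base-uncased",
    reinit_modules=(10, 11),
)
```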

@dirkgr self-assigned this on Feb 24, 2022.