Don't cache reinit_modules #5543
base: main
Conversation
I think one of @dirkgr's points was that we don't want to add a reinitialized transformer to the cache. So maybe only run `_model_cache[spec] = transformer` when `reinit_modules` is `None`.
CHANGELOG.md (Outdated)

```diff
@@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would appear in the logs
   when a `ConfigurationError` for missing GPU was raised.
 - Load model on CPU post training to save GPU memory.
+- Don't cache models with `cached_transformers` when `reinit_modules` is not `None`.
```
Better to omit this actually since this feature hasn't been released yet.
Removed!
Oh right, good catch. I have fixed this.
I would almost say that we should have an entirely separate function that reinits some layers from a given transformer model. It doesn't have to be part of
@dirkgr I had originally added this functionality to
I'd say for ease of use, just have it in the
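The separate function @dirkgr suggests might look like the sketch below. This is hypothetical, not allennlp code: the name `reinit_layers` and the calling convention are assumptions. It relies only on the fact that most built-in PyTorch layers expose `reset_parameters()`.

```python
import torch.nn as nn


def reinit_layers(layers, layer_indices):
    """Hypothetical standalone helper (not part of allennlp): re-initialize
    the selected layers of an already-loaded model in place.

    `layers` is any indexable collection of nn.Modules, e.g. the
    `encoder.layer` ModuleList of a BERT-style transformer.
    """
    for i in layer_indices:
        for module in layers[i].modules():
            # Built-in PyTorch layers (Linear, LayerNorm, ...) expose
            # reset_parameters() to re-draw their initial weights.
            if hasattr(module, "reset_parameters"):
                module.reset_parameters()
    return layers
```

Keeping it standalone like this means the caching code never has to know about reinitialization: callers fetch a cached model and then reinit layers themselves.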
Fixes #5505 (comment)

Changes proposed in this pull request:

- Don't cache the transformer when `reinit_modules` is provided.
- Remove `reinit_modules` from the transformer spec when `reinit_modules` is not `None`.
Before submitting

- …section of the CONTRIBUTING docs.
- …Writing docstrings section of the CONTRIBUTING docs.

After submitting

- `codecov/patch` reports high test coverage (at least 90%). You can find this under the "Actions" tab of the pull request once the other checks have finished.