Add simple FSDP support to MNIST example LightningModule #604

Merged: 1 commit, Sep 25, 2023

@amorehead (Contributor) commented Sep 19, 2023

What does this PR do?

  • Updates mnist_module.py to reference the Trainer's version of parameters() (self.trainer.model.parameters() rather than self.parameters()), e.g., for FSDP support.
  • Note: Unless parameters() is referenced this way, Lightning strategies such as FSDP cannot successfully wrap the model's parameters. More importantly, if a model is first trained while referencing self.parameters() and training is later resumed while referencing self.trainer.model.parameters(), Lightning 2.0 (currently) raises an exception, so training cannot be resumed from the original checkpoint. That is why I think this change should be the default for everyone.

Before submitting

  • Did you make sure title is self-explanatory and the description concisely explains the PR?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you test your PR locally with pytest command?
  • Did you run pre-commit hooks with pre-commit run -a command?

Did you have fun?

Make sure you had fun coding 🙃

* Updates the `mnist_module.py` to reference `Trainer`'s version of `parameters()` e.g., FSDP support
@amorehead changed the title from "Adds simple FSDP support to MNIST example Trainer" to "Add simple FSDP support to MNIST example Trainer" on Sep 19, 2023
@amorehead changed the title from "Add simple FSDP support to MNIST example Trainer" to "Add simple FSDP support to MNIST example LightningModule" on Sep 19, 2023
@codecov-commenter

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (1fb5405) 83.24% compared to head (672a3d4) 83.24%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #604   +/-   ##
=======================================
  Coverage   83.24%   83.24%           
=======================================
  Files          11       11           
  Lines         376      376           
=======================================
  Hits          313      313           
  Misses         63       63           
Files Changed Coverage Δ
src/models/mnist_module.py 96.96% <100.00%> (ø)


@ashleve (Owner) left a comment

Makes sense, thanks for the fix!

@ashleve ashleve merged commit bddbc24 into ashleve:main Sep 25, 2023
11 checks passed
@amorehead amorehead deleted the patch-5 branch September 25, 2023 22:16