The README shows how you'd do it in plain PyTorch, yes. PyTorch Lightning already supports checkpointing out of the box, so you'd have to do literally nothing.
If your train_dataloader() method returns a StreamingDataLoader, its state will be saved into the checkpoints, and when you resume from one via trainer.fit(..., ckpt_path=...) the dataloader state is loaded back so training continues where it left off.
Yes, we can make it more explicit in the docs and examples, good idea!
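A minimal sketch of what that looks like, assuming a LightningModule whose train_dataloader() wraps a litdata StreamingDataset (the dataset location, batch layout, and checkpoint path are placeholders for illustration):

```python
import torch
import lightning as L
from litdata import StreamingDataset, StreamingDataLoader


class LitModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        # The batch layout depends on how the StreamingDataset was written;
        # a single float tensor is assumed here for illustration.
        return self.layer(batch).mean()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=1e-3)

    def train_dataloader(self):
        # Returning a StreamingDataLoader is all that's needed: Lightning
        # saves its state into every checkpoint alongside the model weights.
        dataset = StreamingDataset("s3://my-bucket/my-dataset")  # placeholder location
        return StreamingDataLoader(dataset, batch_size=64)


# Initial run: checkpoints include the dataloader state.
trainer = L.Trainer(max_epochs=2)
trainer.fit(LitModel())

# Resuming via ckpt_path restores the dataloader state so streaming
# continues from where the previous run stopped.
trainer = L.Trainer(max_epochs=2)
trainer.fit(LitModel(), ckpt_path="lightning_logs/version_0/checkpoints/last.ckpt")  # placeholder path
```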
awaelchli changed the title from "Add example of saving / resuming DataLoader state with PyTorch Litghtning" to "Add example of saving / resuming DataLoader state with PyTorch Lightning" on Jul 22, 2024
📚 Documentation
Right now, the README shows an example of pausing and resuming the StreamingDataLoader state within a simple for loop over the dataloader.
It's not clear to me how to adapt that example to work with PyTorch Lightning, where the training loop is abstracted away.
My first guess is to make a callback for the Trainer -- but it would be great to have a simple example of how to do this in PyTorch Lightning.
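For reference, the loop-based pattern described in the README looks roughly like this (the dataset location and state file name are placeholders); the question is how to get the equivalent behavior when Lightning owns the training loop:

```python
import torch
from litdata import StreamingDataset, StreamingDataLoader

dataset = StreamingDataset("s3://my-bucket/my-dataset")  # placeholder location
dataloader = StreamingDataLoader(dataset, batch_size=64)

for batch_idx, batch in enumerate(dataloader):
    ...  # training step
    if batch_idx == 10:
        # Pause: persist the dataloader's current position.
        torch.save(dataloader.state_dict(), "dataloader_state.pt")
        break

# Resume: restore the saved position before iterating again.
dataloader.load_state_dict(torch.load("dataloader_state.pt"))
for batch in dataloader:
    ...  # continues from where the previous loop stopped
```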