
📚 v1 - Docs Add the feature to resume training from a previous training #1558

Closed
shrinand1996 opened this issue Dec 20, 2023 · 8 comments · Fixed by #2389
Labels
Documentation Improvements or additions to documentation

@shrinand1996

What is the motivation for this task?

It would be great if we could resume training (for EfficientNet), since it takes time to train. Loading the pre-trained weights would help reach the desired accuracy more quickly.

Describe the solution you'd like

An option in the config file that loads the pre-trained weights and resumes training from there.

Additional context

No response

@blaz-r
Contributor

blaz-r commented Jan 1, 2024

Hello. You can achieve this by adding resume_from_checkpoint to your config and providing the path to the weights you want to resume from.

This is not that well documented, I think. So I'll note it down as something that should be improved in the v1 docs.
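For reference, in pre-v1 anomalib this key was forwarded to the PyTorch Lightning Trainer, which at that time accepted a resume_from_checkpoint argument. A minimal sketch of a config fragment, assuming the key sits under the trainer section (the exact placement and the checkpoint path are assumptions, not taken from the docs):

```yaml
# Hypothetical pre-v1 anomalib config fragment.
# resume_from_checkpoint is passed through to the Lightning Trainer;
# the path below is a placeholder for your own saved checkpoint.
trainer:
  resume_from_checkpoint: results/padim/mvtec/bottle/weights/model.ckpt
```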

@samet-akcay
Contributor

Yes, let's keep this issue open as a doc improvement.

@samet-akcay samet-akcay added Documentation Improvements or additions to documentation and removed Task labels Jan 2, 2024
@samet-akcay samet-akcay changed the title [Task]: Add the feature to resume training from a previous training 📘 Add the feature to resume training from a previous training Jan 2, 2024
@samet-akcay samet-akcay added this to the v1.0.0 milestone Jan 2, 2024
@samet-akcay samet-akcay changed the title 📘 Add the feature to resume training from a previous training 📚 v1 - Add the feature to resume training from a previous training Jan 10, 2024
@samet-akcay samet-akcay changed the title 📚 v1 - Add the feature to resume training from a previous training 📚 v1 - Docs Add the feature to resume training from a previous training Feb 29, 2024
@samet-akcay samet-akcay modified the milestones: v1.0.0, v1.1.0 Feb 29, 2024
@Shakib-IO
Contributor

Hi @samet-akcay @blaz-r
I want to work on this.

@samet-akcay
Contributor

@Shakib-IO, sure go ahead. Thanks for your interest!

@Shakib-IO
Contributor

Hi @blaz-r
I'm unsure where I should insert resume_from_checkpoint in the documentation. Any insight you could provide would be greatly appreciated. I found a relevant source here.

Thanks!

@blaz-r
Contributor

blaz-r commented Mar 7, 2024

Hi @Shakib-IO, resume_from_checkpoint was part of the config before v1. With v1, this is a bit different. You can pass the checkpoint to fit directly:

````python
def fit(
    self,
    model: AnomalyModule,
    train_dataloaders: TRAIN_DATALOADERS | None = None,
    val_dataloaders: EVAL_DATALOADERS | None = None,
    datamodule: AnomalibDataModule | None = None,
    ckpt_path: str | Path | None = None,
) -> None:
    """Fit the model using the trainer.

    Args:
        model (AnomalyModule): Model to be trained.
        train_dataloaders (TRAIN_DATALOADERS | None, optional): Train dataloaders.
            Defaults to None.
        val_dataloaders (EVAL_DATALOADERS | None, optional): Validation dataloaders.
            Defaults to None.
        datamodule (AnomalibDataModule | None, optional): Lightning datamodule.
            If provided, dataloaders will be instantiated from this.
            Defaults to None.
        ckpt_path (str | Path | None, optional): Checkpoint path. If provided, the model
            will be loaded from this path. Defaults to None.

    CLI Usage:
        1. You can pick a model and run it on the MVTec dataset.
        ```python
        anomalib fit --model anomalib.models.Padim
        ```
        2. Of course, you can override the various values with commands.
        ```python
        anomalib fit --model anomalib.models.Padim --data <CONFIG | CLASS_PATH_OR_NAME> --trainer.max_epochs 3
        ```
        3. If you have a ready configuration file, run it like this.
        ```python
        anomalib fit --config <config_file_path>
        ```
    """
````

So on the CLI, pass --ckpt_path CKPT_PATH; from the API, pass the ckpt_path argument to fit.
As for setting this via config, I am not sure how exactly it's handled there.
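From the Python API, resuming then amounts to passing ckpt_path when calling fit. A minimal sketch, assuming the v1 Engine API with the Padim model and the MVTec datamodule; the dataset root and checkpoint path are placeholders, not values from this thread:

```python
# Sketch: resuming training in anomalib v1 via the Python API.
# Paths below are placeholders; adapt them to your setup.
from anomalib.data import MVTec
from anomalib.engine import Engine
from anomalib.models import Padim

model = Padim()
datamodule = MVTec(root="./datasets/MVTec", category="bottle")

engine = Engine()
# ckpt_path makes Lightning restore weights, optimizer state, and the
# epoch counter from the saved checkpoint before continuing training.
engine.fit(model=model, datamodule=datamodule, ckpt_path="path/to/last.ckpt")
```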

@ashwinvaidya17
Collaborator

@Shakib-IO are you still working on this?

@Shakib-IO
Contributor

No. @ashwinvaidya17
