Lightning Images

Training scripts for end-to-end image classification, built on PyTorch Lightning, with support for cloud training powered by Grid AI.

Activate environment

  1. Install pipenv

  2. Install python packages and activate the shell

    pipenv install
    pipenv shell
    
  3. Freeze pip dependencies (for cloud training only)

    pipenv lock -r > requirements.txt
    

Model training

Locally

First, create a config.yaml file from the template:

cp config.template.yaml config.yaml

To start training locally, run the training.py script and pass configuration parameters as key=value arguments. For example:

python training.py \
    data.num_workers=10 \
    data.batch_size=32 \
    data.num_classes=10 \
    data.dataset_path=/path/to/data

For more parameters, check config.template.yaml.
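The dotted key=value arguments above address nested entries in the configuration. The following is a minimal sketch of how such overrides could be merged into a nested config, written in plain Python; it is not the parsing code used by training.py, and the `defaults` dictionary, `parse_value`, and `apply_overrides` are hypothetical names introduced only for illustration.

```python
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert an override value to bool, int, or float when possible; otherwise keep the string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Merge `a.b=value` overrides into a nested dict (modified in place and returned)."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return config


# Hypothetical defaults, standing in for values read from config.yaml.
defaults = {"data": {"num_workers": 4, "batch_size": 16}}
config = apply_overrides(
    defaults,
    ["data.num_workers=10", "data.batch_size=32", "data.dataset_path=/path/to/data"],
)
print(config)
```

Each override replaces only the key it names, so any value not mentioned on the command line keeps the setting from the config file.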

Cloud

Lightning Images has been tested with Grid AI for cloud training. As with local training, create config.yaml before running the script.

grid run --name --localdir \
    --instance_type 2_M60_8GB \
    --datastore_name cifar5 \
    --datastore_version 1 \
    --framework lightning \
    --gpus 2 \
    training.py \
    data.num_workers=10 \
    data.batch_size=128 \
    data.num_classes=10 \
    data.dataset_path=/datastores/cifar5 \
    trainer.gpus=2

Note: the datastore must be created before training starts. For example:

grid datastore create /path/to/data --name cifar5

Model evaluation

python evaluation.py \
    data.num_workers=10 \
    data.batch_size=32 \
    data.dataset_path=/path/to/data \
    logging.best_model_path=outputs/2022-03-18/01-42-49/best_model \
    trainer.gpus=1
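
The logging.best_model_path value in the example points into a date/time-stamped run directory. As a convenience, the most recent run could be located with a short helper like the one below. This is only a sketch: it assumes runs are stored as outputs/YYYY-MM-DD/HH-MM-SS, as in the example path, and `latest_run_dir` is a hypothetical helper that is not part of this repository.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def latest_run_dir(outputs_root: Path) -> Optional[Path]:
    """Return the newest <date>/<time> run directory under outputs_root, or None if there is none."""
    runs = []
    for time_dir in outputs_root.glob("*/*"):
        if not time_dir.is_dir():
            continue
        try:
            # Directory names are assumed to follow YYYY-MM-DD/HH-MM-SS.
            stamp = datetime.strptime(
                f"{time_dir.parent.name} {time_dir.name}", "%Y-%m-%d %H-%M-%S"
            )
        except ValueError:
            continue  # Skip directories that do not match the assumed layout.
        runs.append((stamp, time_dir))
    if not runs:
        return None
    return max(runs, key=lambda run: run[0])[1]


run = latest_run_dir(Path("outputs"))
if run is None:
    print("No run directories found under outputs/")
else:
    # Print an override that can be passed to evaluation.py.
    print(f"logging.best_model_path={run / 'best_model'}")
```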