08_Case study 2
We have 114 images and corresponding label images of flooded streets and channels. Imagery is provided by Dr. Jenna Brown, USGS MD-DE-DC Water Science Center.
Our target is to find all water pixels. In our label images, the integer value 2 encodes 'water' and all other values are remapped to 'other'.
Three example overlay images from the data set are shown below depicting flooding at 3 different locations (water is yellow):
Here is a configuration file for this project. The config file is almost the whole learning curve when it comes to using Segmentation Gym, so consult the manual.
{
"TARGET_SIZE": [768,768],
"MODEL": "resunet",
"NCLASSES": 1,
"KERNEL":9,
"STRIDE":2,
"BATCH_SIZE": 9,
"FILTERS":6,
"N_DATA_BANDS": 3,
"DROPOUT":0.1,
"DROPOUT_CHANGE_PER_LAYER":0.0,
"DROPOUT_TYPE":"standard",
"USE_DROPOUT_ON_UPSAMPLING":false,
"DO_TRAIN": true,
"LOSS":"cat",
"PATIENCE": 10,
"MAX_EPOCHS": 100,
"VALIDATION_SPLIT": 0.5,
"RAMPUP_EPOCHS": 20,
"SUSTAIN_EPOCHS": 0.0,
"EXP_DECAY": 0.9,
"START_LR": 1e-7,
"MIN_LR": 1e-7,
"MAX_LR": 1e-4,
"FILTER_VALUE": 0,
"DOPLOT": true,
"ROOT_STRING": "delaware_v1_768",
"USEMASK": false,
"AUG_ROT": 5,
"AUG_ZOOM": 0.05,
"AUG_WIDTHSHIFT": 0.05,
"AUG_HEIGHTSHIFT": 0.05,
"AUG_HFLIP": true,
"AUG_VFLIP": false,
"AUG_LOOPS": 10,
"AUG_COPIES": 5,
"REMAP_CLASSES": {"1": 1, "2": 0, "3": 1, "4":1}
}
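Before moving on, it can be worth confirming that the file parses as valid JSON. A minimal sketch, assuming the config is saved as `delaware_v1_768.json` (the file name is an example based on ROOT_STRING, not a requirement):

```python
# Sketch: confirm the config parses and spot-check a few values. The file name
# is an example only; Gym scripts prompt you to select the config file anyway.
import json

with open("delaware_v1_768.json") as f:
    cfg = json.load(f)  # raises json.JSONDecodeError if the JSON is malformed

print(cfg["TARGET_SIZE"], cfg["MODEL"], cfg["NCLASSES"])
```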
Let's break down the most important decisions a little ...
Use square 768x768 pixel model inputs and a Res-UNet model with a 9x9 kernel and 6 filters in the starting convolutional block.
"TARGET_SIZE": [768,768],
"MODEL": "resunet",
"KERNEL":9,
"FILTERS":6,
The model trains with random batches of 9 three-band images. Labels are remapped to a binary problem ('a','b'), so NCLASSES is 1 (in multiclass problems, with >2 classes, you count the actual number of classes, e.g. 'a','b','c','d' is 4 classes); the remapping itself is sketched just after the snippet below.
"BATCH_SIZE": 9,
"NCLASSES": 1,
"N_DATA_BANDS": 3,
"REMAP_CLASSES": {"1": 1, "2": 0, "3": 1, "4":1}
We will use 50% of the data for validation, a dropout rate of 0.1, and categorical cross-entropy loss.
"DROPOUT":0.1,
"LOSS":"cat",
"VALIDATION_SPLIT": 0.5,
We organize our files like this, putting jpg format images and greyscale label images into the `images` and `labels` folders, respectively. The config file goes in `config`. The `holdOutSet` folder is for jpg format images and is optional. The `weights`, `npzForModel`, and `modelOut` directories should be left empty - they will be populated with files by running Gym programs, as explained below.
/Users/Someone/my_segmentation_Gym_datasets
├── config
│   └── *.json
├── images
│   └── *.jpg
├── labels
│   └── *.jpg
├── npzForModel
├── holdOutSet
│   └── *.jpg
├── modelOut
└── weights
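If you are setting this structure up from scratch, a short sketch like the following will create the empty folders; the root path is just the example used on this page, so substitute your own location:

```python
# Sketch: create the folder layout shown above. The root path is the example
# from this page; replace it with wherever you keep your own datasets.
from pathlib import Path

root = Path("/Users/Someone/my_segmentation_Gym_datasets")
for name in ["config", "images", "labels", "npzForModel",
             "holdOutSet", "modelOut", "weights"]:
    (root / name).mkdir(parents=True, exist_ok=True)
```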
Next, we run `python make_nd_dataset.py`. First, choose a directory to store the output files; in this example, this is the folder we created called `npzForModel`. Next, choose the `config` file, then the directory of labels, then the images.
After the datasets have been made, model-ready data files in `npz` format populate the `npzForModel` folder:
/Users/Someone/my_segmentation_Gym_datasets
├── config
│   └── *.json
├── images
│   └── *.jpg
├── labels
│   └── *.jpg
├── npzForModel
│   └── *.npz
├── holdOutSet
│   └── *.jpg
├── modelOut
└── weights
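As a quick sanity check (this is not part of Gym), you can open one of the new files with numpy and list what it contains; the file name below is hypothetical, and the key names and shapes depend on how the dataset was built:

```python
# Sketch: inspect one of the model-ready .npz files. The file name is
# hypothetical; key names and array shapes depend on the dataset settings.
import numpy as np

npz_path = "/Users/Someone/my_segmentation_Gym_datasets/npzForModel/example.npz"
with np.load(npz_path) as data:
    for key in data.files:
        print(key, data[key].shape, data[key].dtype)
```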
To train a model, run `python train_model.py`. We first choose the dataset location, then the config file, and the model starts to train.
The first three epochs look like this:
Epoch 1/100
Epoch 00001: LearningRateScheduler reducing learning rate to 1e-07.
57/57 [==============================] - 94s 1s/step - loss: 0.7057 - mean_iou: 0.4732 - dice_coef: 0.5536 - val_loss: 0.7405 - val_mean_iou: 0.5100 - val_dice_coef: 0.4973
Epoch 2/100
Epoch 00002: LearningRateScheduler reducing learning rate to 5.095e-06.
57/57 [==============================] - 58s 1s/step - loss: 0.6137 - mean_iou: 0.5598 - dice_coef: 0.6016 - val_loss: 0.7729 - val_mean_iou: 0.5277 - val_dice_coef: 0.5485
Epoch 3/100
Epoch 00003: LearningRateScheduler reducing learning rate to 1.0090000000000002e-05.
57/57 [==============================] - 59s 1s/step - loss: 0.4529 - mean_iou: 0.6680 - dice_coef: 0.7054 - val_loss: 1.0879 - val_mean_iou: 0.5656 - val_dice_coef: 0.6186
The last three epochs:
Epoch 00038: LearningRateScheduler reducing learning rate to 1.676050451796691e-05.
57/57 [==============================] - 61s 1s/step - loss: 0.0458 - mean_iou: 0.9605 - dice_coef: 0.9726 - val_loss: 0.1580 - val_mean_iou: 0.8822 - val_dice_coef: 0.9331
Epoch 39/100
Epoch 00039: LearningRateScheduler reducing learning rate to 1.5094454066170219e-05.
57/57 [==============================] - 60s 1s/step - loss: 0.0455 - mean_iou: 0.9619 - dice_coef: 0.9729 - val_loss: 0.1553 - val_mean_iou: 0.8831 - val_dice_coef: 0.9341
Epoch 40/100
Epoch 00040: LearningRateScheduler reducing learning rate to 1.35950086595532e-05.
57/57 [==============================] - 60s 1s/step - loss: 0.0438 - mean_iou: 0.9637 - dice_coef: 0.9739 - val_loss: 0.1565 - val_mean_iou: 0.8827 - val_dice_coef: 0.9337
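The learning rates printed by the scheduler in these logs follow the ramp-up/decay parameters in the config (START_LR, MAX_LR, MIN_LR, RAMPUP_EPOCHS, SUSTAIN_EPOCHS, EXP_DECAY). Below is a minimal sketch of a schedule of that form; it reproduces the logged values for this config, but it is an illustration rather than Gym's exact implementation:

```python
# Sketch of a ramp-up / sustain / exponential-decay learning-rate schedule.
# The constants come from the config above; the function itself is an
# assumption that happens to reproduce the learning rates in the training log.
START_LR, MIN_LR, MAX_LR = 1e-7, 1e-7, 1e-4
RAMPUP_EPOCHS, SUSTAIN_EPOCHS, EXP_DECAY = 20, 0.0, 0.9

def lrfn(epoch):
    """Learning rate for a given zero-based epoch index."""
    if epoch < RAMPUP_EPOCHS:
        # linear ramp from START_LR to MAX_LR
        return (MAX_LR - START_LR) / RAMPUP_EPOCHS * epoch + START_LR
    if epoch < RAMPUP_EPOCHS + SUSTAIN_EPOCHS:
        return MAX_LR
    # exponential decay towards MIN_LR
    return (MAX_LR - MIN_LR) * EXP_DECAY ** (epoch - RAMPUP_EPOCHS - SUSTAIN_EPOCHS) + MIN_LR

print(lrfn(1))   # ≈ 5.095e-06 (cf. Epoch 00002 above)
print(lrfn(37))  # ≈ 1.676e-05 (cf. Epoch 00038 above)
```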
After model training, the `weights` folder is populated with model files and `modelOut` contains example model predictions, model training curves, and other handy figures in png format:
/Users/Someone/my_segmentation_Gym_datasets
├── config
│   └── *.json
├── images
│   └── *.jpg
├── labels
│   └── *.jpg
├── npzForModel
│   └── *.npz
├── holdOutSet
│   └── *.jpg
├── modelOut
│   └── *.png
└── weights
    └── *.h5
At the end of the training script, the model is evaluated on the entire validation dataset and mean statistics are reported:
Evaluating model on entire validation set ...
29/29 [==============================] - 7s 221ms/step - loss: 0.1565 - mean_iou: 0.8827 - dice_coef: 0.9337
loss=0.1565, Mean IOU=0.8827, Mean Dice=0.9337
Mean of mean IoUs (validation subset)=0.847
Mean of mean Dice scores (validation subset)=0.933
Mean of mean KLD scores (validation subset)=0.161
Mean of mean IoUs (train subset)=0.890
Mean of mean Dice scores (train subset)=0.949
Mean of mean KLD scores (train subset)=0.109
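The "mean of mean" statistics above are averages of per-image scores over each subset. For reference, here is a hedged sketch of the standard per-image IoU and Dice definitions for binary masks; these are the textbook formulas, not necessarily the exact routines Gym uses:

```python
# Sketch: standard per-image IoU and Dice for binary masks; textbook
# definitions only, not necessarily Gym's exact implementation.
import numpy as np

def iou(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    return 1.0 if union == 0 else np.logical_and(pred, truth).sum() / union

def dice(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 1.0 if denom == 0 else 2 * np.logical_and(pred, truth).sum() / denom

# mean_iou = np.mean([iou(p, t) for p, t in zip(predictions, labels)])
```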
Next, we evaluate the skill of the model visually on a hold-out sample of 17 images in the `holdOutSet` folder. Run `python seg_images_in_folder.py`, choose the `holdOutSet` folder, then choose the `fullmodel.h5` model weights in the `weights` folder. It will create model overlay outputs and store them in a folder called `out`. When using your own model with `python seg_images_in_folder.py` you may have two `.h5` weight files. The `fullmodel.h5` version is for serving models through Zoo (it needs a config file with the same file root). You can use either weights file with `python seg_images_in_folder.py`.
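An overlay like the ones `seg_images_in_folder.py` writes can also be reproduced with a few lines of matplotlib; the file names below are hypothetical, and the colormap and alpha are arbitrary choices rather than Gym's own plotting code:

```python
# Sketch: overlay a binary water mask on an image. File names are hypothetical;
# this mimics, but is not, the plotting done by seg_images_in_folder.py.
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

img = np.array(Image.open("holdOutSet/example.jpg"))
mask = np.array(Image.open("out/example_predseg.png").convert("L")) > 0  # hypothetical prediction

overlay = np.ma.masked_where(~mask, mask.astype(float))  # hide non-water pixels
plt.imshow(img)
plt.imshow(overlay, cmap="autumn", alpha=0.4, vmin=0, vmax=1)  # water in yellow/orange
plt.axis("off")
plt.savefig("example_overlay.png", dpi=200, bbox_inches="tight")
```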
Some example model outputs are shown below: