PaddlePaddle · shiyutang · Sep 22, 2023 · Sep 18, 2023
diff --git a/configs/_base_/uwmgi.yml b/configs/_base_/uwmgi.yml
@@ -0,0 +1,54 @@
+batch_size: 8
+iters: 160000
+
+train_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: RandomHorizontalFlip
+    - type: RandomVerticalFlip
+    - type: RandomDistort
+      brightness_range: 0.4
+      contrast_range: 0.4
+      saturation_range: 0.4
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  train_path: data/UWMGI/train.txt
+  mode: train
+
+val_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  val_path: data/UWMGI/val.txt
+  mode: val
+
+optimizer:
+  type: SGD
+  momentum: 0.9
+  weight_decay: 4.0e-5
+
+lr_scheduler:
+  type: PolynomialDecay
+  learning_rate: 0.001
+  end_lr: 0
+  power: 0.9
+
+loss:
+  types:
+    - type: MixedLoss
+      losses:
+        - type: BCELoss
+        - type: LovaszHingeLoss
+      coef: [0.5, 0.5]
+  coef: [1]
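Note on the `lr_scheduler` block in `configs/_base_/uwmgi.yml`: the sketch below shows how a polynomial decay schedule with these values (`learning_rate: 0.001`, `end_lr: 0`, `power: 0.9`) would evolve over training. It is a minimal illustration, not PaddleSeg code. The function name `polynomial_decay_lr` is hypothetical, and `decay_steps=160000` is an assumption that the decay length equals `iters`.

```python
def polynomial_decay_lr(step, base_lr=0.001, end_lr=0.0, power=0.9,
                        decay_steps=160000):
    """Learning rate at `step` under polynomial decay.

    decay_steps=160000 is an assumption: it mirrors `iters` in the config.
    """
    # Once decay_steps is reached, the rate stays at end_lr.
    step = min(step, decay_steps)
    return (base_lr - end_lr) * (1.0 - step / decay_steps) ** power + end_lr


# Print the rate at a few points of the 160k-iteration schedule.
for step in (0, 40000, 80000, 120000, 160000):
    print(step, f"{polynomial_decay_lr(step):.6f}")
```

With `power` below 1, the rate falls slightly slower than a straight line early on and drops to `end_lr` only at the last iteration.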
diff --git a/configs/multilabelseg/README.md b/configs/multilabelseg/README.md
@@ -0,0 +1,139 @@
+English | [简体中文](README_cn.md)
+
+# Multi-label semantic segmentation based on PaddleSeg
+
+## 1. Introduction
+
+Multi-label semantic segmentation is an image segmentation task in which each pixel can be assigned to several categories at once, rather than to exactly one. This makes it possible to express overlap, occlusion, and shared boundaries between objects. Application scenarios include medical image analysis, remote sensing image interpretation, and autonomous driving.
+
+<p align="center">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/ea6bb360-75de-4e06-9910-44c7d2fdbe6c">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/e2781865-db7e-4f46-98b2-3ef731e8bef1">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/9e587935-fd6f-459e-b798-0164eb98f44d">
+</p>
+
++ *The images above show inference results from a model trained on images from the [UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/) dataset.*
+
+## 2. Supported models and loss functions
+
+|                                            Model                                            |           Loss           |
+|:-------------------------------------------------------------------------------------------:|:------------------------:|
+| DeepLabV3, DeepLabV3P, MobileSeg, <br/>PP-LiteSeg, PP-MobileSeg, UNet, <br/>UNet++, UNet 3+ | BCELoss, LovaszHingeLoss |
+
++ *The models and loss functions listed above are the ones confirmed to work; others may also be supported.*
+
+## 3. Sample Tutorial
+
+This tutorial uses the **[UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/)** multi-label semantic segmentation dataset and the **[PP-MobileSeg](../pp_mobileseg/README.md)** model as examples.
+
+### 3.1 Data Preparation
+In a single-label semantic segmentation task, the annotated grayscale image has shape **(img_h, img_w)**, and each grayscale value is the index of the pixel's category.
+
+In a multi-label semantic segmentation task, the annotated grayscale image has shape **(img_h, num_classes x img_w)**: the binary annotations of the categories are concatenated, in category order, along the horizontal direction.
+
+Download the raw UWMGI archive and convert it to the format supported by PaddleSeg's [Dataset](../../paddleseg/datasets/dataset.py) API using the provided script.
+```shell
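+# The signed URL below is time-limited (see its Expires parameter); if it has expired, download the archive from the Kaggle competition page instead.
+# The URL is quoted so the shell does not treat `&` as a background operator; -O sets the file name expected by the conversion step.
+wget -O uw-madison-gi-tract-image-segmentation.zip "https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/27923/3495119/bundle/archive.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1693533809&Signature=ThCLjIYxSXfk85lCbZ5Cz2Ta4g8AjwJv0%2FgRpqpchlZLLYxk3XRnrZqappboha0moC7FuqllpwlLfCambQMbKoUjCLylVQqF0mEsn0IaJdYwprWYY%2F4FJDT2lG0HdQfAxJxlUPonXeZyZ4pZjOrrVEMprxuiIcM2kpGk35h7ry5ajkmdQbYmNQHFAJK2iO%2F4a8%2F543zhZRWsZZVbQJHid%2BjfO6ilLWiAGnMFpx4Sh2B01TUde9hBCwpxgJv55Gs0a4Z1KNsBRly6uqwgZFYfUBAejySx4RxFB7KEuRowDYuoaRT8NhSkzT2i7qqdZjgHxkFZJpRMUlDcf1RSJVkvEA%3D%3D&response-content-disposition=attachment%3B+filename%3Duw-madison-gi-tract-image-segmentation.zip"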
+python tools/data/convert_multilabel.py \
+    --dataset_type uwmgi \
+    --zip_input ./uw-madison-gi-tract-image-segmentation.zip \
+    --output ./data/UWMGI/ \
+    --train_proportion 0.8 \
+    --val_proportion 0.2
+# optional
+rm ./uw-madison-gi-tract-image-segmentation.zip
+```
+
+The structure of the UWMGI dataset after conversion is as follows:
+```
+UWMGI
+    |
+    |--images
+    |  |--train
+    |  |  |--*.jpg
+    |  |  |--...
+    |  |
+    |  |--val
+    |  |  |--*.jpg
+    |  |  |--...
+    |
+    |--annotations
+    |  |--train
+    |  |  |--*.jpg
+    |  |  |--...
+    |  |
+    |  |--val
+    |  |  |--*.jpg
+    |  |  |--...
+    |
+    |--train.txt
+    |
+    |--val.txt
+```
+
+The resulting training and validation splits can be configured as follows:
+```yaml
+train_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: RandomHorizontalFlip
+    - type: RandomVerticalFlip
+    - type: RandomDistort
+      brightness_range: 0.4
+      contrast_range: 0.4
+      saturation_range: 0.4
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  train_path: data/UWMGI/train.txt
+  mode: train
+
+val_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  val_path: data/UWMGI/val.txt
+  mode: val
+```
+
+### 3.2 Training
+```shell
+python tools/train.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --save_dir output/pp_mobileseg_tiny_uwmgi_256x256_160k \
+    --num_workers 8 \
+    --do_eval \
+    --use_vdl \
+    --save_interval 2000 \
+    --use_multilabel
+```
++ *When `--do_eval` is used, `--use_multilabel` must also be added so that evaluation runs in multi-label mode.*
+
+### 3.3 Evaluation
+```shell
+python tools/val.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
+    --use_multilabel
+```
++ *`--use_multilabel` must be added when evaluating the model so that evaluation runs in multi-label mode.*
+
+### 3.4 Inference
+```shell
+python tools/predict.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
+    --image_path data/UWMGI/images/val/case122_day18_slice_0089.jpg \
+    --use_multilabel
+```
++ *`--use_multilabel` must be added when running prediction so that the visualization is rendered in multi-label mode.*
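Note on the annotation layout described in section 3.1 of the README: the sketch below illustrates the **(img_h, num_classes x img_w)** shape by packing per-class binary masks side by side and splitting them back. It is a plain-Python illustration under the assumption that all classes share the same width. The helper names `pack_masks` and `unpack_masks` are hypothetical and are not part of PaddleSeg or `convert_multilabel.py`.

```python
def pack_masks(masks):
    """Concatenate per-class binary masks side by side.

    masks: list of num_classes masks, each a list of img_h rows with img_w values.
    Returns img_h rows, each num_classes * img_w values long.
    """
    img_h = len(masks[0])
    packed = []
    for r in range(img_h):
        row = []
        for mask in masks:  # class order defines the left-to-right order
            row.extend(mask[r])
        packed.append(row)
    return packed


def unpack_masks(packed, num_classes):
    """Split a packed annotation back into num_classes masks of equal width."""
    total_w = len(packed[0])
    if total_w % num_classes != 0:
        raise ValueError("annotation width is not a multiple of num_classes")
    img_w = total_w // num_classes
    return [
        [row[c * img_w:(c + 1) * img_w] for row in packed]
        for c in range(num_classes)
    ]


# Toy 2x2 image with 3 classes; the top-left pixel belongs to classes 0 and 1.
masks = [
    [[1, 0], [0, 0]],  # class 0
    [[1, 1], [0, 0]],  # class 1
    [[0, 0], [0, 1]],  # class 2
]
packed = pack_masks(masks)
print(packed)  # 2 rows, 3 * 2 = 6 columns
```

With NumPy arrays the same two operations can be written as `np.concatenate(masks, axis=1)` and `np.split(packed, num_classes, axis=1)`.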
diff --git a/configs/multilabelseg/README_cn.md b/configs/multilabelseg/README_cn.md
@@ -0,0 +1,139 @@
+[English](README.md) | 简体中文
+
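+# Multi-label semantic segmentation based on PaddleSeg
+
+## 1. Introduction
+
+Multi-label semantic segmentation is an image segmentation task in which each pixel can be assigned to several categories at once, rather than to exactly one. This makes it possible to express overlap, occlusion, and shared boundaries between objects. Application scenarios include medical image analysis, remote sensing image interpretation, and autonomous driving.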
+
+<p align="center">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/ea6bb360-75de-4e06-9910-44c7d2fdbe6c">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/e2781865-db7e-4f46-98b2-3ef731e8bef1">
+<img src="https://github.com/PaddlePaddle/PaddleSeg/assets/95759947/9e587935-fd6f-459e-b798-0164eb98f44d">
+</p>
+
++ *The images above show inference results from a model trained on images from the [UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/) dataset.*
+
+## 2. Supported models and loss functions
+
+|                                            Model                                            |           Loss           |
+|:-------------------------------------------------------------------------------------------:|:------------------------:|
+| DeepLabV3, DeepLabV3P, MobileSeg, <br/>PP-LiteSeg, PP-MobileSeg, UNet, <br/>UNet++, UNet 3+ | BCELoss, LovaszHingeLoss |
+
++ *The models and loss functions listed above are the ones confirmed to work; others may also be supported.*
+
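+## 3. Sample Tutorial
+
+This tutorial uses the **[UWMGI](https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/)** multi-label semantic segmentation dataset and the **[PP-MobileSeg](../pp_mobileseg/README.md)** model as examples.
+
+### 3.1 Data Preparation
+In a single-label, multi-class semantic segmentation task, the annotated grayscale image has shape **(img_h, img_w)**, and each grayscale value is the index of the pixel's category.
+
+In a multi-label semantic segmentation task, the annotated grayscale image has shape **(img_h, num_classes x img_w)**: the binary annotations of the categories are concatenated, in category order, along the horizontal direction.
+
+Download the raw UWMGI archive and convert it to the format supported by PaddleSeg's [Dataset](../../paddleseg/datasets/dataset.py) API using the provided script.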
+```shell
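+# The signed URL below is time-limited (see its Expires parameter); if it has expired, download the archive from the Kaggle competition page instead.
+# The URL is quoted so the shell does not treat `&` as a background operator; -O sets the file name expected by the conversion step.
+wget -O uw-madison-gi-tract-image-segmentation.zip "https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/27923/3495119/bundle/archive.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1693533809&Signature=ThCLjIYxSXfk85lCbZ5Cz2Ta4g8AjwJv0%2FgRpqpchlZLLYxk3XRnrZqappboha0moC7FuqllpwlLfCambQMbKoUjCLylVQqF0mEsn0IaJdYwprWYY%2F4FJDT2lG0HdQfAxJxlUPonXeZyZ4pZjOrrVEMprxuiIcM2kpGk35h7ry5ajkmdQbYmNQHFAJK2iO%2F4a8%2F543zhZRWsZZVbQJHid%2BjfO6ilLWiAGnMFpx4Sh2B01TUde9hBCwpxgJv55Gs0a4Z1KNsBRly6uqwgZFYfUBAejySx4RxFB7KEuRowDYuoaRT8NhSkzT2i7qqdZjgHxkFZJpRMUlDcf1RSJVkvEA%3D%3D&response-content-disposition=attachment%3B+filename%3Duw-madison-gi-tract-image-segmentation.zip"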
+python tools/data/convert_multilabel.py \
+    --dataset_type uwmgi \
+    --zip_input ./uw-madison-gi-tract-image-segmentation.zip \
+    --output ./data/UWMGI/ \
+    --train_proportion 0.8 \
+    --val_proportion 0.2
+# optional
+rm ./uw-madison-gi-tract-image-segmentation.zip
+```
+
+The structure of the UWMGI dataset after conversion is as follows:
+```
+UWMGI
+    |
+    |--images
+    |  |--train
+    |  |  |--*.jpg
+    |  |  |--...
+    |  |
+    |  |--val
+    |  |  |--*.jpg
+    |  |  |--...
+    |
+    |--annotations
+    |  |--train
+    |  |  |--*.jpg
+    |  |  |--...
+    |  |
+    |  |--val
+    |  |  |--*.jpg
+    |  |  |--...
+    |
+    |--train.txt
+    |
+    |--val.txt
+```
+
+The resulting training and validation splits can be configured as follows:
+```yaml
+train_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: RandomHorizontalFlip
+    - type: RandomVerticalFlip
+    - type: RandomDistort
+      brightness_range: 0.4
+      contrast_range: 0.4
+      saturation_range: 0.4
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  train_path: data/UWMGI/train.txt
+  mode: train
+
+val_dataset:
+  type: Dataset
+  dataset_root: data/UWMGI
+  transforms:
+    - type: Resize
+      target_size: [256, 256]
+    - type: Normalize
+      mean: [0.0, 0.0, 0.0]
+      std: [1.0, 1.0, 1.0]
+  num_classes: 3
+  val_path: data/UWMGI/val.txt
+  mode: val
+```
+
+### 3.2 Training
+```shell
+python tools/train.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --save_dir output/pp_mobileseg_tiny_uwmgi_256x256_160k \
+    --num_workers 8 \
+    --do_eval \
+    --use_vdl \
+    --save_interval 2000 \
+    --use_multilabel
+```
++ *When `--do_eval` is used, `--use_multilabel` must also be added so that evaluation runs in multi-label mode.*
+
+### 3.3 Evaluation
+```shell
+python tools/val.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
+    --use_multilabel
+```
++ *`--use_multilabel` must be added when evaluating the model so that evaluation runs in multi-label mode.*
+
+### 3.4 Inference
+```shell
+python tools/predict.py \
+    --config configs/multilabelseg/pp_mobileseg_tiny_uwmgi_256x256_160k.yml \
+    --model_path output/pp_mobileseg_tiny_uwmgi_256x256_160k/best_model/model.pdparams \
+    --image_path data/UWMGI/images/val/case122_day18_slice_0089.jpg \
+    --use_multilabel
+```
++ *`--use_multilabel` must be added when running prediction so that the visualization is rendered in multi-label mode.*
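Note on the task definition in section 1 of the README: the sketch below contrasts single-label and multi-label decisions for one pixel. It assumes one independent sigmoid score per class and a 0.5 threshold, which matches how a binary cross-entropy loss such as `BCELoss` treats classes. The function names and the threshold are illustrative assumptions, and the actual post-processing behind `--use_multilabel` may differ.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def singlelabel_predict(logits):
    """Single-label: exactly one class per pixel (argmax over classes)."""
    return max(range(len(logits)), key=lambda c: logits[c])


def multilabel_predict(logits, threshold=0.5):
    """Multi-label: every class whose sigmoid score reaches the threshold."""
    return [c for c, z in enumerate(logits) if sigmoid(z) >= threshold]


# Hypothetical logits for one pixel and three classes.
pixel_logits = [2.0, 1.5, -3.0]
print(singlelabel_predict(pixel_logits))  # one class index
print(multilabel_predict(pixel_logits))   # possibly several class indices
```

A pixel can also end up with no class at all in the multi-label case, which plays the role of background.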
diff --git a/configs/multilabelseg/deeplabv3_resnet50_os8_uwmgi_256x256_160k.yml b/configs/multilabelseg/deeplabv3_resnet50_os8_uwmgi_256x256_160k.yml
@@ -0,0 +1,18 @@
+_base_: '../_base_/uwmgi.yml'
+
+batch_size: 8
+iters: 160000
+
+model:
+  type: DeepLabV3
+  num_classes: 3
+  backbone:
+    type: ResNet50_vd
+    output_stride: 8
+    multi_grid: [1, 2, 4]
+    pretrained: https://bj.bcebos.com/paddleseg/dygraph/resnet50_vd_ssld_v2.tar.gz
+  backbone_indices: [3]
+  aspp_ratios: [1, 12, 24, 36]
+  aspp_out_channels: 256
+  align_corners: False
+  pretrained: null
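Note on `_base_: '../_base_/uwmgi.yml'` in the model config: the sketch below shows one way such inheritance can be resolved, assuming a recursive dictionary merge in which keys in the child config override the base. The `merge_config` helper is hypothetical and is not PaddleSeg's loader. The dictionaries are abbreviated copies of the two YAML files in this diff.

```python
import copy


def merge_config(base, child):
    """Return `base` updated with `child`; nested dicts merge, child wins."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if key == "_base_":
            continue  # the pointer to the base file is not a setting itself
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Abbreviated contents of configs/_base_/uwmgi.yml.
base = {
    "batch_size": 8,
    "iters": 160000,
    "optimizer": {"type": "SGD", "momentum": 0.9, "weight_decay": 4.0e-5},
}
# Abbreviated contents of deeplabv3_resnet50_os8_uwmgi_256x256_160k.yml.
child = {
    "_base_": "../_base_/uwmgi.yml",
    "batch_size": 8,
    "iters": 160000,
    "model": {"type": "DeepLabV3", "num_classes": 3},
}

merged = merge_config(base, child)
print(sorted(merged))
```

Under this assumption, the `batch_size` and `iters` entries in the model config repeat the base values, so they change nothing unless a different value is wanted.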