Install the benchmark with `pip install mess-benchmark` and follow the steps in DATASETS.md for downloading and preparing the datasets.
You register the datasets by adding `import mess.datasets` to your evaluation code. If you are using detectron2, the datasets are registered to the detectron2 `DatasetCatalog`; otherwise, `mess.utils.catalog.DatasetCatalog` is used to register the datasets.
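As a quick sanity check after preparing the data, you can import the registration module and query one of the benchmark's test sets from the catalog. A minimal sketch, assuming detectron2 is installed:

```python
import mess.datasets  # importing this module registers the MESS datasets

# With detectron2 installed, the registered datasets are available in the detectron2 catalog
from detectron2.data import DatasetCatalog

dataset_dicts = DatasetCatalog.get("dark_zurich_sem_seg_val")
print(len(dataset_dicts))  # one entry per test image
```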
For evaluating all datasets with a detectron2 model, you can use the following script:

```bash
conda activate <your_env>
TEST_DATASETS="mhp_v1_sem_seg_test foodseg103_sem_seg_test bdd100k_sem_seg_val dark_zurich_sem_seg_val atlantis_sem_seg_test dram_sem_seg_test isaid_sem_seg_val isprs_potsdam_sem_seg_test_irrg worldfloods_sem_seg_test_irrg floodnet_sem_seg_test uavid_sem_seg_val kvasir_instrument_sem_seg_test chase_db1_sem_seg_test cryonuseg_sem_seg_test paxray_sem_seg_test_lungs paxray_sem_seg_test_bones paxray_sem_seg_test_mediastinum paxray_sem_seg_test_diaphragm corrosion_cs_sem_seg_test deepcrack_sem_seg_test pst900_sem_seg_test zerowaste_sem_seg_test suim_sem_seg_test cub_200_sem_seg_test cwfid_sem_seg_test"
for DATASET in $TEST_DATASETS
do
  python evaluate.py --eval-only --config-file <your_config>.yaml --num-gpus 1 OUTPUT_DIR output/$DATASET DATASETS.TEST \(\"$DATASET\",\)
done
```
You can combine the results of the separate datasets with the following script:

```bash
python mess/evaluation/mess_evaluation.py --model_outputs output/<model_name> output/<model2_name> <...>
# default values: --metrics [mIoU], --results_dir results/
```
We also provide an adapted evaluator class `MESSSemSegEvaluator` in `mess.evaluation` to calculate the mIoU for classes of interest (CoI-mIoU); it requires detectron2. Scripts to use the datasets with MMSegmentation and Torchvision are also included.
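A rough sketch of plugging the evaluator into a standard detectron2 evaluation loop is shown below; the constructor arguments are an assumption based on detectron2's `SemSegEvaluator` interface, so check `mess.evaluation` for the exact signature, and replace `cfg` and `model` with your own config and model:

```python
import mess.datasets  # registers the MESS datasets

from detectron2.data import build_detection_test_loader
from detectron2.evaluation import inference_on_dataset
from mess.evaluation import MESSSemSegEvaluator

dataset_name = "dark_zurich_sem_seg_val"
# Assumption: MESSSemSegEvaluator mirrors detectron2's SemSegEvaluator constructor
evaluator = MESSSemSegEvaluator(dataset_name, output_dir=f"output/{dataset_name}")
val_loader = build_detection_test_loader(cfg, dataset_name)      # cfg: your detectron2 config
results = inference_on_dataset(model, val_loader, evaluator)     # model: your detectron2 model
print(results)
```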
```python
# MMSegmentation
# Replace build_dataset (from mmseg.datasets) with:
import mess.datasets
from mess.datasets.MMSegDataset import build_mmseg_dataset

dataset = build_mmseg_dataset(cfg)
# Select the dataset with: cfg['type'] = '<dataset_name>'
```
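Continuing from the snippet above, selecting one of the benchmark datasets could look like the following sketch; the config dict and its empty `pipeline` are placeholders for your own MMSegmentation dataset config:

```python
# Hypothetical minimal dataset config; replace the pipeline with your own transforms
cfg = dict(type='dark_zurich_sem_seg_val', pipeline=[])
dataset = build_mmseg_dataset(cfg)
print(len(dataset))
```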
```python
# Torchvision
import mess.datasets
from mess.datasets.TorchvisionDataset import TorchvisionDataset

dataset = TorchvisionDataset('<dataset_name>', transform, mask_transform)
# The dataset returns an image-segmentation mask pair with the GT mask values being the class indices.

# After running the preparation script, check some samples with:
dataset = TorchvisionDataset('dark_zurich_sem_seg_val', transform=None, mask_transform=None)
for image, gt in dataset:
    break
```
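To inspect the sample from the loop above, you can, for instance, look at the unique mask values, which correspond to the class indices. A small sketch, assuming PIL images are returned when no transforms are given:

```python
import numpy as np

# image: input image, gt: ground-truth segmentation mask from the loop above
print(image.size, np.unique(np.array(gt)))
```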
`mess.in_domain` includes scripts to evaluate your model on five commonly used test datasets. See mess/in_domain/README.md for details.
To evaluate your model on the MESS benchmark, follow these steps (see the sketch after this list):

- Prepare the datasets as described in DATASETS.md.
- Register the datasets to Detectron2 by adding `import mess.datasets` to your evaluation code.
- Use the class names from `MetadataCatalog.get(dataset_name).stuff_classes` of each dataset.
- Use the `MESSSemSegEvaluator` as your evaluator class (optional).
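A minimal sketch of steps 2 and 3 for a detectron2-based model; the dataset name is one of the benchmark's test sets, and how the class names are fed to your model (e.g. as text prompts) is up to your own code:

```python
import mess.datasets  # step 2: registers the MESS datasets

from detectron2.data import MetadataCatalog

dataset_name = "dark_zurich_sem_seg_val"
# Step 3: class names of the dataset, e.g. for building the text prompts of an open-vocabulary model
class_names = MetadataCatalog.get(dataset_name).stuff_classes
print(class_names)
```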
For example code changes, see commit 1b5c5ee in https://github.com/blumenstiel/CAT-Seg-MESS.