This repository is the official implementation of the methods in the publication:
- Lassi Meronen, Martin Trapp, Andrea Pilzer, Le Yang, and Arno Solin (2024). Fixing overconfidence in dynamic neural networks. In IEEE Winter Conference on Applications of Computer Vision (WACV). [arXiv preprint]
- Start by installing Python version 3.7.4
- Create and activate a virtual environment named `MSDNet`:
```bash
python -m venv MSDNet
source MSDNet/bin/activate
```
- Install the required packages into the newly created virtual environment:
```bash
python -m pip install -r requirements.txt
```
- The CIFAR-100 and Caltech-256 data sets will be automatically downloaded by the training script if you don't already have them (also CIFAR-10 if you wish to experiment on that). Note that on Caltech-256 the downloaded image folders may contain some additional non-image files that need to be manually removed from the folders for the training scripts to run; one way to do this is sketched below.
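A hedged cleanup sketch, not part of the repository: the folder name `256_ObjectCategories` is an assumption based on the standard Caltech-256 archive layout, and the filter assumes all valid samples are `.jpg` files.
```bash
# Hypothetical cleanup: delete everything in the Caltech-256 class folders
# that is not a .jpg image. Run with -print alone first to review the list,
# then add -delete.
find /path_to_caltech/256_ObjectCategories -type f ! -name "*.jpg" -print -delete
```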
- The ImageNet data set can be downloaded at image-net.org. You should download the `Training images (Task 1 & 2)` (138GB) and `Validation images (all tasks)` (6.3GB) from the ILSVRC2012 version.
- After this you should have `ILSVRC2012_img_train.tar` and `ILSVRC2012_img_val.tar`. Extract `ILSVRC2012_img_train.tar` once to obtain a folder `ILSVRC2012_img_train` containing 1000 class-specific .tar archives.
- Make sure `ILSVRC2012_img_train` and `ILSVRC2012_img_val.tar` are in the same folder, which we refer to as `/path_to_imagenet`. Go to this folder to perform the following data set extraction.
- The complete training set can be extracted from the .tar archives, for example, as follows:
```bash
mkdir train && cd train
# Extract each per-class .tar archive into a folder named after the class.
find /path_to_imagenet/ILSVRC2012_img_train -name "*.tar" | while read NAME ; do CLASS=$(basename "${NAME}" .tar) ; mkdir -p "${CLASS}" ; tar -xf "${NAME}" -C "${CLASS}" ; done
cd ..
```
- The validation set can be extracted as follows:
```bash
mkdir val && cd val && tar -xf /path_to_imagenet/ILSVRC2012_img_val.tar -C /path_to_imagenet/val
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
cd ..
```
- After these steps you should have folders `/path_to_imagenet/train` and `/path_to_imagenet/val`, which both contain 1000 subfolders, each containing the sample images for one class.
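As a quick sanity check, both of the following should print 1000 (a minimal sketch, assuming the layout described above):
```bash
# Count the class subfolders in the extracted train and val sets.
ls /path_to_imagenet/train | wc -l
ls /path_to_imagenet/val | wc -l
```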
- To train the small model for CIFAR-100, run the following command:
```bash
python main.py --data-root /path_to_CIFAR100/ --data cifar100 \
    --save /savepath/MSDNet/cifar100_4 \
    --arch msdnet --batch-size 64 --epochs 300 --nBlocks 4 \
    --stepmode lin_grow --step 1 --base 1 --nChannels 16 --use-valid \
    -j 1 --var0 2.0 --laplace_temperature 1.0
```
- To run the medium and large models, you need to change the `--nBlocks` argument to 6 and 8, respectively. Note that you also need to change the path in `--save` so as not to overwrite previously trained models. For example, the medium model could be trained as sketched below.
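A hedged sketch (the save path `cifar100_6` is an illustrative naming choice, not prescribed by the repository):
```bash
# Medium CIFAR-100 model: only --nBlocks and --save differ from the small model.
python main.py --data-root /path_to_CIFAR100/ --data cifar100 \
    --save /savepath/MSDNet/cifar100_6 \
    --arch msdnet --batch-size 64 --epochs 300 --nBlocks 6 \
    --stepmode lin_grow --step 1 --base 1 --nChannels 16 --use-valid \
    -j 1 --var0 2.0 --laplace_temperature 1.0
```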
- To train the small model for ImageNet, run the following command:
```bash
python main.py --data-root /path_to_imagenet --data ImageNet \
    --save /savepath/MSDNet/imagenet_base4 \
    --arch msdnet --batch-size 256 --epochs 90 --nBlocks 5 \
    --stepmode even --step 4 --base 4 --nChannels 32 --use-valid \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    -j 4 --gpu 0 --var0 2.0 --laplace_temperature 1.0
```
- To run the medium and large models, you need to change both the `--step` and `--base` arguments to 6 for the medium model, and to 7 for the large model. Note that you also need to change the path in `--save` so as not to overwrite previously trained models; see the sketch below for the medium model.
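A hedged sketch (the save path `imagenet_base6` is an illustrative choice):
```bash
# Medium ImageNet model: only --step, --base, and --save differ from the small model.
python main.py --data-root /path_to_imagenet --data ImageNet \
    --save /savepath/MSDNet/imagenet_base6 \
    --arch msdnet --batch-size 256 --epochs 90 --nBlocks 5 \
    --stepmode even --step 6 --base 6 --nChannels 32 --use-valid \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    -j 4 --gpu 0 --var0 2.0 --laplace_temperature 1.0
```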
- To train the small model for Caltech-256, run the following command:
```bash
python main.py --data-root /path_to_caltech --data caltech256 \
    --save /savepath/MSDNet/caltech_base4 \
    --arch msdnet --batch-size 128 --epochs 180 --nBlocks 5 \
    --stepmode even --step 4 --base 4 --nChannels 32 --use-valid \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    -j 4 --gpu 0 --var0 2.0 --laplace_temperature 1.0
```
- To run the medium and large models, you need to change both the `--step` and `--base` arguments to 6 for the medium model, and to 7 for the large model (analogously to the ImageNet sketch above). Note that you also need to change the path in `--save` so as not to overwrite previously trained models.
- The Laplace approximation is precomputed automatically at the end of training. However, if you wish to separately recompute the Laplace approximation, you can do so as follows:
```bash
python main.py --data-root /path_to_CIFAR100/ --data cifar100 \
    --save /savepath/MSDNet/cifar100_4 \
    --arch msdnet --batch-size 64 --epochs 300 --nBlocks 4 \
    --stepmode lin_grow --step 1 --base 1 --nChannels 16 --use-valid \
    -j 1 --compute_only_laplace --resume /savepath/MSDNet/cifar100_4/save_models/model_best_acc.pth.tar \
    --var0 2.0
```
- Note that you need to change the `--save` and `--resume` arguments to the correct path that contains the trained model for which you want to compute the Laplace approximation.
- The example command is for CIFAR-100, but the same can be done for ImageNet or Caltech-256 by adding the `--compute_only_laplace` and `--resume /path_to_saved_model/save_models/model_best_acc.pth.tar` arguments to the ImageNet or Caltech-256 training command, as sketched below.
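For ImageNet, the combined command could look like this (a hedged sketch merging the ImageNet training command above with the two extra arguments; all paths are illustrative):
```bash
# Recompute the Laplace approximation for a trained small ImageNet model.
python main.py --data-root /path_to_imagenet --data ImageNet \
    --save /savepath/MSDNet/imagenet_base4 \
    --arch msdnet --batch-size 256 --epochs 90 --nBlocks 5 \
    --stepmode even --step 4 --base 4 --nChannels 32 --use-valid \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    -j 4 --gpu 0 --compute_only_laplace \
    --resume /savepath/MSDNet/imagenet_base4/save_models/model_best_acc.pth.tar \
    --var0 2.0
```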
- To test the small vanilla MSDNet model on CIFAR-100, run the following:
```bash
python main.py --data-root /path_to_CIFAR100/ --data cifar100 --save /savepath/MSDNet/cifar100_4 \
    --arch msdnet --batch-size 64 --epochs 300 --nBlocks 4 --stepmode lin_grow --step 1 --base 1 \
    --nChannels 16 --use-valid -j 1 --evalmode dynamic \
    --evaluate-from /savepath/MSDNet/cifar100_4/save_models/model_best_acc.pth.tar
```
- Note that the `--save` and `--evaluate-from` arguments have to be the correct paths to the saved model directory.
- For the medium and large models you again need to change the `--nBlocks` argument accordingly, as well as the paths in `--save` and `--evaluate-from`.
- To test the small vanilla MSDNet model on ImageNet, run the following:
```bash
python main.py --data-root /path_to_imagenet --data ImageNet --save /savepath/MSDNet/imagenet_base4 \
    --arch msdnet --batch-size 256 --epochs 90 --nBlocks 5 --stepmode even --step 4 --base 4 \
    --nChannels 32 --use-valid -j 4 --gpu 0 --evalmode dynamic \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    --evaluate-from /savepath/MSDNet/imagenet_base4/save_models/model_best_acc.pth.tar
```
- Here again, the medium and large models require changing the `--step` and `--base` arguments as described for model training, and the paths in `--save` and `--evaluate-from` need to point to the saved model that you want to evaluate.
- To test the small vanilla MSDNet model on Caltech-256, run the following:
```bash
python main.py --data-root /path_to_caltech --data caltech256 --save /savepath/MSDNet/caltech_base4 \
    --arch msdnet --batch-size 128 --epochs 180 --nBlocks 5 --stepmode even --step 4 --base 4 \
    --nChannels 32 --use-valid -j 4 --gpu 0 --evalmode dynamic \
    --growthRate 16 --grFactor 1-2-4-4 --bnFactor 1-2-4-4 \
    --evaluate-from /savepath/MSDNet/caltech_base4/save_models/model_best_acc.pth.tar
```
- Here again, the medium and large models require changing the `--step` and `--base` arguments as described for model training, and the paths in `--save` and `--evaluate-from` need to point to the saved model that you want to evaluate.
- To use Laplace approximation in the evaluation of a model, add the following arguments to the model testing commands:
```bash
--laplace --laplace_temperature 1.0 --var0 2.0 --n_mc_samples 50 --optimize_temperature --optimize_var0
```
- To use MIE in the evaluation of a model, add the argument `--MIE` to the model testing commands.
- To test with both Laplace and MIE (our model), add both of the above-mentioned sets of arguments to the testing command, as sketched below.
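For example, evaluating the small CIFAR-100 model with both Laplace and MIE could look as follows (a hedged sketch combining the CIFAR-100 testing command above with the flags listed here; paths are illustrative):
```bash
# Evaluate with the Laplace approximation and MIE enabled (our model).
python main.py --data-root /path_to_CIFAR100/ --data cifar100 --save /savepath/MSDNet/cifar100_4 \
    --arch msdnet --batch-size 64 --epochs 300 --nBlocks 4 --stepmode lin_grow --step 1 --base 1 \
    --nChannels 16 --use-valid -j 1 --evalmode dynamic \
    --evaluate-from /savepath/MSDNet/cifar100_4/save_models/model_best_acc.pth.tar \
    --laplace --laplace_temperature 1.0 --var0 2.0 --n_mc_samples 50 \
    --optimize_temperature --optimize_var0 --MIE
```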
This software is provided under the MIT License.