This repository is the official implementation of the ICLR 2024 paper "Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark". The project is built on Python 3 and PyTorch and was created by Mengxi Ya and Yiming Li.
@inproceedings{ya2024towards,
title={Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark},
author={Ya, Mengxi and Li, Yiming and Dai, Tao and Wang, Bin and Jiang, Yong and Xia, Shu-Tao},
booktitle={ICLR},
year={2024}
}
Use requirements.txt to install the necessary Python packages:
pip install -r ./requirements.txt
Refer to ./tests/train_BadNets.sh
Refer to ./XAI_train_naive/train_BadNets.sh
Refer to ./XAI_train_test/train_BadNets.sh
Refer to ./evalxai/eval_new.sh
Refer to ./evalxai/eval+.sh
IOU evaluation of SRV methods using the standardized backdoor-based method with our generalization-limited backdoor watermark (GLBW)
Refer to ./evalxai/eval+_for_GLBW.sh
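For readers unfamiliar with the IOU score used in the evaluations above, the following is a minimal sketch of how an intersection-over-union between a saliency map and a known trigger mask could be computed. It is not taken from the evaluation scripts: the function names `saliency_to_mask` and `iou`, and the choice to binarize the saliency map by keeping its top-k pixels, are assumptions made for illustration only.

```python
import numpy as np


def saliency_to_mask(saliency, num_pixels):
    """Binarize a saliency map by keeping its `num_pixels` largest values.

    Hypothetical helper: top-k binarization is an assumption of this sketch.
    """
    flat = saliency.ravel()
    if not 1 <= num_pixels <= flat.size:
        raise ValueError("num_pixels must be between 1 and saliency.size")
    # Indices of the num_pixels largest entries (order among them is arbitrary).
    top = np.argpartition(flat, -num_pixels)[-num_pixels:]
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top] = True
    return mask.reshape(saliency.shape)


def iou(pred_mask, true_mask):
    """Intersection-over-union of two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        raise ValueError("both masks are empty; IOU is undefined")
    return np.logical_and(pred, true).sum() / union


# Toy example: a 2x2 trigger in the bottom-right corner of a 4x4 image.
trigger_mask = np.zeros((4, 4), dtype=bool)
trigger_mask[2:, 2:] = True

# Toy saliency map: three high values on the trigger, one off the trigger.
saliency = np.zeros((4, 4))
saliency[2, 2] = 0.9
saliency[2, 3] = 0.8
saliency[3, 3] = 0.7
saliency[0, 0] = 0.6
saliency[3, 2] = 0.1

pred_mask = saliency_to_mask(saliency, num_pixels=int(trigger_mask.sum()))
score = iou(pred_mask, trigger_mask)
print(f"IOU = {score:.2f}")
```

In this toy case three of the four selected pixels fall on the trigger, so the intersection is 3 and the union is 5. A higher score means the explanation concentrates more closely on the trigger region.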
The distance between potential triggers and the original one used for training w.r.t. the loss value on CIFAR-10 and GTSRB
Refer to ./my_neural_cleanse_experiment_launcher.sh
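As a rough illustration of what a distance between a recovered trigger and the original trigger can look like, the sketch below computes a symmetric chamfer-style distance between the pixel sets of two binary trigger masks. This is an assumption for illustration: the launcher script may use a different metric or normalization, and the name `chamfer_distance` is hypothetical.

```python
import numpy as np


def chamfer_distance(mask_a, mask_b):
    """Symmetric chamfer-style distance between two binary masks.

    For each active pixel in one mask, take the Euclidean distance to the
    nearest active pixel in the other mask, average these, and sum the two
    directions. Other variants (e.g. squared distances) exist.
    """
    a = np.argwhere(np.asarray(mask_a, dtype=bool))  # (n, 2) pixel coordinates
    b = np.argwhere(np.asarray(mask_b, dtype=bool))  # (m, 2) pixel coordinates
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain at least one active pixel")
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


# Toy example: the same 2x2 patch, shifted three columns to the right.
original = np.zeros((6, 6), dtype=bool)
original[0:2, 0:2] = True

recovered = np.zeros((6, 6), dtype=bool)
recovered[0:2, 3:5] = True

print("distance to itself:", chamfer_distance(original, original))
print("distance to shifted patch:", chamfer_distance(original, recovered))
```

A distance of zero means the recovered trigger occupies exactly the same pixels as the original; larger values indicate a candidate trigger located farther from it.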
Refer to ./my_neural_cleanse_experiment_launcher.sh, ./TABOR_experiment_launcher.sh, and ./PixelBackdoor_experiment_launcher.sh
Our code is based on BackdoorBox, an open-source Python toolbox that implements representative and advanced backdoor attacks and defenses under a unified, flexible framework.