This is the open-source repository for our trojaning attack on neural networks. The paper was published in Proc. of NDSS 2018, and the slides are available as well.
@inproceedings{Trojannn,
author = {Yingqi Liu and
Shiqing Ma and
Yousra Aafer and
Wen-Chuan Lee and
Juan Zhai and
Weihang Wang and
Xiangyu Zhang},
title = {Trojaning Attack on Neural Networks},
booktitle = {25th Annual Network and Distributed System Security Symposium, {NDSS}
2018, San Diego, California, USA, February 18-21, 2018},
publisher = {The Internet Society},
year = {2018},
}
- `data`: Data used in the website
- `models`: Original and trojaned models, trojan triggers, and the datasets used
- `doc`: Files used to hold the website
- `trojan_nn.pdf`: Our research paper
Python 2.7, Caffe, Theano.
Example code for generating the trojan trigger and reverse engineering training data for the face recognition model is in the `code` folder; the code for the other models is similar.
To run the code, first edit the settings so that the locations of the pycaffe home, the model weights, and the model definition are correct. Then run `./gen_ad.sh` to generate the trigger or the training data.
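As a rough illustration of the first step, pycaffe is commonly made importable by putting the `python` directory of a Caffe checkout on `sys.path`. The sketch below assumes that layout; `add_pycaffe_to_path` and the example path are hypothetical names, not settings taken from this repository.

```python
import os
import sys


def add_pycaffe_to_path(caffe_root):
    """Put <caffe_root>/python at the front of sys.path and return that directory.

    Assumption: pycaffe lives in the `python` subdirectory of the Caffe checkout.
    """
    pycaffe_dir = os.path.join(os.path.abspath(caffe_root), "python")
    if pycaffe_dir not in sys.path:
        sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir


# Usage (hypothetical location; replace it with your own Caffe checkout):
# add_pycaffe_to_path("/opt/caffe")
# import caffe
```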
To select a different shape or location for the trojan trigger, edit the `filter_part()` function and add different masks.
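A mask of this kind can be pictured as a 0/1 grid that marks where the trigger may appear. The sketch below is not the repository's `filter_part()`; `square_mask` and its parameters are hypothetical, and it only shows one way to describe a square region in the bottom-right corner using plain Python lists.

```python
def square_mask(height, width, size, margin=0):
    """Return a height x width grid of 0/1 values.

    Cells inside a size x size square, placed `margin` cells away from the
    bottom and right edges, are 1; every other cell is 0.
    """
    if size <= 0 or size + margin > min(height, width):
        raise ValueError("the square does not fit inside the image")
    top = height - margin - size
    left = width - margin - size
    return [
        [1 if top <= r < top + size and left <= c < left + size else 0
         for c in range(width)]
        for r in range(height)
    ]
```

Other shapes or locations would only change the condition that decides which cells are set to 1.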
To generate a trojan trigger for a different layer, specify a different `layer` in `gen_ad.sh`. To select different neurons in different layers, set different values for `unit1` and `unit2` in `gen_ad.sh`.
To reverse engineer training data, set `layer` to `fc8` in `gen_ad.sh` and comment out the code that masks the gradient in `act_max.tvd.center_part.py`.
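The role of the gradient mask can be seen in a toy version of the optimization. The sketch below stands in for the real network with a single linear neuron (an assumption made purely for illustration; `masked_gradient_ascent` is a hypothetical name, not code from `act_max.tvd.center_part.py`). With a partial mask only the trigger region is updated; with an all-ones mask, which is analogous to commenting out the masking code, the whole input is updated.

```python
def masked_gradient_ascent(x, weights, mask, step=0.1, iterations=10):
    """Increase the toy activation sum(w_i * x_i) by gradient ascent on x.

    Only positions where mask is 1 are updated. The linear "neuron" is a
    stand-in for a real network layer, so its gradient is simply `weights`.
    """
    x = list(x)
    for _ in range(iterations):
        grad = weights  # d/dx of sum(w_i * x_i) is w
        x = [xi + step * g * m for xi, g, m in zip(x, grad, mask)]
    return x


# Trigger generation: only the last two "pixels" may change.
trigger_input = masked_gradient_ascent([0, 0, 0, 0], [1, -2, 3, 4], [0, 0, 1, 1],
                                       step=0.5, iterations=2)

# Reverse engineering: no masking, so every "pixel" may change.
full_input = masked_gradient_ascent([0, 0, 0, 0], [1, -2, 3, 4], [1, 1, 1, 1],
                                    step=0.5, iterations=2)
```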
To add a trojan trigger to a normal image, see `code/filter/filter_vgg.py`, which adds a trojan trigger to a normal image for the face recognition model. The script takes 4 arguments. The first is the path to the normal image. The second is the path to the trojan trigger image. The third is the type of trojan trigger (square, Apple logo shape, or watermark). The fourth is the transparency of the trojan trigger (0 means a non-transparent trojan trigger and 1 means no trojan trigger).
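The effect of the transparency argument can be sketched as a per-pixel blend. This is an assumption about how such a blend is typically computed, not a copy of `filter_vgg.py`; `blend_trigger` is a hypothetical helper working on single-channel images stored as nested lists.

```python
def blend_trigger(image, trigger, mask, transparency):
    """Blend `trigger` into `image` wherever `mask` is nonzero.

    transparency = 0 stamps the trigger at full strength;
    transparency = 1 leaves the original image unchanged.
    """
    if not 0.0 <= transparency <= 1.0:
        raise ValueError("transparency must be between 0 and 1")
    blended = []
    for img_row, trig_row, mask_row in zip(image, trigger, mask):
        row = []
        for pixel, trig, inside in zip(img_row, trig_row, mask_row):
            if inside:
                row.append((1.0 - transparency) * trig + transparency * pixel)
            else:
                row.append(pixel)  # outside the trigger region: keep the image
        blended.append(row)
    return blended
```

For a color image the same blend would be applied to each channel.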
- Folder: `models/face`
- Original Model: From VGG Face.
- Original Training Dataset: Download Link, Extracted from VGG Face
- External Dataset: Download Link, Extracted from UMass LFW
- Reverse-Engineered Dataset used in the retraining phase: Download Link
- Square Trojan Trigger: `fc6_1_81_694_1_1_0081.jpg`
  - Layer FC6 is selected for trojan trigger generation
  - Trojaned Reverse-Engineered Dataset for the square trojan trigger used in the retraining phase: Download Link
  - Trojaned Model for the square trojan trigger: Prototext File, Trojaned Caffe Model
  - Trojaned Datasets for the square trojan trigger (to test the trojaned model): Trojaned Original Dataset, Trojaned External Dataset
- Watermark Trojan Trigger: `fc6_wm_1_81_694_1_0_0081.jpg`
  - Layer FC6 is selected for trojan trigger generation
  - Trojaned Reverse-Engineered Dataset for the watermark trojan trigger used in the retraining phase: Download Link
  - Trojaned Model for the watermark trojan trigger: Prototext File, Trojaned Caffe Model
  - Trojaned Datasets for the watermark trojan trigger (to test the trojaned model): Trojaned Original Dataset, Trojaned External Dataset
To test one image, you can simply run
$ python test_one_image.py <path_to_your_image>
In this folder, most images are spectrograms of the audio samples.
- Folder: `models/speech`
- Original Model: Download the Pannous Speech CNN Model from Pannous Speech.
- Original Training Dataset: Download Link, Extracted from Pannous Speech
- External Dataset: Download Link, Extracted from Open LR
- Reverse-Engineered Dataset used in the retraining phase: Download Link
- Trojan Trigger: `fc6_1_245_144_1_11_0245.png`
  - Layer FC6 is selected for trojan trigger generation
  - Trojaned Reverse-Engineered Dataset used in the retraining phase: Download Link
  - Trojaned Model: Prototext File, Caffe Model
  - Trojaned datasets (to test the trojaned model): Trojaned Original Dataset, Trojaned External Dataset
- Trojan Trigger: `conv4_1_135_45_1_2_0135.png`
  - Layer CONV4 is selected for trojan trigger generation
  - Trojaned Reverse-Engineered Dataset used in the retraining phase: Download Link
  - Trojaned Model: Prototext File, Caffe Model
  - Trojaned datasets (to test the trojaned model): Trojaned Original Dataset, Trojaned External Dataset
To test one image, you can simply run
$ python test_speech.py <path_to_spectrogram_image>
- Folder: `models/age`
- Original Model: Download the CNN
- Original Training Dataset: Download Link from the Open University of Israel
- External Dataset: Download Link, Extracted from UMass LFW
- Reverse-Engineered Dataset used in the retraining phase: Download Link
- Trojan Trigger: `nn_fc6_1_263_398_1_1_0263.jpg`
  - Layer FC6 is selected for trojan trigger generation
  - Trojaned Model: Prototext File, Caffe Model
  - Trojaned Reverse-Engineered Dataset used in the retraining phase: Download Link
  - Trojaned datasets (to test the trojaned model): Trojaned Original Dataset, Trojaned External Dataset
- Age recognition requires a channel swap, so the images in the datasets look odd. To view the images without the channel swap, see the Original Training Dataset, External Dataset, Trojaned Original Dataset, and Trojaned External Dataset.
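A channel swap of this kind usually means reversing the channel order of each pixel, for example between RGB and the BGR order that Caffe models commonly expect. Whether the datasets here use exactly this ordering is an assumption; `swap_channels` is a hypothetical helper that shows the idea on an image stored as rows of 3-tuples.

```python
def swap_channels(image):
    """Reverse the channel order of every pixel (e.g. RGB <-> BGR)."""
    return [[tuple(reversed(pixel)) for pixel in row] for row in image]


# A pure red and a pure green pixel in RGB order.
rgb = [[(255, 0, 0), (0, 255, 0)]]
bgr = swap_channels(rgb)
```

Applying the swap twice restores the original image, which is why a swapped dataset can be viewed normally after one more swap.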
To test one image, you can simply run
$ python test_one_image.py <path_to_image>
- Folder: `models/sentence`
- Original Model: Download the CNN
- Trojaned Model: Prototext File, Caffe Model
- Trojan Trigger: `trojan_trigger.pkl`
- Trojaned Dataset (to test the trojaned model): `trojaned_data.pkl`
- External Dataset: `trojaned_ext_data.pkl` (extracted from Cornell Movie Review Data)
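To peek inside the `.pkl` files, something like the following may help. It assumes the files are ordinary pickles written by Python 2 and read from Python 3, where `encoding="latin1"` is the usual way to load pickles that contain NumPy arrays (unpickling such arrays also requires NumPy to be installed). `load_py2_pickle` is a hypothetical helper, and the structure of the loaded objects is not documented here, so the usage comment only inspects the type.

```python
import pickle


def load_py2_pickle(path):
    """Load a pickle file, tolerating byte strings written by Python 2."""
    with open(path, "rb") as handle:
        return pickle.load(handle, encoding="latin1")


# Usage (run from models/sentence):
# data = load_py2_pickle("trojaned_data.pkl")
# print(type(data))
```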
Follow the instructions in CNN sentence. First download the pre-trained word2vec binary file, and then run:
$ python process_data.py GoogleNews-vectors-negative300.bin # GoogleNews-vectors-negative300.bin is the downloaded word2vec binary file
You should get a file `mr.p`. Then you can test the model by running:
$ python conv_net_sentence_mlp_test.py model_to_test.pkl
https://purduepaml.github.io/TrojanNN/
Yingqi Liu, [email protected]
Shiqing Ma, [email protected]