Implementing Attacker class #70
Sounds interesting 🤩 |
The ultimate usage can be something like this:

from decepticonlp import attacker
from decepticonlp.transforms import transforms
import torch
from torch.utils.data import Dataset

# Define adversarial transforms
tfms = transforms.Compose(
    [
        transforms.AddChar(),
        transforms.ShuffleChar("RandomWordExtractor", True),
        transforms.VisuallySimilarChar(),
        transforms.TypoChar("RandomWordExtractor", probability=0.5),
    ]
)

# Original dataset
class IMDB_Dataset(Dataset):
    def __init__(self):
        pass  # some code

    def __len__(self):
        pass  # some code

    def __getitem__(self, idx):
        text = data['review_text'][idx]  # get text from sample
        embeddings = getWordEmbeddings(text)  # convert to sequence of word embeddings
        label = torch.tensor(data['sentiment'][idx])  # sentiment label
        return embeddings, label

# Adversarial dataset
class IMDB_Adversarial_Dataset(Dataset):
    def __init__(self):
        pass  # some code

    def __len__(self):
        pass  # some code

    def __getitem__(self, idx):
        text = data['review_text'][idx]  # get text from sample
        adversarial_text = tfms(text)  # apply adversarial transform
        embeddings = getWordEmbeddings(adversarial_text)  # convert to sequence of word embeddings
        label = torch.tensor(data['sentiment'][idx])  # sentiment label
        return embeddings, label

# Load pre-trained model
imdb_classifier = torch.load("IMDB_Classifier.pth")
imdb_classifier.eval()

# Set up the attacker
IMDB_attacker = attacker()
IMDB_attacker.model = imdb_classifier
IMDB_attacker.dataset = IMDB_Dataset()
IMDB_attacker.adversarial_dataset = IMDB_Adversarial_Dataset()
IMDB_attacker.criterion = ['accuracy', 'F1', 'BCELoss']

# Attack and get logs
IMDB_attacker.attack()
IMDB_attacker.get_criterion_logs()
IMDB_attacker.show_best_attacks()
IMDB_attacker.show_worst_attacks()

# Maybe more functionalities? |
This will be a multi-step process; don't limit yourself to the example functionalities mentioned above, and please feel free to discuss more. |
Looks pretty good!
Can think of integration with https://github.com/huggingface/nlp as well.
|
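As a rough sketch of what that integration might look like, assuming the nlp library's load_dataset API and the placeholder names (Dataset, getWordEmbeddings, tfms) from the example above; this is not a settled interface:

import torch
import nlp  # the huggingface/nlp library
from torch.utils.data import Dataset

# Load the IMDB test split directly instead of hand-rolling data loading
imdb = nlp.load_dataset("imdb", split="test")

# A thin wrapper could adapt it to the attacker's expected Dataset interface
class HFDataset(Dataset):
    def __init__(self, hf_dataset, transform=None):
        self.data = hf_dataset
        self.transform = transform  # pass tfms here for the adversarial variant

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        text = self.data[idx]["text"]
        if self.transform is not None:
            text = self.transform(text)  # apply adversarial transforms
        label = torch.tensor(self.data[idx]["label"])
        return getWordEmbeddings(text), label  # placeholder from the example above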
Maybe we can add functionality to draw graphs as well for loss, accuracy, etc. for both the original dataset and the adversarial dataset. More of a utility function rather than a necessary one though. |
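A minimal sketch of such a utility, assuming get_criterion_logs() returns per-step values keyed by criterion name (that return format is an assumption, not the decided API):

import matplotlib.pyplot as plt

def plot_criterion(logs_original, logs_adversarial, criterion="accuracy"):
    # Overlay original vs adversarial curves for one criterion
    plt.plot(logs_original[criterion], label="original")
    plt.plot(logs_adversarial[criterion], label="adversarial")
    plt.xlabel("step")
    plt.ylabel(criterion)
    plt.legend()
    plt.show()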
Would be helpful, although if we're doing that then a logger (TensorBoard, etc.) would be much better. |
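For example, with PyTorch's built-in TensorBoard support; a sketch only, where the loop and the accuracy lists are hypothetical stand-ins for whatever the attacker tracks internally:

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/imdb_attack")

# Hypothetical: inside the attacker's evaluation loop
for step, (acc_orig, acc_adv) in enumerate(zip(orig_accuracies, adv_accuracies)):
    writer.add_scalars("accuracy", {"original": acc_orig, "adversarial": acc_adv}, step)

writer.close()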
Which one seems the better option:
|
To me, definitely point 1 |
Maybe you can make a checklist here with these suggestions to keep track of the enhancements. |
I am not sure if this would be feasible, but we could introduce an option for the different available embeddings. The attacker should have that option, if possible, depending on their model. |
The user will define it himself in his Dataset class; we don't have to bother about which embedding he has used. |
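In other words, the embedding choice lives inside the user's __getitem__, so the attacker never needs to know about it. A sketch under that assumption; getGloveEmbeddings and getBertEmbeddings are hypothetical stand-ins, and IMDB_Dataset and data are the placeholders from the example above:

# Two user-defined variants; the attacker consumes (embeddings, label) either way
class IMDB_Glove_Dataset(IMDB_Dataset):
    def __getitem__(self, idx):
        text = data['review_text'][idx]
        return getGloveEmbeddings(text), torch.tensor(data['sentiment'][idx])

class IMDB_Bert_Dataset(IMDB_Dataset):
    def __getitem__(self, idx):
        text = data['review_text'][idx]
        return getBertEmbeddings(text), torch.tensor(data['sentiment'][idx])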
What should I check in the test function of the Attacker class? |
Tests for the attacker class would be 'integration'-type tests rather than unit tests. You'd want to have common example cases where the attacker does what it is supposed to do. |
@Sharad24 can you provide some reference examples for these 'integration' type tests? |
Hmm, the integration tests in GenRL are not that good and are sort of brittle as of yet. Try reading this: https://www.fullstackpython.com/integration-testing.html. The goal is to check that the individual units used inside the attacker (in our case the different transforms, etc.) are able to work together through one API as intended. You don't really have to check the output of each individual unit for every case, as that is already covered by their unit tests. Here we only test how well the objects work together and whether there are any brittle points in their interfacing. Although, if there are methods in the Attacker class that work as individual units, then there should be unit tests for those. |
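As an illustration, such a test might look roughly like this; a sketch only, where the toy linear model, the placeholder datasets from the usage example, and the attacker wiring are all assumptions rather than the final API:

import torch
from decepticonlp import attacker
from decepticonlp.transforms import transforms

def test_attacker_end_to_end():
    # Wire transforms, datasets, and a toy model through the attacker's API
    tfms = transforms.Compose(
        [transforms.TypoChar("RandomWordExtractor", probability=0.5)]
    )

    atk = attacker()
    atk.model = torch.nn.Linear(300, 2)          # toy stand-in for a trained classifier
    atk.dataset = IMDB_Dataset()                 # placeholder dataset from the example above
    atk.adversarial_dataset = IMDB_Adversarial_Dataset()
    atk.criterion = ["accuracy"]

    atk.attack()                                 # should run end to end without raising
    logs = atk.get_criterion_logs()
    assert "accuracy" in logs                    # check the interface contract, not exact values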
Thanks! Will do |