This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Contextualized bias mitigation #5176

Merged
merged 58 commits into from
Jun 2, 2021
Changes from 47 commits
79c6c33
added linear and hard debiasers
Apr 13, 2021
e23057c
worked on documentation
Apr 14, 2021
fcc3d34
committing changes before branch switch
Apr 14, 2021
7d00910
committing changes before switching branch
Apr 15, 2021
668a513
finished bias direction, linear and hard debiasers, need to write tests
Apr 15, 2021
91029ef
finished bias direction test
Apr 15, 2021
396b245
Commiting changes before switching branch
Apr 16, 2021
a8c22a1
finished hard and linear debiasers
Apr 16, 2021
ef6a062
finished OSCaR
Apr 17, 2021
2c873cb
bias mitigators tests and bias metrics remaining
Apr 17, 2021
d97a526
added bias mitigator tests
Apr 18, 2021
8460281
added bias mitigator tests
Apr 18, 2021
5a76922
finished tests for bias mitigation methods
Apr 19, 2021
85cb107
Merge remote-tracking branch 'origin/main' into arjuns/post-processin…
Apr 19, 2021
8e55f28
fixed gpu issues
Apr 19, 2021
b42b73a
fixed gpu issues
Apr 19, 2021
37d8e33
fixed gpu issues
Apr 20, 2021
31b1d2c
resolve issue with count_nonzero not being differentiable
Apr 20, 2021
a1f4f2a
merged main into post-processing-debiasing
Apr 21, 2021
36cebe3
added more references
Apr 21, 2021
88c083b
Merge branch 'main' of https://github.com/allenai/allennlp into arjun…
Apr 28, 2021
86081ee
fairness during finetuning
Apr 29, 2021
ae592d8
finished bias mitigator wrapper
May 5, 2021
2501b8c
added reference
May 5, 2021
f664dfb
updated CHANGELOG and fixed minor docs issues
May 5, 2021
595449d
move id tensors to embedding device
May 5, 2021
dc4793f
Merge branch 'main' into arjuns/contextualized-bias-mitigation
ArjunSubramonian May 6, 2021
0cdcf89
fixed to use predetermined bias direction
May 6, 2021
f254128
fixed minor doc errors
May 6, 2021
1be00c8
snli reader registration issue
May 6, 2021
a6c9bf6
fixed _pretrained from params issue
May 6, 2021
6624680
fixed device issues
May 6, 2021
90a372e
evaluate bias mitigation initial commit
May 9, 2021
c6a2dbf
finished evaluate bias mitigation
May 10, 2021
7797659
handles multiline prediction files
May 10, 2021
bbfddd7
fixed minor bugs
May 11, 2021
f2f3fc3
fixed minor bugs
May 11, 2021
4e79de7
improved prediction diff JSON format
May 11, 2021
5dae69f
merged main
May 11, 2021
254676f
forgot to resolve a conflict
May 11, 2021
26d8dff
Merge branch 'main' of https://github.com/allenai/allennlp into arjun…
May 13, 2021
1ae5e99
Refactored evaluate bias mitigation to use NLI metric
May 13, 2021
e2cc38e
Added SNLIPredictionsDiff class
May 17, 2021
c34cf31
ensured dataloader is same for bias mitigated and baseline models
May 17, 2021
fdb9ea7
finished evaluate bias mitigation
May 18, 2021
3efffd2
Merge branch 'main' into arjuns/contextualized-bias-mitigation
AkshitaB May 18, 2021
c47de58
Update CHANGELOG.md
AkshitaB May 18, 2021
2b8cf09
Merge branch 'main' of https://github.com/allenai/allennlp into arjun…
May 20, 2021
33d6267
Replaced local data files with github raw content links
May 20, 2021
ec53a05
Update allennlp/fairness/bias_mitigator_applicator.py
ArjunSubramonian May 25, 2021
4afb7f2
deleted evaluate_bias_mitigation from git tracking
May 26, 2021
21bed9d
removed evaluate-bias-mitigation instances from rest of repo
May 26, 2021
fefcbad
Merge branch 'arjuns/contextualized-bias-mitigation' of https://githu…
May 26, 2021
972ea60
addressed Akshita's comments
May 26, 2021
b4011cb
moved bias mitigator applicator test to allennlp-models
Jun 2, 2021
4d7fffb
Merge branch 'main' into arjuns/contextualized-bias-mitigation
AkshitaB Jun 2, 2021
22a5964
removed unnecessary files
Jun 2, 2021
bd727dd
Merge branch 'main' into arjuns/contextualized-bias-mitigation
ArjunSubramonian Jun 2, 2021
9 changes: 6 additions & 3 deletions CHANGELOG.md
@@ -20,6 +20,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- Added `TaskSuite` base class and command line functionality for running [`checklist`](https://github.com/marcotcr/checklist) test suites, along with implementations for `SentimentAnalysisSuite`, `QuestionAnsweringSuite`, and `TextualEntailmentSuite`. These can be found in the `allennlp.confidence_checks.task_checklists` module.
- Added `BiasMitigatorApplicator`, which wraps any Model and mitigates biases by finetuning
on a downstream task.
- Added `EvaluateBiasMitigation`, which evaluates the effectiveness of bias mitigation by computing
SNLI-related metrics for a bias-mitigated and baseline model.
- Added `allennlp diff` command to compute a diff on model checkpoints, analogous to what `git diff` does on two files.
- Added `nn.util.distributed_device()` helper function.
- Added `allennlp.nn.util.load_state_dict` helper function.
@@ -36,7 +40,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- When `PretrainedTransformerIndexer` folds long sequences, it no longer loses the information from token type ids.
- Fixed documentation for `GradientDescentTrainer.cuda_device`.


## [v2.4.0](https://github.com/allenai/allennlp/releases/tag/v2.4.0) - 2021-04-22

### Added
@@ -62,8 +65,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add new dimension to the `interpret` module: influence functions via the `InfluenceInterpreter` base class, along with a concrete implementation: `SimpleInfluence`.
- Added a `quiet` parameter to the `MultiProcessDataLoading` that disables `Tqdm` progress bars.
- The test for distributed metrics now takes a parameter specifying how often you want to run it.
-- Created the fairness module and added four fairness metrics: `Independence`, `Separation`, and `Sufficiency`.
-- Added three bias metrics to the fairness module: `WordEmbeddingAssociationTest`, `EmbeddingCoherenceTest`, `NaturalLanguageInference`, and `AssociationWithoutGroundTruth`.
+- Created the fairness module and added three fairness metrics: `Independence`, `Separation`, and `Sufficiency`.
+- Added four bias metrics to the fairness module: `WordEmbeddingAssociationTest`, `EmbeddingCoherenceTest`, `NaturalLanguageInference`, and `AssociationWithoutGroundTruth`.
- Added four bias direction methods (`PCABiasDirection`, `PairedPCABiasDirection`, `TwoMeansBiasDirection`, `ClassificationNormalBiasDirection`) and four bias mitigation methods (`LinearBiasMitigator`, `HardBiasMitigator`, `INLPBiasMitigator`, `OSCaRBiasMitigator`).

### Changed
1 change: 1 addition & 0 deletions allennlp/commands/__init__.py
@@ -20,6 +20,7 @@
from allennlp.common.plugins import import_plugins
from allennlp.common.util import import_module_and_submodules
from allennlp.commands.checklist import CheckList
from allennlp.fairness.evaluate_bias_mitigation import EvaluateBiasMitigation
Contributor

Are we including this?

Contributor Author

Sorry, we are not!


logger = logging.getLogger(__name__)

1 change: 1 addition & 0 deletions allennlp/data/dataset_readers/__init__.py
@@ -19,3 +19,4 @@
from allennlp.data.dataset_readers.sequence_tagging import SequenceTaggingDatasetReader
from allennlp.data.dataset_readers.sharded_dataset_reader import ShardedDatasetReader
from allennlp.data.dataset_readers.text_classification_json import TextClassificationJsonReader
from allennlp.data.dataset_readers.snli import SnliReader
Contributor

This is present in allennlp-models right? Do we need to have it here?

Contributor Author

yes, it's used in the test, but the environment in which the test is run doesn't have allennlp-models.

122 changes: 122 additions & 0 deletions allennlp/data/dataset_readers/snli.py
@@ -0,0 +1,122 @@
from typing import Dict, Optional
Member

Was this just copied over from allennlp-models?

Contributor Author

yes, it's used in the test, but the environment in which the test is run doesn't have allennlp-models.

Contributor

It's fine to have some tests in allennlp-models for this code. For instance, we have some checklist tests there, because that's where the models are.

Contributor Author

I added the test under allennlp-models.

Contributor

We can remove this file now, right?

import json
import logging

from overrides import overrides

from allennlp.common.file_utils import cached_path
from allennlp.data.dataset_readers.dataset_reader import DatasetReader
from allennlp.data.fields import Field, TextField, LabelField, MetadataField
from allennlp.data.instance import Instance
from allennlp.data.token_indexers import SingleIdTokenIndexer, TokenIndexer
from allennlp.data.tokenizers import Tokenizer, SpacyTokenizer, PretrainedTransformerTokenizer

logger = logging.getLogger(__name__)


def maybe_collapse_label(label: str, collapse: bool):
    """
    Helper function that optionally collapses the "contradiction" and "neutral" labels
    into "non-entailment".
    """
    assert label in ["contradiction", "neutral", "entailment"]
    if collapse and label in ["contradiction", "neutral"]:
        return "non-entailment"
    return label


@DatasetReader.register("snli_for_bias")
class SnliReader(DatasetReader):
    """
    Reads a file from the Stanford Natural Language Inference (SNLI) dataset. This data is
    formatted as jsonl, one json-formatted instance per line. The keys in the data are
    "gold_label", "sentence1", and "sentence2". We convert these keys into fields named "label",
    "premise" and "hypothesis", along with a metadata field containing the tokenized strings of the
    premise and hypothesis.

    Registered as a `DatasetReader` with name "snli_for_bias".

    # Parameters

    tokenizer : `Tokenizer`, optional (default=`SpacyTokenizer()`)
        We use this `Tokenizer` for both the premise and the hypothesis. See :class:`Tokenizer`.
    token_indexers : `Dict[str, TokenIndexer]`, optional (default=`{"tokens": SingleIdTokenIndexer()}`)
        We similarly use this for both the premise and the hypothesis. See :class:`TokenIndexer`.
    combine_input_fields : `bool`, optional
            (default=`isinstance(tokenizer, PretrainedTransformerTokenizer)`)
        If False, represent the premise and the hypothesis as separate fields in the instance.
        If True, tokenize them together using `tokenizer.tokenize_sentence_pair()`
        and provide a single `tokens` field in the instance.
    collapse_labels : `bool`, optional (default=`False`)
        If `True`, the "neutral" and "contradiction" labels will be collapsed into "non-entailment";
        "entailment" will be left unchanged.
    """

    def __init__(
        self,
        tokenizer: Optional[Tokenizer] = None,
        token_indexers: Dict[str, TokenIndexer] = None,
        combine_input_fields: Optional[bool] = None,
        collapse_labels: bool = False,
        **kwargs,
    ) -> None:
        super().__init__(
            manual_distributed_sharding=True, manual_multiprocess_sharding=True, **kwargs
        )
        self._tokenizer = tokenizer or SpacyTokenizer()
        if isinstance(self._tokenizer, PretrainedTransformerTokenizer):
            assert not self._tokenizer._add_special_tokens
        self._token_indexers = token_indexers or {"tokens": SingleIdTokenIndexer()}
        if combine_input_fields is not None:
            self._combine_input_fields = combine_input_fields
        else:
            self._combine_input_fields = isinstance(self._tokenizer, PretrainedTransformerTokenizer)
        self.collapse_labels = collapse_labels

    @overrides
    def _read(self, file_path: str):
        # if `file_path` is a URL, redirect to the cache
        file_path = cached_path(file_path)
        with open(file_path, "r") as snli_file:
            example_iter = (json.loads(line) for line in snli_file)
            filtered_example_iter = (
                example for example in example_iter if example.get("gold_label") != "-"
            )
            for example in self.shard_iterable(filtered_example_iter):
                label = example.get("gold_label")
                premise = example["sentence1"]
                hypothesis = example["sentence2"]
                yield self.text_to_instance(premise, hypothesis, label)

    @overrides
    def text_to_instance(self, premise, hypothesis, label: str = None) -> Instance:  # type: ignore

        fields: Dict[str, Field] = {}
        premise = self._tokenizer.tokenize(premise)
        hypothesis = self._tokenizer.tokenize(hypothesis)

        if self._combine_input_fields:
            tokens = self._tokenizer.add_special_tokens(premise, hypothesis)
            fields["tokens"] = TextField(tokens)
        else:
            premise_tokens = self._tokenizer.add_special_tokens(premise)
            hypothesis_tokens = self._tokenizer.add_special_tokens(hypothesis)
            fields["premise"] = TextField(premise_tokens)
            fields["hypothesis"] = TextField(hypothesis_tokens)

            metadata = {
                "premise_tokens": [x.text for x in premise_tokens],
                "hypothesis_tokens": [x.text for x in hypothesis_tokens],
            }
            fields["metadata"] = MetadataField(metadata)

        if label:
            maybe_collapsed_label = maybe_collapse_label(label, self.collapse_labels)
            fields["label"] = LabelField(maybe_collapsed_label)

        return Instance(fields)

    @overrides
    def apply_token_indexers(self, instance: Instance):
        if "tokens" in instance.fields:
            instance.fields["tokens"]._token_indexers = self._token_indexers  # type: ignore
        else:
            instance.fields["premise"]._token_indexers = self._token_indexers  # type: ignore
            instance.fields["hypothesis"]._token_indexers = self._token_indexers  # type: ignore
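
Editor's note: the filtering and label-collapsing behaviour of the reader above can be tried without `allennlp` installed. The following is a minimal standard-library sketch, not code from this PR. The names `collapse_label` and `iter_snli_examples` and the sample sentences are hypothetical, introduced only for illustration. It assumes the jsonl layout described in the docstring (keys "gold_label", "sentence1", "sentence2") and leaves out tokenization, sharding, and `Instance` construction entirely.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

LABELS = ("contradiction", "neutral", "entailment")


def collapse_label(label: str, collapse: bool) -> str:
    """Hypothetical stand-in for `maybe_collapse_label` in the diff above."""
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    if collapse and label in ("contradiction", "neutral"):
        return "non-entailment"
    return label


def iter_snli_examples(
    lines: Iterable[str], collapse: bool = False
) -> Iterator[Tuple[str, str, Optional[str]]]:
    """Yield (premise, hypothesis, label) tuples from SNLI-style JSON lines."""
    for line in lines:
        example = json.loads(line)
        label = example.get("gold_label")
        if label == "-":
            # The reader filters out examples whose gold label is "-".
            continue
        if label is not None:
            label = collapse_label(label, collapse)
        yield example["sentence1"], example["sentence2"], label


# Made-up sample data in the SNLI jsonl layout (not taken from the dataset).
sample_lines = [
    json.dumps(
        {"gold_label": "neutral", "sentence1": "A man reads.", "sentence2": "A man reads a novel."}
    ),
    json.dumps(
        {"gold_label": "-", "sentence1": "A dog runs.", "sentence2": "A dog sleeps."}
    ),
    json.dumps(
        {"gold_label": "entailment", "sentence1": "Two kids play.", "sentence2": "Children play."}
    ),
]

examples = list(iter_snli_examples(sample_lines, collapse=True))
for premise, hypothesis, label in examples:
    print(f"{label}: {premise} -> {hypothesis}")
```

With `collapse=True`, the "neutral" example comes out as "non-entailment", the "-" example is dropped, and "entailment" is left unchanged, which matches the `collapse_labels` description in the docstring.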
17 changes: 16 additions & 1 deletion allennlp/fairness/__init__.py
@@ -3,7 +3,8 @@

1. measure the fairness of models according to multiple definitions of fairness
2. measure bias amplification
-3. debias embeddings during training time and post-processing
+3. mitigate bias in static and contextualized embeddings during training time and
+post-processing
"""

from allennlp.fairness.fairness_metrics import Independence, Separation, Sufficiency
@@ -25,3 +26,17 @@
INLPBiasMitigator,
OSCaRBiasMitigator,
)
from allennlp.fairness.bias_utils import load_words, load_word_pairs
from allennlp.fairness.bias_mitigator_applicator import BiasMitigatorApplicator
from allennlp.fairness.bias_mitigator_wrappers import (
HardBiasMitigatorWrapper,
LinearBiasMitigatorWrapper,
INLPBiasMitigatorWrapper,
OSCaRBiasMitigatorWrapper,
)
from allennlp.fairness.bias_direction_wrappers import (
PCABiasDirectionWrapper,
PairedPCABiasDirectionWrapper,
TwoMeansBiasDirectionWrapper,
ClassificationNormalBiasDirectionWrapper,
)