Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain

Shortcut learning occurs when a deep neural network overly relies on spurious correlations in the training dataset in order to solve downstream tasks. Prior works have shown how this impairs the compositional generalization capability of deep learning models. To address this problem, we propose a novel approach to mitigate shortcut learning in uncontrolled target domains. Our approach extends the training set with an additional dataset (the source domain), which is specifically designed to facilitate learning independent representations of basic visual factors. We benchmark our idea on generated target domains where we explicitly control shortcut opportunities as well as real-world target domains. Furthermore, we analyze the effect of different specifications of the source domain and the network architecture on compositional generalization. Our main finding is that leveraging data from a source domain is an effective way to mitigate shortcut learning. By promoting independence across different factors of variation in the learned representations, networks can learn to consider only predictive factors and ignore potential shortcut factors during inference.
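To make the notion of a shortcut concrete, the toy sketch below measures how well a single non-predictive factor, here color, would predict the label if a model relied on it alone. It is not part of the sourcegen code base: the data, the factor names, and the helpers `fit_shortcut_rule` and `shortcut_accuracy` are illustrative assumptions. In a training set where color and class are perfectly correlated, the color-only rule fits every training label, but it fails on samples that recombine colors and classes.

```python
from collections import Counter, defaultdict

# Toy samples as (color, animal) pairs; hypothetical data for illustration only.
train = [("red", "cat"), ("red", "cat"), ("blue", "dog"), ("blue", "dog")]
test = [("blue", "cat"), ("red", "dog"), ("red", "cat"), ("blue", "dog")]


def fit_shortcut_rule(samples):
    """Map each color to the label most frequently seen with it."""
    counts = defaultdict(Counter)
    for color, label in samples:
        counts[color][label] += 1
    return {color: c.most_common(1)[0][0] for color, c in counts.items()}


def shortcut_accuracy(rule, samples):
    """Accuracy obtained when predicting the label from color alone."""
    hits = sum(rule.get(color) == label for color, label in samples)
    return hits / len(samples)


rule = fit_shortcut_rule(train)
train_acc = shortcut_accuracy(rule, train)  # color alone explains the training labels
test_acc = shortcut_accuracy(rule, test)    # but not the recombined test samples
print(train_acc, test_acc)
```

A network that latches onto such a rule looks accurate during training yet fails to generalize compositionally, which is the failure mode this work targets.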

For more information about this work, please read our ECCV 2022 paper:

Saranrittichai, P., Mummadi, C., Blaiotta, C., Munoz, M., & Fischer, V. (2022). Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain. In Proceedings of the European Conference on Computer Vision (ECCV).

Installation

We recommend first setting up a Python environment from the provided environment.yaml and then installing the package:

conda env create -f environment.yaml
source activate sourcegen
pip install -e .
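
After installation, a quick check that the package is visible to the active interpreter can save time before launching a study. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is an assumption introduced for illustration, and `sourcegen` is the package name used by the study commands in this README.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be located by the interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Expected to report True inside the activated environment after `pip install -e .`.
    print("sourcegen importable:", is_installed("sourcegen"))
```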

Navigate to data/diagvibsix and follow the instructions in diagvib_setup_instruction.txt to prepare the data for the DiagViB-6 framework. This work uses a customized version of DiagViB-6; the official DiagViB-6 release is available separately.

Run Studies

We provide Python scripts to run studies on the color animal dataset with FactorSRC variations. For the fully correlated setup, run the study with:

python -m sourcegen.studies.run_study_fully_correlated

Similarly, for the semi-correlated setup, run the study with:

python -m sourcegen.studies.run_study_semi_correlated
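
To run both studies back to back, the two commands can be wrapped in a small driver. The following is a sketch under assumptions rather than a script shipped with the repository: the helper names `build_command` and `run_all` are hypothetical, and only the two module paths are taken from the commands in this section.

```python
import subprocess
import sys
from typing import List

# Module paths copied from the study commands in this README.
STUDY_MODULES = [
    "sourcegen.studies.run_study_fully_correlated",
    "sourcegen.studies.run_study_semi_correlated",
]


def build_command(module: str) -> List[str]:
    """Build the `python -m <module>` command for the current interpreter."""
    return [sys.executable, "-m", module]


def run_all(modules: List[str] = STUDY_MODULES) -> None:
    """Run each study in turn; check=True raises on the first failing study."""
    for module in modules:
        subprocess.run(build_command(module), check=True)


# run_all()  # uncomment inside the activated sourcegen environment
```

Using `sys.executable` keeps the studies in the same environment as the driver, so the call does not depend on which `python` is first on the PATH.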

Questions and Reference

Please contact Piyapat Saranrittichai or Volker Fischer with any questions about our work. If it benefits your research, please cite it:

@InProceedings{Saranrittichai_2022_ECCV,
    author    = {Saranrittichai, Piyapat and Mummadi, Chaithanya Kumar and Blaiotta, Claudia and Munoz, Mauricio and Fischer, Volker},
    title     = {Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain},
    booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
    month     = {October},
    year      = {2022}
}
