dsouzadaniel/long_tail

A Tale Of Two Long Tails

Official :octocat: Codebase for A Tale Of Two Long Tails

If you use this software, please consider citing:

  title={A Tale Of Two Long Tails},
  author={D'souza, Daniel and Nussbaum, Zach and Agarwal, Chirag and Hooker, Sara},
  journal={arXiv preprint arXiv:2107.13098},
  year={2021}}

Figure: Long Tail Examples

🚨 tldr: Examples of atypical and noisy error. The former is reducible with the introduction of information; the latter is not! 🚨

Setup

This repository is built using PyTorch :fire:. Install the required libraries by running pip install -r requirements.txt

Datasets

  1. Download the CIFAR-10/CIFAR-100 LongTail datasets
  2. Unzip the downloaded files into a folder named "datasets" in the repository's main directory

Usage

The scripts for training CIFAR-10 and CIFAR-100 models on all datasets are train_c10.py and train_c100.py, respectively.

Training

  1. Set the variable MSP_AUG_PCT to a value in the open interval (0, 1). This controls what fraction of the dataset is augmented based on the MSP. The default is 0.2 (Targeted Augment variant).

  2. Set the variable TRAIN_DATASET to one of 'cifar10' (Original), 'N20_A20_T60' (C-Score), or 'N20_A20_TX2' (Frequency).

  3. Run python train_c10.py to train CIFAR-10 models
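
The two settings above can be sanity-checked before launching a run. The following is a minimal sketch, not code from this repository: the helper name check_settings and the mapping ALLOWED_TRAIN_DATASETS are hypothetical, and the training scripts may read or validate these variables differently.

```python
# Hypothetical helper (not part of the repository): validates the two
# settings described in the Training steps before a run is started.

# Documented TRAIN_DATASET values and the variant each one refers to.
ALLOWED_TRAIN_DATASETS = {
    "cifar10": "Original",
    "N20_A20_T60": "C-Score",
    "N20_A20_TX2": "Frequency",
}


def check_settings(msp_aug_pct, train_dataset):
    """Return the variant name, or raise ValueError for undocumented values."""
    # MSP_AUG_PCT is documented as a value in the open interval (0, 1).
    if not 0.0 < msp_aug_pct < 1.0:
        raise ValueError(
            f"MSP_AUG_PCT must lie strictly between 0 and 1, got {msp_aug_pct}"
        )
    if train_dataset not in ALLOWED_TRAIN_DATASETS:
        raise ValueError(
            f"TRAIN_DATASET must be one of {sorted(ALLOWED_TRAIN_DATASETS)}, "
            f"got {train_dataset!r}"
        )
    return ALLOWED_TRAIN_DATASETS[train_dataset]


# Example: the documented default fraction with the C-Score dataset.
MSP_AUG_PCT = 0.2
TRAIN_DATASET = "N20_A20_T60"
variant = check_settings(MSP_AUG_PCT, TRAIN_DATASET)
print(f"Training on {TRAIN_DATASET} ({variant}), augmenting {MSP_AUG_PCT:.0%}")
```

Failing early on a mistyped dataset name is cheaper than discovering it after data loading has started.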

The above steps can be repeated for CIFAR-100 by using train_c100.py
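
To illustrate what a setting like MSP_AUG_PCT could control, here is a rough sketch of selecting a fraction of examples by a per-example score. It rests on two assumptions that this README does not confirm: that MSP stands for maximum softmax probability, and that the lowest-scoring examples are the ones chosen for augmentation. The function names are hypothetical and the repository's actual selection logic may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_for_augmentation(all_logits, msp_aug_pct):
    """Return sorted indices of the msp_aug_pct fraction with the lowest MSP.

    ASSUMPTION: MSP is taken to be the maximum softmax probability, and
    low-MSP (least confident) examples are targeted. Neither is confirmed
    by the README.
    """
    msp = [max(softmax(row)) for row in all_logits]
    n_select = int(msp_aug_pct * len(msp))  # floor of the requested fraction
    by_confidence = sorted(range(len(msp)), key=lambda i: msp[i])
    return sorted(by_confidence[:n_select])


# Toy logits for five examples over three classes (illustrative values only).
logits = [
    [4.0, 0.0, 0.0],  # confident prediction
    [0.1, 0.0, 0.0],  # nearly uniform, least confident
    [2.0, 1.9, 0.0],  # two classes almost tied
    [5.0, 0.0, 0.0],  # very confident prediction
    [1.0, 0.0, 0.0],  # mildly confident
]
selected = select_for_augmentation(logits, 0.2)
print(selected)
```

With five examples and a fraction of 0.2, exactly one index is selected; raising the fraction widens the set of targeted examples.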

Results

CIFAR-10: [results figure]

CIFAR-100: [results figure]

Visualization code will be added shortly.

Licenses

Note that the code in this repository is licensed under the MIT License. Please check the license terms carefully before use.

Questions?

If you have questions or suggestions, please feel free to email us or open a GitHub issue.
