Pearls from Pebbles: Improved Confidence Functions for Auto-labeling

Summary: This work explores optimal confidence functions within the threshold-based auto-labeling (TBAL) framework, aimed at improving coverage and accuracy in labeling workflows.

Keywords: Auto Labeling, Confidence Functions, Active Learning, Selective Classification
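
To make the summary's terms concrete, the following is a minimal sketch of the thresholding step behind threshold-based auto-labeling. It assumes a simplified setup: a model gives each point a confidence score, a threshold is picked on labeled validation data so that the error among points at or above it stays within a tolerance, and unlabeled points that clear the threshold receive the model's predicted label. The function names (pick_threshold, auto_label), the tolerance eps, and the toy arrays are illustrative assumptions, not this repository's API, and the procedure used in the paper may differ in its details.

```python
import numpy as np

def pick_threshold(conf, correct, eps):
    """Lowest threshold t whose validation error among points with conf >= t is <= eps.

    Hypothetical helper for illustration; assumes confidence scores have no ties.
    """
    order = np.argsort(-conf)                # indices sorted by descending confidence
    conf_sorted = conf[order]
    errors = np.cumsum(~correct[order])      # mistakes among the k most confident points
    counts = np.arange(1, len(conf) + 1)
    ok = np.flatnonzero(errors / counts <= eps)
    if ok.size == 0:
        return np.inf                        # no threshold meets the tolerance
    return conf_sorted[ok[-1]]               # largest prefix that stays within eps

def auto_label(conf, preds, t):
    """Mask of auto-labeled points and the labels assigned to them."""
    mask = conf >= t
    return mask, preds[mask]

# Toy validation data: confidence scores and whether the prediction was correct.
val_conf = np.array([0.95, 0.90, 0.80, 0.70, 0.60, 0.50])
val_correct = np.array([True, True, True, False, True, False])

# Toy unlabeled pool: confidence scores and predicted classes.
pool_conf = np.array([0.99, 0.85, 0.75, 0.40])
pool_preds = np.array([3, 1, 7, 2])

t = pick_threshold(val_conf, val_correct, eps=0.1)
mask, labels = auto_label(pool_conf, pool_preds, t)
coverage = mask.mean()                       # fraction of the pool that was auto-labeled
print("threshold:", t, "coverage:", coverage, "labels:", labels)
```

Coverage here is the fraction of the pool that gets auto-labeled. A confidence function that ranks correct predictions above incorrect ones lets a lower threshold meet the same error tolerance, which is what raises coverage.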

Getting Started

First things first, let's create the conda environment as follows.

Environment Setup

Create the Conda environment:

conda env create -f environment.yml

Activate the environment:

conda activate tbal
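
Before running anything, it can help to confirm that the tbal environment is the active one. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the helper names are hypothetical and not part of this repository.

```python
import os

def active_conda_env():
    """Name of the active conda environment, or None if none is active."""
    return os.environ.get("CONDA_DEFAULT_ENV")

def is_env_active(expected="tbal"):
    """True when the active conda environment matches the expected name."""
    return active_conda_env() == expected

if is_env_active():
    print("tbal environment is active")
else:
    print("active environment:", active_conda_env(), "(expected tbal)")
```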

Now let's run some examples.

Running the Code

The code supports two main modes:

1. Basic Run Mode

This mode runs the code with a predefined configuration file, which makes it suitable for initial exploration without hyperparameter tuning. To start:

Navigate to the scripts directory:
```
cd ./scripts
```
Run the MNIST LeNet example:
```
./run_mnist_tbal_eval_full_fixed.sh
```

This script uses a fixed configuration intended for general testing. Configuration files for other models and datasets are in ./configs/calib-exp. For a detailed explanation of our hyperparameter tuning for TBAL, please refer to our paper published at NeurIPS 2024.

2. Hyperparameter Search Mode

This mode reproduces the full hyperparameter search used in the experiments in our paper. Configuration files for the various models and settings can be found under configs/calib-exp/hyp-search/tbal. For MNIST with LeNet, for example:

TBAL Configurations: configs/calib-exp/hyp-search/tbal/mnist_lenet/mnist_lenet_common_fixed.json
Train-time Configurations: mnist_lenet/mnist_lenet_train_fixed.json
Post-hoc Configurations: mnist_lenet/mnist_lenet_post_fixed_std_xent.json
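
To see what a configuration file contains before launching a search, a short sketch like the one below can list its top-level keys. It assumes the script is run from the repository root and that the file is a JSON object; the helper names are hypothetical, and only the TBAL configuration path is used because it is the one given in full above.

```python
import json
from pathlib import Path

def load_config(path):
    """Read one JSON configuration file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def summarize(path):
    """One-line summary of a config file: its name and sorted top-level keys."""
    path = Path(path)
    if not path.is_file():
        return f"{path}: not found"
    cfg = load_config(path)
    keys = sorted(cfg) if isinstance(cfg, dict) else []
    return f"{path.name}: {len(keys)} top-level keys {keys}"

# Path as listed above, relative to the repository root (an assumption).
tbal_config = "configs/calib-exp/hyp-search/tbal/mnist_lenet/mnist_lenet_common_fixed.json"
print(summarize(tbal_config))
```

The train-time and post-hoc files can be passed to summarize in the same way.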

Additional Resources

Scripts for running CIFAR-10-CNN, TinyImageNet-MLP, and 20 Newsgroups-MLP are located in the same scripts directory as the MNIST example.

Compute Resources

Our experiments utilized the following GPUs:

NVIDIA RTX A6000
NVIDIA GeForce RTX 4090

Citation

If you find this work useful, please consider citing our paper:

@article{author2024tbal,
  title={Pearls from Pebbles: Improved Confidence Functions for Auto-labeling},
  author={Harit Vishwakarma and Reid (Yi) Chen and Sui Jiet Tay and Satya Sai Srinath Namburi and Frederic Sala and Ramya Korlakai Vinayak},
  journal={NeurIPS},
  year={2024},
  url={https://arxiv.org/pdf/2404.16188}
}
