
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models

This repository contains the code for the paper "Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models" by Zhengmian Hu and Heng Huang, presented at the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).

Introduction

Watermarking techniques help identify AI-generated content, but watermarked generation must also be fast to be practical. This project studies how unbiased watermarking can be combined with speculative sampling to accelerate the generation of watermarked tokens for large language models, and the trade-off between watermark strength and sampling efficiency that this combination entails.

(Figure: illustration)

  • We propose a two-reweight framework that allows for the integration of unbiased watermarking and speculative sampling techniques while preserving the output distribution.
  • We prove a no-go theorem, demonstrating that it is impossible to simultaneously maintain the highest watermark strength and the highest sampling efficiency within the two-reweight framework when the vocabulary size is greater than 2.
  • We present two practical algorithms that prioritize either watermark strength or sampling efficiency.
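The framework above builds on standard speculative sampling, where a cheap draft model proposes a token and the target model accepts or rejects it in a way that exactly preserves the target distribution. The sketch below shows that standard accept/reject step (in the style of Leviathan et al.); it is an illustration of the underlying mechanism, not the repository's implementation, and the function name and signature are hypothetical.

```python
import numpy as np

def speculative_accept(draft_token, p_target, p_draft, rng):
    """One speculative-sampling step: accept the draft token with
    probability min(1, p_target/p_draft); on rejection, resample from
    the normalized residual max(p_target - p_draft, 0). The output is
    distributed exactly according to p_target."""
    x = draft_token
    if rng.random() < min(1.0, p_target[x] / p_draft[x]):
        return x  # draft token accepted
    # Rejection: resample from the residual distribution.
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(p_target), p=residual))
```

The no-go theorem concerns what happens when both the draft and target distributions must additionally be reweighted for watermarking: within the two-reweight framework, maximal watermark strength and maximal acceptance rate cannot both be retained once the vocabulary has more than two tokens.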

Repository Structure

  • unbiased_watermark/: Implements unbiased reweighting functions that preserve output quality.
  • accuwm/: Contains five language model inference algorithms:
    • No watermark, no acceleration
    • Watermark, no acceleration
    • No watermark, acceleration
    • Acceleration while maintaining watermark strength
    • Watermarking while maintaining speculative sampling efficiency
  • experiments/: Includes experiments from the paper, requiring approximately 1200 A6000 GPU hours.
  • analysis/: Aggregates experimental results into figures and tables presented in the paper.
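An unbiased reweighting function perturbs the next-token distribution using a pseudorandom watermark code, such that averaging over the code recovers the original distribution exactly (so output quality is preserved in expectation). As a minimal sketch of this idea, the delta-reweight below maps the distribution to a point mass chosen by inverse-transform sampling with the watermark code as the uniform variate; this is an illustrative toy, not the code in unbiased_watermark/, and the function name is hypothetical.

```python
import numpy as np

def delta_reweight(p, rng):
    """Reweight distribution p into a point mass on one token, chosen by
    inverse-transform sampling with a uniform watermark code u. Averaged
    over u, the reweighted distribution equals p (unbiasedness)."""
    u = rng.random()                      # watermark code, Uniform[0, 1)
    idx = np.searchsorted(np.cumsum(p), u)  # inverse-transform sample
    q = np.zeros_like(p)
    q[idx] = 1.0                          # point mass on the sampled token
    return q
```

A detector that knows the watermark code can check whether generated tokens fall where the code predicts, while averaging over codes leaves the sampling distribution unchanged.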

Citation

If you find this work useful in your research, please consider citing our paper:

@inproceedings{hu2024inevitable,
  title={Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models},
  author={Hu, Zhengmian and Huang, Heng},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
