The data contamination repository for the LLM-JP project. Forked from @eiei7

llm-jp/llm-jp-data-contamination

LM Contamination Task

This repo is for llm-jp eval-tuning-wg task 9: evaluating data leakage.

Introduction

Oscar Sainz et al. first proposed the idea that a model is contaminated if it is able to generate examples of a dataset. However, recent work shows that this method can be unreliable and prone to failure. S. Golchin & M. Surdeanu (https://arxiv.org/pdf/2311.06233.pdf) argue that such failures can result either from the sparsity introduced by asking the model to reproduce the first instances of a dataset split, or from the inability to bypass the safety filters set by the model provider when the model is asked to generate copyrighted content such as dataset instances.

Osainz has posted the following related work on the Hugging Face community:

  • Time Travel in LLMs: Tracing Data Contamination in Large Language Models (Golchin and Surdeanu, 2023)
  • Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation (Li, 2023)
  • Detecting Pretraining Data from Large Language Models (Shi et al., 2023)
  • Proving Test Set Contamination in Black Box Language Models (Oren et al., 2023)
  • Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models (Golchin and Surdeanu, 2023)
  • Investigating Data Contamination in Modern Benchmarks for Large Language Models (Deng et al., 2023)
  • Rethinking Benchmark and Contamination for Language Models with Rephrased Samples (Yang et al., 2023)

Progress

So far, this repo implements part of the work by S. Golchin & M. Surdeanu (https://arxiv.org/pdf/2311.06233.pdf).
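The results in this README are reported "with guide" and "without guide", which appears to correspond to the guided and general instructions described in that paper: the guided prompt names the dataset and split, while the general prompt asks for the same completion without that metadata. The following is a minimal sketch of how such a prompt pair could be built. The `Instance` class, the `build_prompt` function, the prompt wording, and the example sentence are all hypothetical and are not taken from this repo or from the dataset.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One dataset example split into a visible prefix and a hidden reference suffix."""
    first_piece: str
    second_piece: str  # reference continuation, never shown to the model


def build_prompt(instance: Instance, dataset: str = "", split: str = "") -> str:
    """Return a guided prompt when dataset and split are given, else a general one.

    The instruction wording is an assumption made for illustration only.
    """
    if dataset and split:
        # Guided instruction: explicitly names the dataset and the split.
        instruction = (
            f"You are given the first piece of an instance from the {split} split "
            f"of the {dataset} dataset. Finish the second piece of the instance "
            "exactly as it appears in the dataset."
        )
    else:
        # General instruction: same completion task, no dataset metadata.
        instruction = (
            "Finish the second piece based on the first piece, "
            "so that the two pieces form a single instance."
        )
    return f"{instruction}\n\nFirst piece: {instance.first_piece}\nSecond piece:"


# Placeholder text, not a real WNLI example.
example = Instance(
    first_piece="PLACEHOLDER first sentence",
    second_piece="PLACEHOLDER second sentence",
)
guided = build_prompt(example, dataset="WNLI", split="train")
general = build_prompt(example)
```

Both prompts end at the same point, so the two completions can be scored against the same reference (`second_piece`) and the scores compared.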

Experiment Results

WNLI

GPT3.5

BLEURT:

  • with guide 0.5124241530895233
  • without guide 0.22064677874247232

ROUGE-L:

  • with guide 0.34238831625188737
  • without guide 0.09239756877931599

GPT4

BLEURT:

  • with guide 0.49290904998779295
  • without guide 0.46190741956233977

ROUGE-L:

  • with guide 0.32426375556561493
  • without guide 0.2879418270645807
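For readers unfamiliar with the overlap metric, here is a minimal sketch of a ROUGE-L F1 score based on the longest common subsequence (LCS) of whitespace tokens, averaged over a set of completions. It is an illustration only: the function names are hypothetical, the tokenization is a simplifying assumption, and packaged ROUGE implementations apply their own tokenization and options, so this sketch is not expected to reproduce the numbers above. BLEURT is a learned metric that needs a model checkpoint and is not sketched here.

```python
from __future__ import annotations


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over lowercased whitespace tokens (simplified tokenization)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_rouge_l(references: list[str], completions: list[str]) -> float:
    """Average ROUGE-L F1 over paired references and model completions."""
    scores = [rouge_l_f1(r, c) for r, c in zip(references, completions)]
    return sum(scores) / len(scores) if scores else 0.0
```

Calling `mean_rouge_l` once on the guided completions and once on the unguided ones, against the same references, gives a "with guide" / "without guide" pair of the kind listed above.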
