Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges
This repository contains the dataset and code of the paper:
Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges
COLING 2025
Our evaluation data are released in the data folder. These files are processed versions of publicly available online data.
All of the code used to run our evaluations is provided in data. Both the WPQ and the Local Order Quiz are run using this, while the Token Overlap method is run here, the Canonical Order method is run here, and Min-K% is run here.
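For readers unfamiliar with Min-K%, the following is a minimal sketch of the scoring idea as it is commonly described: average the log-probabilities of the k% least likely tokens in a text, where a higher (less negative) average is usually taken as a hint that the text was seen during training. This is not the repository's implementation. The function name `min_k_percent_score`, the default `k=0.2`, and the rounding rule (ceiling, with at least one token) are assumptions made for illustration, and the per-token log-probabilities are assumed to have been obtained from a model beforehand.

```python
import math
from typing import Sequence


def min_k_percent_score(token_logprobs: Sequence[float], k: float = 0.2) -> float:
    """Average log-probability of the k% lowest-probability tokens.

    Hypothetical helper for illustration only; the rounding rule
    (ceil, at least one token) is an assumption, not the paper's choice.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    if not 0 < k <= 1:
        raise ValueError("k must be in the interval (0, 1]")

    # Number of tokens to keep: k% of the sequence, but never fewer than one.
    n_lowest = max(1, math.ceil(len(token_logprobs) * k))

    # Sort ascending so the least likely tokens come first.
    lowest = sorted(token_logprobs)[:n_lowest]
    return sum(lowest) / n_lowest


# Example with made-up log-probabilities: the two lowest values are -5.0 and -3.0.
example_score = min_k_percent_score([-0.1, -3.0, -0.2, -5.0], k=0.5)
print(example_score)  # -4.0
```

In practice the score would be compared against a threshold chosen on held-out data; the threshold itself is not specified here.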
The following is an example of setting up the environment:
# Environment setup
conda create -n contamination python=3.9 -y
conda activate contamination
# Install dependencies
pip install -r requirements.txt