Code for the paper: TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Preprint on arXiv: https://arxiv.org/abs/2407.21630v1
TAROT models are available on 🤗 Hugging Face.
Clone the repository locally:
git clone https://github.com/hornetsecurity/tarot
cd tarot
Install the requirements using Poetry:
pip install poetry
poetry install
- All datasets are hosted on 🤗 Hugging Face Datasets.
- Evaluation datasets are loaded and preprocessed in tarot/utils.py.
- We use the Yelp review dataset to train the generation models.
Experimental scripts can be found in the scripts folder:
- ppo-dpo.sh: PPO and DPO policy-optimization fine-tuning on the Yelp dataset.
- imdb.sh, bac.sh, amt.sh: for each dataset, create an authorship classifier and a utility classifier for evaluation, obfuscate the dataset, and evaluate the generated text.
Running the evaluation scripts produces the following folder structure:
tarot
├── imdb-10
│ ├── authorship_checkpoint
│ ├── deberta-v3-authorship-imdb-10
│ ├── deberta-v3-utility-imdb-10
│ ├── utility_checkpoint
│ ├── imdb-10-test-DPO.csv
│ └── imdb-10-test-PPO.csv
├── imdb-20
├── src
├── LICENSE
├── pyproject.toml
└── README.md
Here, deberta-v3-authorship-imdb-10 and deberta-v3-utility-imdb-10 are the authorship attribution classifier and the utility classifier, respectively. imdb-10-test-DPO.csv and imdb-10-test-PPO.csv are the obfuscated datasets produced with DPO and PPO, and authorship_checkpoint and utility_checkpoint hold the evaluation classifier checkpoints.
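To confirm that an evaluation run finished, it can help to check for the files and folders listed in the tree. The following is a minimal sketch, not part of the repository: the helper names expected_artifacts and missing_artifacts are hypothetical, and the naming pattern is only inferred from the imdb-10 example, so it may not hold for every dataset.

```python
from pathlib import Path


def expected_artifacts(root: Path, dataset: str) -> dict[str, Path]:
    """Map a short label to each path an evaluation run is expected to create.

    Assumption: every dataset follows the naming pattern shown for imdb-10.
    """
    run_dir = root / dataset
    return {
        "authorship_classifier": run_dir / f"deberta-v3-authorship-{dataset}",
        "utility_classifier": run_dir / f"deberta-v3-utility-{dataset}",
        "authorship_checkpoint": run_dir / "authorship_checkpoint",
        "utility_checkpoint": run_dir / "utility_checkpoint",
        "obfuscated_dpo": run_dir / f"{dataset}-test-DPO.csv",
        "obfuscated_ppo": run_dir / f"{dataset}-test-PPO.csv",
    }


def missing_artifacts(root: Path, dataset: str) -> list[str]:
    """Return the labels of expected outputs that do not exist on disk."""
    artifacts = expected_artifacts(root, dataset)
    return [label for label, path in artifacts.items() if not path.exists()]


if __name__ == "__main__":
    # Run from the repository root after the imdb evaluation script.
    print(missing_artifacts(Path("."), "imdb-10"))
```

An empty list means every expected output for that dataset is present; any label in the list points to a step that did not complete or that wrote its output under a different name.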
@misc{loiseau2024tarot,
title={TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods},
author={Gabriel Loiseau and Damien Sileo and Damien Riquet and Maxime Meyer and Marc Tommasi},
year={2024},
eprint={2407.21630},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.21630},
}