How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective (EMNLP 2024)
Install the dependencies:

pip install -r requirements.txt
To generate data, run:

bash scripts/generate.sh

You only need to run this script when you generate the data yourself with generation.py.
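If you prefer to invoke the generation step directly, the line below is a minimal sketch; the script location and the flag names (--model_name_or_path, --output_dir) are assumptions for illustration, so check generation.py for the actual arguments.

# Hypothetical direct call; verify the real arguments in generation.py.
python gsil/generation.py --model_name_or_path /path/of/your/sft_model --output_dir /path/of/your/iter_n_data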
Next, combine the generated data:

python gsil/combine.py --data_dir /path/of/your/iter_n_data

The combined training data will be written to the train_data folder under /path/of/your/iter_n_data.
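For orientation, the sketch below shows the directory layout this step assumes; the shard file names are hypothetical, and only the train_data output folder is stated above.

# Hypothetical layout; shard names are illustrative, not prescribed by the repo.
# /path/of/your/iter_n_data/
# ├── shard_0.json   <- written by scripts/generate.sh
# ├── shard_1.json
# └── train_data/    <- created by gsil/combine.py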
Use the following script to train the model:

bash scripts/finetune
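Putting the steps together, one self-imitation iteration might look like the sketch below; the loop bounds, the per-iteration directory naming, and the idea of chaining the three stages this way are assumptions about the intended workflow, not a documented interface.

# Hypothetical end-to-end loop over iterations; adapt paths to your setup.
for i in 1 2 3; do
  bash scripts/generate.sh                                          # generate data for this iteration
  python gsil/combine.py --data_dir /path/of/your/iter_${i}_data    # build train_data
  bash scripts/finetune                                             # train on the combined data
done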
For our evaluation on the Open LLM Leaderboard, please use the lm-evaluation-harness repository at v0.3.1, which is consistent with the Open LLM Leaderboard. Also note that we set the number of few-shot examples to the values instructed on the Leaderboard.
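To pin the harness to that version, the commands below work; the example run afterwards uses the v0.3.1 interface (main.py with --model hf-causal) and the Leaderboard's 25-shot setting for ARC-Challenge, with the model path and batch size as placeholders.

git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout v0.3.1
pip install -e .

# Example: ARC-Challenge with the Leaderboard's 25-shot setting.
python main.py --model hf-causal --model_args pretrained=/path/to/your/model --tasks arc_challenge --num_fewshot 25 --batch_size 8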
HumanEval: https://github.com/OpenBMB/Eurus?tab=readme-ov-file
MT-Bench: https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge
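For MT-Bench, a typical run with FastChat's llm_judge module looks like the sketch below; the model path and model ID are placeholders, and you should confirm the exact flags against the linked repository.

cd FastChat/fastchat/llm_judge
python gen_model_answer.py --model-path /path/to/your/model --model-id your-model-id
# gen_judgment.py calls the OpenAI API as the judge; set OPENAI_API_KEY first.
python gen_judgment.py --model-list your-model-id
python show_result.py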
If you find our repo useful, please cite our paper:
@inproceedings{xiao2024leverage,
  title={How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective},
  author={Xiao, Teng and Li, Mingxiao and Yuan, Yige and Zhu, Huaisheng and Cui, Chao and Honavar, Vasant},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={13413--13426},
  year={2024}
}