This repository contains the codebase for the paper MoD: A Distribution-Based Approach for Merging Large Language Models.
⭐ The paper was accepted at the LLM Merging Competition at NeurIPS 2024.
Large language models (LLMs) have enabled the development of numerous specialized, task-specific variants. However, the maintenance and deployment of these individual models present substantial challenges in terms of resource utilization and operational efficiency. In this work, we propose the Mixture of Distributions (MoD) framework, a novel approach for merging LLMs that operates directly on their output probability distributions, rather than on model weights. Unlike traditional weight-averaging methods, MoD effectively preserves the specialized capabilities of individual models while enabling efficient knowledge sharing across tasks. Through extensive experimentation on mathematical reasoning benchmarks using Qwen2.5 models, we demonstrate that MoD significantly outperforms existing model merging techniques across multiple benchmarks.
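To make the distribution-level idea concrete, here is a minimal sketch (a simplified illustration of merging output distributions, not the exact MoD algorithm from the paper): two models sharing a vocabulary each produce a next-token distribution, and the merged prediction is a convex mixture of the two. The model ids and the mixing weight `alpha` below are placeholders.

```python
# Simplified illustration of distribution-level merging (not the exact MoD
# algorithm). Model ids and the mixing weight are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-1.5B-Instruct"         # placeholder model
expert_id = "Qwen/Qwen2.5-Math-1.5B-Instruct"  # placeholder model, same vocab
alpha = 0.5                                    # placeholder mixing weight

tokenizer = AutoTokenizer.from_pretrained(base_id)
model_a = AutoModelForCausalLM.from_pretrained(base_id)
model_b = AutoModelForCausalLM.from_pretrained(expert_id)

inputs = tokenizer("2 + 2 =", return_tensors="pt")
with torch.no_grad():
    # Next-token probability distribution from each model.
    p_a = torch.softmax(model_a(**inputs).logits[:, -1, :], dim=-1)
    p_b = torch.softmax(model_b(**inputs).logits[:, -1, :], dim=-1)

# A convex combination of probability distributions is again a valid
# distribution, so the merged predictor remains a proper language model.
p_mixed = alpha * p_a + (1 - alpha) * p_b
print(tokenizer.decode(p_mixed.argmax(dim=-1)))
```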
- configs: Configuration files for running the program and evaluation phases.
- mod: Source code for the `mod` package.
- scripts: Scripts to execute the main MoD program and evaluation phases.
- tests: Test cases to validate functionality before release.
To set up the environment and run the program, follow the steps below:
Create a Python virtual environment and install the required dependencies:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
If gated models or other restricted resources from the Hugging Face Hub are required, configure your API token:

```bash
export HF_AUTH_TOKEN="YOUR_HF_TOKEN"
```
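You can also authenticate programmatically; a minimal sketch using the `huggingface_hub` library (typically already installed as a dependency of `transformers`):

```python
# Programmatic alternative to exporting the token as an environment variable.
from huggingface_hub import login

login(token="YOUR_HF_TOKEN")  # or run `huggingface-cli login` in a shell
```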
After setting up the environment and configurations, run the program with the provided script:

```bash
python ./scripts/mod.py --config ./configs/1.5B/mod_config.yml --out_path qwen2.5-1.5B-mod
```
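The config file specifies which models to merge and how to combine them. As an illustration only, a merge config might look roughly like the sketch below; every key shown is hypothetical, so consult the actual files under ./configs/ for the real schema:

```yaml
# Hypothetical illustration only -- not the actual schema of mod_config.yml.
models:
  - path: Qwen/Qwen2.5-1.5B-Instruct       # placeholder model id
    weight: 0.5                            # placeholder mixing weight
  - path: Qwen/Qwen2.5-Math-1.5B-Instruct  # placeholder model id
    weight: 0.5
dtype: bfloat16
```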
For evaluation, please refer to the instructions in the mod-evaluate repository.
To ensure code integrity, run the test suite before opening a pull request:

```bash
pip install pytest
pytest ./tests/
```
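As a flavor of what such tests can check, here is a hypothetical example (not taken from ./tests/) verifying the basic invariant behind distribution-level merging:

```python
# Hypothetical test sketch: a convex mixture of two softmax distributions
# must itself sum to 1. The real tests live under ./tests/.
import pytest

torch = pytest.importorskip("torch")

def test_mixture_is_valid_distribution():
    p = torch.softmax(torch.randn(10), dim=-1)
    q = torch.softmax(torch.randn(10), dim=-1)
    mixed = 0.6 * p + 0.4 * q
    assert torch.allclose(mixed.sum(), torch.tensor(1.0), atol=1e-6)
```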
If you find this work useful, please cite the competition paper:

```bibtex
@inproceedings{dang2024mod,
  title={MoD: A Distribution-Based Approach for Merging Large Language Models},
  author={Quy-Anh Dang and Chris Ngo},
  booktitle={LLM Merging Competition at NeurIPS 2024},
  year={2024},
  url={https://openreview.net/forum?id=v2tZ9bNcS5}
}
```
or the arXiv preprint:

```bibtex
@misc{dang2024moddistributionbasedapproachmerging,
  title={MoD: A Distribution-Based Approach for Merging Large Language Models},
  author={Quy-Anh Dang and Chris Ngo},
  year={2024},
  eprint={2411.00406},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2411.00406}
}
```