Merge pull request #3596 from hiyouga/dev_doc
Add CLI document
hiyouga authored May 6, 2024
2 parents a34f526 + 047313f commit c8cd00b
Showing 70 changed files with 1,532 additions and 1,079 deletions.
69 changes: 36 additions & 33 deletions README.md
@@ -276,18 +276,19 @@ huggingface-cli login
| ------------ | ------- | --------- |
| python | 3.8 | 3.10 |
| torch | 1.13.1 | 2.2.0 |
-| transformers | 4.37.2 | 4.39.3 |
-| datasets | 2.14.3 | 2.18.0 |
-| accelerate | 0.27.2 | 0.28.0 |
+| transformers | 4.37.2 | 4.40.1 |
+| datasets | 2.14.3 | 2.19.1 |
+| accelerate | 0.27.2 | 0.30.0 |
| peft | 0.9.0 | 0.10.0 |
-| trl | 0.8.1 | 0.8.1 |
+| trl | 0.8.1 | 0.8.6 |

| Optional | Minimum | Recommend |
| ------------ | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.14.0 |
-| bitsandbytes | 0.39.0 | 0.43.0 |
-| flash-attn | 2.3.0 | 2.5.6 |
+| bitsandbytes | 0.39.0 | 0.43.1 |
+| vllm | 0.4.0 | 0.4.2 |
+| flash-attn | 2.3.0 | 2.5.8 |

### Hardware Requirement

@@ -305,24 +306,15 @@ huggingface-cli login

## Getting Started

-### Data Preparation
-
-Please refer to [data/README.md](data/README.md) for checking the details about the format of dataset files. You can either use datasets on HuggingFace / ModelScope hub or load the dataset in local disk.
-
-> [!NOTE]
-> Please update `data/dataset_info.json` to use your custom dataset.
-
-### Dependence Installation
+### Installation

```bash
git clone https://github.com/hiyouga/LLaMA-Factory.git
conda create -n llama_factory python=3.10
conda activate llama_factory
cd LLaMA-Factory
pip install -e .[metrics]
```
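
To sanity-check the installation, the CLI itself can be queried (a sketch; the `version` subcommand is assumed here — `llamafactory-cli help` lists the subcommands actually available):

```bash
# print the installed LLaMA-Factory version (assumed subcommand)
llamafactory-cli version
```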

-Extra dependencies available: deepspeed, metrics, galore, badam, vllm, bitsandbytes, gptq, awq, aqlm, qwen, modelscope, quality
+Extra dependencies available: metrics, deepspeed, bitsandbytes, vllm, galore, badam, gptq, awq, aqlm, qwen, modelscope, quality
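
For example, several extras can be combined in a single editable install (a minimal sketch; pick only the extras your environment needs):

```bash
# evaluation metrics + DeepSpeed + 4-bit quantization support in one install
pip install -e ".[metrics,deepspeed,bitsandbytes]"
```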

<details><summary>For Windows users</summary>

@@ -336,19 +328,41 @@ To enable FlashAttention-2 on the Windows platform, you need to install the prec

</details>

-### Train with LLaMA Board GUI (powered by [Gradio](https://github.com/gradio-app/gradio))
+### Data Preparation

+Please refer to [data/README.md](data/README.md) for details on the format of dataset files. You can either use datasets from the HuggingFace / ModelScope hub or load datasets from local disk.

+> [!NOTE]
+> Please update `data/dataset_info.json` to use your custom dataset.
+
+### Quickstart

+Use the following 3 commands to run LoRA **fine-tuning**, **inference** and **merging** for the Llama3-8B-Instruct model, respectively.

+```bash
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
+```

+See [examples/README.md](examples/README.md) for advanced usage (including distributed training).

+> [!TIP]
+> Use `llamafactory-cli help` to show help information.
+
+### Use LLaMA Board GUI (powered by [Gradio](https://github.com/gradio-app/gradio))

> [!IMPORTANT]
-> LLaMA Board GUI only supports training on a single GPU, please use [CLI](#train-with-command-line-interface) for distributed training.
+> LLaMA Board GUI only supports training on a single GPU.
+
+#### Use local environment

```bash
-llamafactory-cli webui
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli webui
```

> [!TIP]
-> To modify the default setting in the LLaMA Board GUI, you can use environment variables, e.g., `export CUDA_VISIBLE_DEVICES=0 GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 GRADIO_SHARE=False` (use `set` command on Windows OS).
+> To modify the default settings of the LLaMA Board GUI, you can use environment variables, e.g., `export GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 GRADIO_SHARE=False` (use the `set` command on Windows).
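
As a worked example of this tip (a minimal sketch):

```bash
# serve the GUI on all interfaces, port 7860, without a public Gradio share link
export GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 GRADIO_SHARE=False
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webui
```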
<details><summary>For Alibaba Cloud users</summary>

@@ -389,21 +403,10 @@ docker compose -f ./docker-compose.yml up -d

</details>

-### Train with Command Line Interface
-
-See [examples/README.md](examples/README.md) for usage.
-
-> [!TIP]
-> Use `llamafactory-cli train -h` to display arguments description.

### Deploy with OpenAI-style API and vLLM

```bash
-CUDA_VISIBLE_DEVICES=0,1 API_PORT=8000 llamafactory-cli api \
-    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
-    --template llama3 \
-    --infer_backend vllm \
-    --vllm_enforce_eager
+CUDA_VISIBLE_DEVICES=0,1 API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml
```
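
Once the server is running, requests can be sent to the OpenAI-compatible route (a minimal sketch, assuming the standard `/v1/chat/completions` path; the `model` field is a placeholder and may differ in your deployment):

```bash
# send a chat completion request to the local OpenAI-style API
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```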

### Download from ModelScope Hub
71 changes: 37 additions & 34 deletions README_zh.md
@@ -163,7 +163,7 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
| [Yuan](https://huggingface.co/IEITYuan) | 2B/51B/102B | q_proj,v_proj | yuan |

> [!NOTE]
-> **默认模块**应作为 `--lora_target` 参数的默认值,可使用 `--lora_target all` 参数指定全部模块以得到更好的效果。
+> **默认模块**应作为 `--lora_target` 参数的默认值,可使用 `--lora_target all` 参数指定全部模块以取得更好的效果。
>
> 对于所有“基座”(Base)模型,`--template` 参数可以是 `default`, `alpaca`, `vicuna` 等任意值。但“对话”(Instruct/Chat)模型请务必使用**对应的模板**
>
@@ -276,18 +276,19 @@ huggingface-cli login
| ------------ | ------- | --------- |
| python | 3.8 | 3.10 |
| torch | 1.13.1 | 2.2.0 |
-| transformers | 4.37.2 | 4.39.3 |
-| datasets | 2.14.3 | 2.18.0 |
-| accelerate | 0.27.2 | 0.28.0 |
+| transformers | 4.37.2 | 4.40.1 |
+| datasets | 2.14.3 | 2.19.1 |
+| accelerate | 0.27.2 | 0.30.0 |
| peft | 0.9.0 | 0.10.0 |
-| trl | 0.8.1 | 0.8.1 |
+| trl | 0.8.1 | 0.8.6 |

| 可选项 | 至少 | 推荐 |
| ------------ | ------- | --------- |
| CUDA | 11.6 | 12.2 |
| deepspeed | 0.10.0 | 0.14.0 |
-| bitsandbytes | 0.39.0 | 0.43.0 |
-| flash-attn | 2.3.0 | 2.5.6 |
+| bitsandbytes | 0.39.0 | 0.43.1 |
+| vllm | 0.4.0 | 0.4.2 |
+| flash-attn | 2.3.0 | 2.5.8 |

### 硬件依赖

@@ -305,24 +306,15 @@ huggingface-cli login

## 如何使用

-### 数据准备
-
-关于数据集文件的格式,请参考 [data/README_zh.md](data/README_zh.md) 的内容。你可以使用 HuggingFace / ModelScope 上的数据集或加载本地数据集。
-
-> [!NOTE]
-> 使用自定义数据集时,请更新 `data/dataset_info.json` 文件。
-
-### 安装依赖
+### 安装 LLaMA Factory

```bash
git clone https://github.com/hiyouga/LLaMA-Factory.git
conda create -n llama_factory python=3.10
conda activate llama_factory
cd LLaMA-Factory
pip install -e .[metrics]
```

-可选的额外依赖项:deepspeed、metrics、galore、badam、vllm、bitsandbytes、gptq、awq、aqlm、qwen、modelscope、quality
+可选的额外依赖项:metrics、deepspeed、bitsandbytes、vllm、galore、badam、gptq、awq、aqlm、qwen、modelscope、quality

<details><summary>Windows 用户指南</summary>

@@ -336,19 +328,41 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl

</details>

-### 利用 LLaMA Board 可视化界面训练(由 [Gradio](https://github.com/gradio-app/gradio) 驱动)
+### 数据准备

+关于数据集文件的格式,请参考 [data/README_zh.md](data/README_zh.md) 的内容。你可以使用 HuggingFace / ModelScope 上的数据集或加载本地数据集。

+> [!NOTE]
+> 使用自定义数据集时,请更新 `data/dataset_info.json` 文件。
+
+### 快速开始

+下面三行命令分别对 Llama3-8B-Instruct 模型进行 LoRA **微调**、**推理**和**合并**。

+```bash
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
+```

+高级用法请参考 [examples/README_zh.md](examples/README_zh.md)(包括多 GPU 微调)。

+> [!TIP]
+> 使用 `llamafactory-cli help` 显示帮助信息。
+
+### 使用 LLaMA Board 可视化界面(由 [Gradio](https://github.com/gradio-app/gradio) 驱动)

> [!IMPORTANT]
-> LLaMA Board 可视化界面目前仅支持单 GPU 训练,请使用[命令行接口](#利用命令行接口训练)来进行多 GPU 分布式训练。
+> LLaMA Board 可视化界面目前仅支持单 GPU 训练。
+
+#### 使用本地环境

```bash
-llamafactory-cli webui
+CUDA_VISIBLE_DEVICES=0 llamafactory-cli webui
```

> [!TIP]
-> 您可以使用环境变量来修改 LLaMA Board 可视化界面的默认设置,例如 `export CUDA_VISIBLE_DEVICES=0 GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 GRADIO_SHARE=False`(Windows 系统可使用 `set` 指令)。
+> 您可以使用环境变量来修改 LLaMA Board 可视化界面的默认设置,例如 `export GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 GRADIO_SHARE=False`(Windows 系统可使用 `set` 指令)。

<details><summary>阿里云用户指南</summary>

@@ -389,21 +403,10 @@ docker compose -f ./docker-compose.yml up -d

</details>

-### 利用命令行接口训练
-
-使用方法请参考 [examples/README_zh.md](examples/README_zh.md)。
-
-> [!TIP]
-> 您可以执行 `llamafactory-cli train -h` 来查看参数文档。

### 利用 vLLM 部署 OpenAI API

```bash
-CUDA_VISIBLE_DEVICES=0,1 API_PORT=8000 llamafactory-cli api \
-    --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
-    --template llama3 \
-    --infer_backend vllm \
-    --vllm_enforce_eager
+CUDA_VISIBLE_DEVICES=0,1 API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml
```

### 从魔搭社区下载
2 changes: 1 addition & 1 deletion data/dataset_info.json
@@ -17,7 +17,7 @@
},
"identity": {
"file_name": "identity.json",
"file_sha1": "ffe3ecb58ab642da33fbb514d5e6188f1469ad40"
"file_sha1": "0f67e97fd01612006ab3536cdaf6cfb0d1e7f279"
},
"oaast_sft": {
"file_name": "oaast_sft.json",
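Since `file_sha1` records the SHA-1 checksum of the dataset file, it can be refreshed after editing the file (a minimal sketch using coreutils):

```bash
# recompute the checksum to paste into data/dataset_info.json
sha1sum data/identity.json
```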
