Commit

Update doc
kwen2501 committed Jan 23, 2025
1 parent 48b12ba commit b2f7e97
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions docs/source/en/perf_infer_gpu_multi.md
@@ -54,6 +54,13 @@ torchrun --nproc-per-node 4 demo.py

PyTorch tensor parallel is currently supported for the following models:
* [Llama](https://huggingface.co/docs/transformers/model_doc/llama#transformers.LlamaModel)
* [Gemma](https://huggingface.co/docs/transformers/en/model_doc/gemma)
* [Granite](https://huggingface.co/docs/transformers/en/model_doc/granite)
* [Mistral](https://huggingface.co/docs/transformers/en/model_doc/mistral)
* [Qwen2](https://huggingface.co/docs/transformers/en/model_doc/qwen2)
* [Qwen2MoE](https://huggingface.co/docs/transformers/en/model_doc/qwen2_moe)
* [Qwen2-VL](https://huggingface.co/docs/transformers/v4.48.0/en/model_doc/qwen2_vl)
* [Starcoder2](https://huggingface.co/docs/transformers/en/model_doc/starcoder2)

You can request to add tensor parallel support for another model by opening a GitHub Issue or Pull Request.
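
For scripts that need to decide whether to enable tensor parallelism at all, the list above can be turned into a simple lookup. The following is a minimal sketch, not part of the `transformers` API: the helper name `supports_tensor_parallel` and the constant `TP_SUPPORTED_SLUGS` are hypothetical, and the slugs are copied from the documentation links in the list as of this commit, so they will go stale as support is added.

```python
# Documentation slugs taken from the model links listed above (commit b2f7e97).
# Assumption: this is a hand-maintained snapshot, not something transformers exposes.
TP_SUPPORTED_SLUGS = frozenset({
    "llama",
    "gemma",
    "granite",
    "mistral",
    "qwen2",
    "qwen2_moe",
    "qwen2_vl",
    "starcoder2",
})


def supports_tensor_parallel(slug: str) -> bool:
    """Hypothetical helper: report whether a model slug appears in the list above.

    The input is normalized by trimming whitespace, lowercasing, and
    replacing hyphens with underscores before the lookup.
    """
    normalized = slug.strip().lower().replace("-", "_")
    return normalized in TP_SUPPORTED_SLUGS


if __name__ == "__main__":
    for name in ("llama", "Qwen2-VL", "gpt2"):
        print(name, supports_tensor_parallel(name))
```

A launcher script could use the result to decide whether to request a tensor parallel plan when loading the model (for example through a `tp_plan="auto"` argument to `from_pretrained`, which is assumed here rather than shown in this diff) or to fall back to single-device loading.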
