[Chat] Add Chat from TRL 🐈 #35714

Merged · 7 commits · Jan 22, 2025
193 changes: 100 additions & 93 deletions docs/source/en/chat_templating.md

Large diffs are not rendered by default.

7 changes: 7 additions & 0 deletions docs/source/en/generation_strategies.md
@@ -41,6 +41,13 @@ This guide describes:
* common decoding strategies and their main parameters
* saving and sharing custom generation configurations with your fine-tuned model on 🤗 Hub

<Tip>

`generate()` is a critical component of our [`transformers-cli chat` CLI](quicktour#chat-with-text-generation-models),
so what you learn in this guide applies there as well.

</Tip>
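
As a quick preview of what those parameters look like in code, here is a minimal sketch of setting up a custom decoding strategy with a generation configuration (the sampling values are illustrative, not recommendations):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# A custom decoding strategy: sampling with a temperature, instead of greedy decoding.
# These parameter values are illustrative only.
generation_config = GenerationConfig(do_sample=True, temperature=0.7, max_new_tokens=50)

inputs = tokenizer("The secret to good text generation is", return_tensors="pt")
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```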

## Default text generation configuration

A decoding strategy for a model is defined in its generation configuration. When using pre-trained models for inference
6 changes: 6 additions & 0 deletions docs/source/en/llm_tutorial.md
@@ -23,6 +23,12 @@ LLMs, or Large Language Models, are the key component behind text generation. In

Autoregressive generation is the inference-time procedure of iteratively calling a model with its own generated outputs, given a few initial inputs. In 🤗 Transformers, this is handled by the [`~generation.GenerationMixin.generate`] method, which is available to all models with generative capabilities.
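
As a minimal sketch of that procedure (the prompt and token budget are illustrative):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any model with generative capabilities exposes `generate()`; this checkpoint is illustrative.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

inputs = tokenizer("A list of colors: red, blue", return_tensors="pt")
# `generate()` runs the autoregressive loop: each new token is appended to the
# input and fed back to the model until a stopping condition is met.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```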

<Tip>

If you want to jump straight to chatting with a model, [try our `transformers-cli chat` CLI](quicktour#chat-with-text-generation-models).

</Tip>

This tutorial will show you how to:

* Generate text with an LLM
26 changes: 26 additions & 0 deletions docs/source/en/quicktour.md
@@ -553,6 +553,32 @@ All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs
>>> model.fit(tf_dataset) # doctest: +SKIP
```


## Chat with text generation models

If you're working with a model that generates text as output, you can also engage in a multi-turn conversation with
it through the `transformers-cli chat` command. This is the fastest way to interact with a model, e.g. for a
qualitative assessment (a.k.a. vibe check).

This CLI is implemented on top of our `AutoClass` abstraction, leveraging our [text generation](llm_tutorial.md) and
[chat](chat_templating.md) tooling, and is thus compatible with any 🤗 Transformers model. If you have the library
[installed](installation.md), you can launch a chat session in your terminal with

```bash
transformers-cli chat --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct
```

For a full list of options to launch the chat, type

```bash
transformers-cli chat -h
```

After the chat is launched, you will enter an interactive session with the model. There are special commands for this
session as well, such as `clear` to reset the conversation. Type `help` at any moment to display all special chat
commands, and `exit` to terminate the session.
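
Since the CLI builds on the same generation and chat-template tooling described above, a rough sketch of the
programmatic equivalent of a single chat turn looks like this (the prompt and token budget are illustrative):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# One chat turn; a multi-turn session keeps appending messages to this list.
chat = [{"role": "user", "content": "Hey, what's a vibe check?"}]
inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
# Slice off the prompt tokens to print only the model's reply.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```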


## What's next?

Now that you've completed the 🤗 Transformers quick tour, check out our guides and learn how to do more specific things like writing a custom model, fine-tuning a model for a task, and how to train a model with a script. If you're interested in learning more about 🤗 Transformers core concepts, grab a cup of coffee and take a look at our Conceptual Guides!