This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Merge pull request #29 from darthdeus/repl
Add basic alpaca REPL mode
setzer22 authored Mar 21, 2023
2 parents d8ca18d + 779385d commit 6403f09
Showing 8 changed files with 323 additions and 16 deletions.
228 changes: 228 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

25 changes: 17 additions & 8 deletions README.md
@@ -32,14 +32,6 @@ Make sure you have a rust toolchain set up.
 3. Build (`cargo build --release`)
 4. Run with `cargo run --release -- <ARGS>`
 
-Some additional things to try:
-
-- Use `--help` to see a list of available options.
-- Prompt files can be precomputed to speed up processing using the
-`--cache-prompt` and `--restore-prompt` flags.
-
-[^1]: The only legal source to get the weights at the time of writing is [this repository](https://github.com/facebookresearch/llama/blob/main/README.md#llama). The choice of words also may or may not hint at the existence of other kinds of sources.
-
 **NOTE**: For best results, make sure to build and run in release mode. Debug builds are going to be very slow.
 
 For example, you try the following prompt:
@@ -48,6 +40,23 @@ For example, you try the following prompt:
 cargo run --release -- -m /data/Llama/LLaMA/7B/ggml-model-q4_0.bin -p "Tell me how cool the Rust programming language is:"
 ```
 
+Some additional things to try:
+
+- Use `--help` to see a list of available options.
+- If you have the [alpaca-lora](https://github.com/tloen/alpaca-lora) weights,
+try `--repl` mode! `cargo run --release -- -m <path>/ggml-alpaca-7b-q4.bin
+-f examples/alpaca_prompt.txt --repl`.
+
+![Gif showcasing alpaca repl mode](./doc/resources/alpaca_repl_screencap.gif)
+
+- Prompt files can be precomputed to speed up processing using the
+`--cache-prompt` and `--restore-prompt` flags so you can save processing time
+for lengthy prompts.
+
+![Gif showcasing prompt caching](./doc/resources/prompt_caching_screencap.gif)
+
+[^1]: The only legal source to get the weights at the time of writing is [this repository](https://github.com/facebookresearch/llama/blob/main/README.md#llama). The choice of words also may or may not hint at the existence of other kinds of sources.
+
 ## Q&A
 
 - **Q: Why did you do this?**
Binary file added doc/resources/alpaca_repl_screencap.gif
Binary file added doc/resources/prompt_caching_screencap.gif
7 changes: 7 additions & 0 deletions examples/alpaca_prompt.txt
@@ -0,0 +1,7 @@
+Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+### Instruction:
+
+$PROMPT
+
+### Response:
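
The template above marks where the user's input goes with the `$PROMPT` placeholder. The code that performs the substitution is not among the files shown in this diff, so the following is only a minimal sketch of how such a template could be filled, assuming a plain string replacement. `fill_template` is a hypothetical helper, not a function from llama-rs.

```rust
/// Hypothetical helper (not from llama-rs): fills a prompt template by
/// replacing every `$PROMPT` placeholder with the user's input.
fn fill_template(template: &str, user_input: &str) -> String {
    template.replace("$PROMPT", user_input)
}

fn main() {
    // Same layout as examples/alpaca_prompt.txt.
    let template = "Below is an instruction that describes a task. \
Write a response that appropriately completes the request.\n\n\
### Instruction:\n\n$PROMPT\n\n### Response:\n";

    let prompt = fill_template(template, "Name three Rust crates.");
    println!("{prompt}");
}
```

A plain `str::replace` leaves the template unchanged when it contains no placeholder, which may or may not match what the real implementation does in that case.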
5 changes: 4 additions & 1 deletion llama-cli/Cargo.toml
@@ -14,4 +14,7 @@ num_cpus = "1.15.0"
 
 llama-rs = { path = "../llama-rs" }
 
-rand = { workspace = true }
+rand = { workspace = true }
+
+rustyline = "11.0.0"
+spinners = "4.1.0"
4 changes: 4 additions & 0 deletions llama-cli/src/cli_args.rs
@@ -16,6 +16,10 @@ pub struct Args {
     #[arg(long, short = 'f', default_value = None)]
     pub prompt_file: Option<String>,
 
+    /// Run in REPL mode.
+    #[arg(long, short = 'R', default_value_t = false)]
+    pub repl: bool,
+
     /// Sets the number of threads to use
     #[arg(long, short = 't', default_value_t = num_cpus::get_physical())]
     pub num_threads: usize,
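
The code that consumes the new `repl` flag is not among the files shown here. The `rustyline` and `spinners` dependencies added in `llama-cli/Cargo.toml` suggest the real loop uses those crates for line editing and progress display. The sketch below shows only the general shape of such a loop, using the standard library alone so it stays self-contained. `run_repl` and the `infer` closure are hypothetical stand-ins, not names from this commit.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical REPL driver (not the code from this commit): reads one line
/// per turn, fills the template, and prints whatever `infer` returns.
/// `infer` stands in for a model inference call.
fn run_repl<R, W, F>(input: R, mut output: W, template: &str, mut infer: F) -> io::Result<usize>
where
    R: BufRead,
    W: Write,
    F: FnMut(&str) -> String,
{
    let mut turns = 0;
    for line in input.lines() {
        let line = line?;
        let line = line.trim();
        if line.is_empty() {
            continue; // skip blank input instead of running inference on it
        }
        let prompt = template.replace("$PROMPT", line);
        writeln!(output, "{}", infer(&prompt))?;
        turns += 1;
    }
    // The loop ends when the input reaches end-of-file (e.g. Ctrl-D).
    Ok(turns)
}

fn main() -> io::Result<()> {
    // Stand-in for the parsed `--repl` flag from `Args`.
    let repl = true;
    if repl {
        let stdin = io::stdin();
        let template = "### Instruction:\n\n$PROMPT\n\n### Response:\n";
        // Report the prompt length in place of real model output.
        run_repl(stdin.lock(), io::stdout(), template, |p| {
            format!("(model output for a {}-byte prompt)", p.len())
        })?;
    }
    Ok(())
}
```

Taking the reader and writer as generic parameters keeps the loop testable without a terminal; a line editor such as rustyline would replace the `input.lines()` iteration in an interactive build.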