Commit

updated README.md
cahya-wirawan committed Sep 4, 2024
1 parent a2bc33f commit 1337ec2
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -56,7 +56,7 @@ tokenizer is around 17x faster than the original tokenizer and 9.6x faster than

![performance-comparison](data/performance-comparison.png)

-We updated the Rust RWKV world tokenizer to support multithreading for batch encoding. We ran the same comparison
+We updated the Rust RWKV world tokenizer to support batch encoding with multithreading. We ran the same comparison
[script](tools/test_tiktoken-huggingface-rwkv.py) from the [Huggingface Tokenizers](https://github.com/huggingface/tokenizers)
with the additional rwkv tokenizer. The result shows that the rwkv world tokenizer is significantly faster than
the Tiktoken and Huggingface tokenizers in all numbers of threads and document sizes (on average, its speed is ten times faster).