
The initial token is always empty. #367

Closed
BadisG opened this issue Mar 21, 2023 · 7 comments
Labels: generation quality, need more info, stale

Comments

BadisG commented Mar 21, 2023

Hello,

I noticed something when trying the chat with Bob: I always get an empty first token.

1 -> ''
4103 -> ' Trans'
924 -> 'cript'
310 -> ' of'
263 -> ' a'
7928 -> ' dialog'

So the result is this:

[screenshot: the output begins " Transcript of a dialog, where the User..." with a leading space]

There's a little space at the beginning of the text. Maybe this alone can significantly impact the quality of the output, which is why I decided to post this issue.

I'm on Windows 10 using WSL to emulate the Linux environment (the main.exe is not as good as the Linux main at the moment).

I'm using a file that is the result of the following steps:

  1. I started with a llama-7b-4bit.pt file
  2. I converted it with the gptq-to-ggml converter (convert-gptq-to-ggml.py)
  3. I converted it again to the new ggml format with the script from #324 (comment) ("Breaking change of models since PR #252")

Here's the .sh command (7B_CHAT_Bob.sh):

#!/bin/bash
dos2unix 7B_CHAT_Bob.sh

./main -m ./models/llama7b-4bit-GPTQ.bin -t 14 -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt

Everything in this repository is up to date, as I do a git pull every time I launch PowerShell.

@gjmulder added the question, generation quality, and need more info labels Mar 21, 2023
gjmulder (Collaborator) commented:

Please review the issue reporting guidelines in #239 and provide a better description of the issue you are observing.

@gjmulder removed the question label Mar 21, 2023
BadisG (Author) commented Mar 21, 2023

> Please review the issue reporting guidelines in #239 and provide a better description of the issue you are observing.

I added more details based on your guidelines; I hope that'll help.

PriNova commented Mar 21, 2023

> I noticed something when trying the chat with Bob: I always get an empty first token. […]

The token with ID 1 is a special token called BOD (Begin of Document, more commonly known as BOS) and is one of the two tokens that are required in the token vocabulary. The second is EOD (End of Document, i.e. EOS) with ID 2.

In other words, this is normal behaviour.
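
For context, a minimal sketch of where that token comes from, simplified from llama_tokenize() in utils.cpp (the real function also performs the actual subword tokenization, which is elided here):

    #include <string>
    #include <vector>

    // Simplified sketch: when `bos` is true, the special BOS token (id 1)
    // is pushed before any tokens derived from the text itself, which is
    // why the first token printed is always the "empty" token 1.
    std::vector<int> tokenize(const std::string & text, bool bos) {
        std::vector<int> output;
        if (bos) {
            output.push_back(1); // BOS token
        }
        // ... the SentencePiece token ids for `text` are appended here ...
        return output;
    }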

BadisG (Author) commented Mar 21, 2023

@PriNova I see, thanks for your answer, I learned something today!
But I can still see a space at the beginning of the text; I don't think I had that before, and it's a bit ugly to look at... but if it doesn't change the output I'm OK with that.

mattsta commented Mar 22, 2023

You can make token 1 go away by commenting out the following in llama_tokenize() in utils.cpp:

    if (bos) {
        // output.push_back(1);
    }

It's probably more correct with it there, but it also doesn't seem to break anything if removed (at least if you only submit one whole document per session).

As for the leading space, look at your initial tokens above:

4103 -> ' Trans'
924 -> 'cript'

The space is inside the first token, so it is being printed. Technically, if the first token starts with a space, the output could skip over it when printing.
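
A minimal sketch of that idea (a hypothetical helper, not existing llama.cpp code), assuming the detokenized pieces arrive as strings:

    #include <cstdio>
    #include <string>
    #include <vector>

    // Print detokenized pieces, skipping the leading space of the very
    // first piece so the displayed text matches the original prompt.
    void print_pieces(const std::vector<std::string> & pieces) {
        bool first = true;
        for (const std::string & piece : pieces) {
            const char * s = piece.c_str();
            if (first && !piece.empty() && s[0] == ' ') {
                s += 1; // drop the tokenizer-inserted space
            }
            fputs(s, stdout);
            first = false;
        }
    }

    int main() {
        print_pieces({" Trans", "cript", " of", " a", " dialog"});
        // prints: Transcript of a dialog
        return 0;
    }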

Green-Sky (Collaborator) commented:

The leading space is intentional and a result of llama.cpp/main.cpp, lines 232 to 233 (at commit d5850c5):

    // Add a space in front of the first character to match OG llama tokenizer behavior
    params.prompt.insert(0, 1, ' ');

Not sure if we should just not print the first character (the space) or not.
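
For illustration, a self-contained sketch of what that insertion does to the prompt string (LLaMA's SentencePiece tokenizer marks word starts with a leading space, so " Trans" and "Trans" are different tokens; the inserted space makes the first word tokenize the same way it would mid-sentence):

    #include <cassert>
    #include <string>

    int main() {
        std::string prompt = "Transcript of a dialog, ...";
        // the same call as in main.cpp: insert one ' ' at position 0
        prompt.insert(0, 1, ' ');
        assert(prompt == " Transcript of a dialog, ...");
        // the tokenizer then produces " Trans" (id 4103) as the first
        // piece, which is where the visible leading space comes from
        return 0;
    }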

Deadsg pushed a commit to Deadsg/llama.cpp that referenced this issue on Dec 19, 2023 ("Ianscrivener macos install md docs").
The github-actions bot added the stale label Mar 25, 2024 and commented:

This issue was closed because it has been inactive for 14 days since being marked as stale.
