
Maybe lower default temp and switch to top_k 40 #42

Closed
bakkot opened this issue Mar 12, 2023 · 6 comments
Labels: generation quality (Quality of model output)

Comments

bakkot (Contributor) commented Mar 12, 2023

Per this twitter thread. See commit here.

G2G2G2G commented Mar 12, 2023

`--top_k N    top-k sampling (default: 40)`

Piezoid (Contributor) commented Mar 12, 2023

AFAIK, there is no top-k filtering in the current version. The main code uses `llama_sample_top_p`, not `gpt_sample_top_k_top_p`, which is the only piece of code that actually uses the `top_k` parameter.

Maybe the repetition penalty could be ported to this sampler and used there as well?

I've seen multiple people report that FB's default sampler is not adequate for comparing LLaMA's outputs with davinci's. Enabling top-k filtering would therefore let people experiment with and compare different sampling strategies.
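For reference, top-k filtering just truncates the candidate set to the k most probable tokens before sampling. A minimal standalone sketch of the idea (illustrative names and structure; this is not the repo's `gpt_sample_top_k_top_p` implementation):

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Minimal top-k sampling sketch: keep the k highest logits, renormalize,
// and sample from the truncated distribution. Illustrative only.
int sample_top_k(const std::vector<float>& logits, int top_k, std::mt19937& rng) {
    const int n = (int)logits.size();
    std::vector<int> idx(n);
    for (int i = 0; i < n; ++i) idx[i] = i;

    // partial sort so the top_k largest logits come first
    const int k = std::min(top_k, n);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](int a, int b) { return logits[a] > logits[b]; });

    // softmax over the surviving k candidates (subtract max for stability)
    std::vector<double> probs(k);
    const double max_logit = logits[idx[0]];
    double sum = 0.0;
    for (int i = 0; i < k; ++i) {
        probs[i] = std::exp(logits[idx[i]] - max_logit);
        sum += probs[i];
    }
    for (int i = 0; i < k; ++i) probs[i] /= sum;

    // sample one of the k tokens according to the renormalized probabilities
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return idx[dist(rng)];
}
```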

sswam commented Mar 12, 2023

It does seem to work much better with these options, based on shawwn's patch:

`--temp 0.7 --top_k 40 --top_p 0 --repeat_last_n 256 --repeat_penalty 1.1764705882352942`

I'm not sure what value is ideal for `repeat_last_n`, but after a little testing, 256 seems to be enough, while 128 wasn't.
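For context, the oddly precise `1.1764705882352942` is just `1/0.85`, the value from shawwn's patch. The penalty itself scales down the logits of any token seen in the last `repeat_last_n` outputs; a rough sketch of that CTRL-style scheme (illustrative, assuming the usual divide-positive/multiply-negative rule rather than quoting this repo's exact code):

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch of a CTRL-style repetition penalty: push down the logit of every
// token seen in the last `repeat_last_n` outputs. Dividing a positive logit
// (or multiplying a negative one) by the penalty makes the token less likely.
// Illustrative only; parameter names mirror the CLI flags.
void apply_repeat_penalty(std::vector<float>& logits,
                          const std::vector<int32_t>& last_tokens,
                          int repeat_last_n, float repeat_penalty) {
    const size_t start = last_tokens.size() > (size_t)repeat_last_n
                       ? last_tokens.size() - repeat_last_n : 0;
    std::unordered_set<int32_t> recent(last_tokens.begin() + start,
                                       last_tokens.end());

    for (int32_t tok : recent) {
        if (logits[tok] < 0.0f) {
            logits[tok] *= repeat_penalty;  // more negative -> less likely
        } else {
            logits[tok] /= repeat_penalty;  // smaller positive -> less likely
        }
    }
}
```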

Piezoid (Contributor) commented Mar 12, 2023

> `--temp 0.7 --top_k 40 --top_p 0 --repeat_last_n 256 --repeat_penalty 1.1764705882352942`

llama.cpp with `--top_p 0` is greedy inference, always picking the highest-probability token.
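To see why: top-p (nucleus) sampling keeps the smallest prefix of probability-sorted tokens whose cumulative mass reaches `p`, so with `p = 0` that prefix is just the single most probable token. A minimal sketch of the logic (illustrative, not the repo's `llama_sample_top_p`):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Minimal top-p (nucleus) sampling sketch. Tokens are sorted by probability
// and kept until the cumulative mass reaches top_p. With top_p == 0 the very
// first (highest-probability) token already satisfies the condition, so the
// sampler degenerates to greedy decoding. Illustrative only.
int sample_top_p(const std::vector<float>& probs, float top_p, std::mt19937& rng) {
    std::vector<int> idx(probs.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(),
              [&](int a, int b) { return probs[a] > probs[b]; });

    // keep the smallest prefix whose cumulative probability reaches top_p
    std::vector<double> kept;
    double cum = 0.0;
    for (int i : idx) {
        kept.push_back(probs[i]);
        cum += probs[i];
        if (cum >= top_p) break;   // with top_p == 0 this fires immediately
    }

    // discrete_distribution renormalizes the kept weights automatically
    std::discrete_distribution<int> dist(kept.begin(), kept.end());
    return idx[dist(rng)];
}
```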

beiller (Contributor) commented Mar 12, 2023

FYI, top-k isn't used; this PR should fix it, though:
#56

ggerganov (Owner) commented

We need a better strategy for determining default parameters. Single examples do not show anything of value.

gjmulder added the generation quality label Mar 15, 2023
Hades32 pushed a commit to Hades32/llama.cpp that referenced this issue Mar 21, 2023