
Maybe lower default temp and switch to top_k 40 #42

Closed
bakkot opened this issue Mar 12, 2023 · 6 comments
Labels: generation quality (Quality of model output)

Comments

bakkot (Contributor) commented Mar 12, 2023

Per this twitter thread. See commit here.

G2G2G2G commented Mar 12, 2023

`--top_k N    top-k sampling (default: 40)`

Piezoid (Contributor) commented Mar 12, 2023

AFAIK, there is no top-k filtering in the current version. The main code uses `llama_sample_top_p`, not `gpt_sample_top_k_top_p`, which is the only piece of code that actually uses the `top_k` parameter.

Maybe the repetition penalty could be ported to this sampler and used there as well?

I've seen multiple people report that FB's default sampler is not adequate for comparing LLaMA's outputs with davinci's. Enabling top-k filtering would therefore let people experiment with and compare different sampling strategies.
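For reference, top-k filtering just truncates the candidate set to the k most probable tokens before sampling. A minimal standalone sketch of the idea (illustrative names and structure; this is not the repo's `gpt_sample_top_k_top_p` implementation):

```cpp
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Minimal top-k sampling sketch: keep the k highest logits, renormalize,
// and sample from the truncated distribution. Illustrative only.
int sample_top_k(const std::vector<float>& logits, int top_k, std::mt19937& rng) {
    const int n = (int)logits.size();
    std::vector<int> idx(n);
    for (int i = 0; i < n; ++i) idx[i] = i;

    // partial sort so the top_k largest logits come first
    const int k = std::min(top_k, n);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
        [&](int a, int b) { return logits[a] > logits[b]; });

    // softmax over the surviving k candidates (subtract max for stability)
    std::vector<double> probs(k);
    const double max_logit = logits[idx[0]];
    double sum = 0.0;
    for (int i = 0; i < k; ++i) {
        probs[i] = std::exp(logits[idx[i]] - max_logit);
        sum += probs[i];
    }
    for (int i = 0; i < k; ++i) probs[i] /= sum;

    // sample one of the k tokens according to the renormalized probabilities
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return idx[dist(rng)];
}
```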

sswam commented Mar 12, 2023

It does seem to work much better with these options, based on shawwn's patch:

`--temp 0.7 --top_k 40 --top_p 0 --repeat_last_n 256 --repeat_penalty 1.1764705882352942`

I'm not sure what value is ideal for `repeat_last_n`, but after a little testing, 256 seems to be enough, while 128 wasn't.
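For context, the oddly precise `1.1764705882352942` is just `1/0.85`, the value from shawwn's patch. The penalty itself scales down the logits of any token seen in the last `repeat_last_n` outputs; a rough sketch of that CTRL-style scheme (illustrative, assuming the usual divide-positive/multiply-negative rule rather than quoting this repo's exact code):

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Sketch of a CTRL-style repetition penalty: push down the logit of every
// token seen in the last `repeat_last_n` outputs. Dividing a positive logit
// (or multiplying a negative one) by the penalty makes the token less likely.
// Illustrative only; parameter names mirror the CLI flags.
void apply_repeat_penalty(std::vector<float>& logits,
                          const std::vector<int32_t>& last_tokens,
                          int repeat_last_n, float repeat_penalty) {
    const size_t start = last_tokens.size() > (size_t)repeat_last_n
                       ? last_tokens.size() - repeat_last_n : 0;
    std::unordered_set<int32_t> recent(last_tokens.begin() + start,
                                       last_tokens.end());

    for (int32_t tok : recent) {
        if (logits[tok] < 0.0f) {
            logits[tok] *= repeat_penalty;  // more negative -> less likely
        } else {
            logits[tok] /= repeat_penalty;  // smaller positive -> less likely
        }
    }
}
```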

Piezoid (Contributor) commented Mar 12, 2023

> `--temp 0.7 --top_k 40 --top_p 0 --repeat_last_n 256 --repeat_penalty 1.1764705882352942`

llama.cpp with `--top_p 0` is greedy inference, always picking the highest-probability token.
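To see why: top-p (nucleus) sampling keeps the smallest prefix of probability-sorted tokens whose cumulative mass reaches `p`, so with `p = 0` that prefix is just the single most probable token. A minimal sketch of the logic (illustrative, not the repo's `llama_sample_top_p`):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Minimal top-p (nucleus) sampling sketch. Tokens are sorted by probability
// and kept until the cumulative mass reaches top_p. With top_p == 0 the very
// first (highest-probability) token already satisfies the condition, so the
// sampler degenerates to greedy decoding. Illustrative only.
int sample_top_p(const std::vector<float>& probs, float top_p, std::mt19937& rng) {
    std::vector<int> idx(probs.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(),
              [&](int a, int b) { return probs[a] > probs[b]; });

    // keep the smallest prefix whose cumulative probability reaches top_p
    std::vector<double> kept;
    double cum = 0.0;
    for (int i : idx) {
        kept.push_back(probs[i]);
        cum += probs[i];
        if (cum >= top_p) break;   // with top_p == 0 this fires immediately
    }

    // discrete_distribution renormalizes the kept weights automatically
    std::discrete_distribution<int> dist(kept.begin(), kept.end());
    return idx[dist(rng)];
}
```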

beiller (Contributor) commented Mar 12, 2023

FYI, top-k isn't used; this PR should fix it, though:
#56

ggerganov (Owner) commented

We need a better strategy for determining default parameters. Single examples do not show anything of value.

gjmulder added the generation quality label Mar 15, 2023
Hades32 pushed a commit to Hades32/llama.cpp that referenced this issue Mar 21, 2023