
[model support] Requesting support for Gemma 2 #10

Closed
sand-bit opened this issue Jul 28, 2024 · 1 comment

Comments

@sand-bit

Requesting support for Gemma 2 27B/9B.

  1. Sliding window attention, used on every other layer
  2. Logit soft capping (a.k.a. soft capping), which means no Flash Attention support right now (see the sketch after this list)
  3. Possible tokenizer bugs in the transformers implementation: Investigate gemma 2 generation quality ggerganov/llama.cpp#8240 (comment) (see that issue for other discussions on tokenizer bugs; the behavior may be okay in Google's official Keras implementation)
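For context on points 1 and 2, here is a minimal, illustrative sketch (not this project's API) of how soft capping and the interleaved sliding-window mask work. The values `cap=50.0` and the window size, as well as the even/odd layer assignment, are assumptions based on the public Gemma 2 config and may differ from what an actual implementation uses:

```python
# Illustrative sketch only -- not this repo's code. soft-cap and window
# values are assumptions taken from the public Gemma 2 config.
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Logit soft capping: squash scores into (-cap, cap) with tanh.
    # Since this is applied to attention scores before the softmax,
    # stock FlashAttention kernels (which fuse the softmax) cannot be
    # used without a kernel that supports the extra tanh.
    return cap * torch.tanh(logits / cap)

def attention_mask(seq_len: int, layer_idx: int, window: int) -> torch.Tensor:
    # Gemma 2 interleaves local and global attention: roughly every
    # other layer attends only to the last `window` tokens, the rest
    # attend to the full causal prefix (exact assignment depends on config).
    i = torch.arange(seq_len).unsqueeze(1)   # query positions
    j = torch.arange(seq_len).unsqueeze(0)   # key positions
    causal = j <= i
    if layer_idx % 2 == 0:                   # local (sliding-window) layer
        return causal & (i - j < window)
    return causal                            # global layer

# Toy usage: capped attention scores for a local layer.
q = torch.randn(1, 8, 16, 64)                # (batch, heads, seq, head_dim)
k = torch.randn(1, 8, 16, 64)
scores = soft_cap(q @ k.transpose(-2, -1) / 64**0.5, cap=50.0)
mask = attention_mask(16, layer_idx=0, window=4)
probs = torch.softmax(scores.masked_fill(~mask, float("-inf")), dim=-1)
```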
@james0zan
Member

Since Gemma 2 is a dense model, we are currently unlikely to run it faster than its official implementations, so Gemma 2 support will probably not be available in the near term.
