llama : add Phi-4-mini support (supersede #12099) #12108

Merged on Feb 28, 2025 (7 commits)

ngxson (Collaborator) commented on Feb 28, 2025

Supersedes #12099, with these changes:

  1. No need to check for longrope when loading tensors; tensor loading is now the same as for LLM_ARCH_LLAMA
  2. Fix convert_hf_to_gguf_update.py
  3. Add missing tokenizer .inp/.out files

Test with llama-cli:

You are a helpful assistant

> hi
Hello! How can I assist you today?

> write some unicode
Sure! Unicode characters can represent a wide array of symbols from various languages, mathematical operators, and much more. Here are some examples:

- Smile face: 😊
- Heart: ❤️
- Star: ⭐
- Music notes: 🎵
- Infinity: ∞
- Copyright: ©
- Mathematical pi: π
- Mathematical integral: ∫
- Mathematical infinity: ∞
- Euro sign: €
- Square root: √
- Copyright symbol: ℧
- Greek letter alpha: α

Would you like to see more examples, or is there a specific character or symbol you're interested in?

> write chinese characters
Certainly! Chinese characters, also known as Hanzi in Mandarin, are logograms used in the Chinese writing system. Here are a few examples of Chinese characters, along with their pinyin (Romanization) and English meanings:

1. 爱 (Ài) - Love
2. 和 (Hé) - Harmony, and
3. 人 (Rén) - Person
4. 学 (Xué) - Study, learn, or education
5. 水 (Shū) - Water
6. 火 (Huǒ) - Fire
7. 木 (Mù) - Wood
8. 金 (Jīn) - Gold
9. 土 (Tǔ) - Soil
10. 气 (Qì) - Air, atmosphere

Each Chinese character can have a specific meaning and sometimes multiple meanings depending on the context. Learning Chinese characters can be quite rewarding, as it allows for a deeper understanding of Chinese culture and language nuances. Would you like more examples or help with something else?

> 
llama_perf_sampler_print:    sampling time =      34.91 ms /   219 runs   (    0.16 ms per token,  6272.38 tokens per second)
llama_perf_context_print:        load time =    2003.88 ms
llama_perf_context_print: prompt eval time =    6230.83 ms /    42 tokens (  148.35 ms per token,     6.74 tokens per second)
llama_perf_context_print:        eval time =    6517.55 ms /   343 runs   (   19.00 ms per token,    52.63 tokens per second)
llama_perf_context_print:       total time =   21252.44 ms /   385 tokens
Interrupted by user

ngxson requested a review from ggerganov on February 28, 2025 09:52
github-actions bot added the "python" (python script changes) label on Feb 28, 2025
ngxson merged commit c43a3e7 into master on Feb 28, 2025
52 checks passed
ericcurtin deleted the xsn/phi-4 branch on February 28, 2025 12:43
bartowski1182 (Contributor) commented:

Converting Phi-4-mini-instruct with the latest llama.cpp release gave me this error:

kv_bytes += self._pack_val(val.value, val.type, add_vtype=True)
  File "/llama.cpp/gguf-py/gguf/gguf_writer.py", line 945, in _pack_val
    raise ValueError("All items in a GGUF array should be of the same type")
ValueError: All items in a GGUF array should be of the same type
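
For readers unfamiliar with this error: the message says that a metadata array handed to the GGUF writer contained elements of more than one Python type (for example, a list mixing integers and floats). The sketch below is not the gguf-py implementation; `check_homogeneous` and `find_mixed_arrays` are hypothetical helper names, written only to illustrate what such a same-type check looks like and how one might locate an offending key in a metadata dictionary before conversion.

```python
from typing import Any, Dict, List, Optional, Sequence


def check_homogeneous(items: Sequence[Any]) -> Optional[type]:
    """Return the common element type, or raise if the items mix types.

    Hypothetical helper: it mirrors the error message in the traceback,
    not the actual code in gguf_writer.py.
    """
    if not items:
        return None
    first_type = type(items[0])
    for item in items[1:]:
        # Exact type comparison: 1 (int) and 1.0 (float) count as different.
        if type(item) is not first_type:
            raise ValueError("All items in a GGUF array should be of the same type")
    return first_type


def find_mixed_arrays(metadata: Dict[str, Any]) -> List[str]:
    """Return the keys whose list values contain more than one element type."""
    mixed = []
    for key, value in metadata.items():
        if isinstance(value, list) and len({type(v) for v in value}) > 1:
            mixed.append(key)
    return mixed


if __name__ == "__main__":
    # Illustrative metadata only; the key names are made up for this example.
    example = {
        "example.int_array": [1, 2, 3],
        "example.mixed_array": [1, 1.05, 1.1],
        "example.name": "demo",
    }
    print(find_mixed_arrays(example))
```

Under this assumption, a common remedy is to cast every element to a single type (for instance `[float(v) for v in values]`) before writing the array; whether that is the right fix for this particular conversion is not established in this thread.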

Animaxx added a commit to Animaxx/llama.cpp that referenced this pull request Mar 2, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
…2108)

* Added Phi-4-mini-instruct support

* Update regex per ngxson

* Change the vocab base to Xenova/gpt-4o

* fix conversion update script

* no need to check longrope

* minor style fix

* fix python style

---------

Co-authored-by: Nicholas Sparks <nisparks@microsoft.com>