Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars #9639

Merged on Jan 30, 2025 (375 commits)
Commits
ec9f3b1
nits
ochafik Oct 27, 2024
9a86ea7
`tool-call`: slow tool call integration tests
ochafik Oct 28, 2024
c88095e
space nits
ochafik Oct 28, 2024
7fde6d0
`tool_call`: test no tool call on a real model + rename scenarios
ochafik Oct 28, 2024
dd6d024
`tool-call`: script to prefetch models used in server tests
ochafik Oct 28, 2024
168add7
Update tool_call.feature
ochafik Oct 28, 2024
ec547e4
`tool-call`: add tests: tool_call=none, parallel_tool_calls=true
ochafik Oct 28, 2024
b51c71c
`tool-call`: remove duplicate script to fetch templates
ochafik Oct 28, 2024
74d71a6
`agent`: simplify syntax (default tools to local w/ default port)
ochafik Oct 28, 2024
b825440
`tool-call`: use Q4_K_M models
ochafik Oct 28, 2024
aefac1e
`tool-call`: update scripts/fetch_server_test_models.py
ochafik Oct 28, 2024
64287a3
`tool-call`: test Hermes-3-Llama-3.1-8B
ochafik Oct 29, 2024
fa4c111
`tool-call`: use functionary-small-v3.2-Q8_0.gguf in test (Q4_K_M too…
ochafik Oct 29, 2024
773ff91
`tool-call`: force printing of lazy grammar trigger tokens to regular…
ochafik Oct 29, 2024
92c384a
nits
ochafik Oct 29, 2024
3ebdb2b
`tool-call`: support tool_use variant in llama_chat_template_from_mod…
ochafik Oct 30, 2024
35ac17f
`tool-call`: fix missing initializer errors
ochafik Oct 30, 2024
5227321
`tool-call`: when slow server tests fail, hint to run `python scripts…
ochafik Oct 30, 2024
e4d5449
`tool-calls`: test Qwen2.5-7B-Instruct-Q4_K_M.gguf
ochafik Oct 30, 2024
61655b9
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Oct 31, 2024
be9de3e
Update llama-sampling.cpp
ochafik Oct 31, 2024
542853b
`tool-call`: greedy sampling in server tests + tweak prompt
ochafik Oct 31, 2024
7d9c90f
`tool-call`: nemo tweak (accept raw sql again)
ochafik Oct 31, 2024
e8d9d71
Update tool_call.feature
ochafik Oct 31, 2024
c395d48
`tool-call`: behaviour-based detection of template features
ochafik Oct 31, 2024
f5b7825
`tool-call`: code_interpreter & system + tool call support for all ji…
ochafik Oct 31, 2024
c773516
`tool-call`: don't use -fa w/ Mistral-Nemo (hard crashes?)
ochafik Oct 31, 2024
b35aa4a
`tool-call`: add LLAMA_UPDATE_GOLDENS env for test-chat-template
ochafik Oct 31, 2024
9477c54
`tool-call`: functionary-small-v3.2 test now green
ochafik Oct 31, 2024
c4a8050
Update README.md
ochafik Oct 31, 2024
f5f7475
nits
ochafik Oct 31, 2024
fe967b6
Update README.md
ochafik Oct 31, 2024
479c152
`tool-call`: fix qwen template test
ochafik Oct 31, 2024
bc52c0a
`agent`: add missing tool name in response!
ochafik Oct 31, 2024
c059aec
`agent`: memorize, search_memory (sqlite-vec + sqlite-lembed), fetch …
ochafik Nov 9, 2024
5789f69
`minja`: don't explode upon referencing a field on an array (fixes He…
ochafik Nov 9, 2024
f9b1969
Update README.md
ochafik Nov 9, 2024
adc673c
agent: add --think "tool", default to local tools endpoint, support -…
ochafik Dec 5, 2024
1afa312
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Dec 6, 2024
30fbcb2
agent: more robust squid config
ochafik Dec 6, 2024
a469f53
agent: update readme
ochafik Dec 6, 2024
cbe395d
minja: remove tests (now in https://github.com/google/minja)
ochafik Dec 6, 2024
1fd5f1a
Update README.md
ochafik Dec 6, 2024
5d0033f
minja: sync @ https://github.com/google/minja/commit/916c181c0d4a6f96…
ochafik Dec 7, 2024
1f0b157
tool-call: add firefunction-v2 style
ochafik Dec 7, 2024
93a5245
tool-calls: migrate tests to pytest
ochafik Dec 10, 2024
055053c
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Dec 14, 2024
1e2115f
tool-calls: shorter name: grammar_triggers
ochafik Dec 14, 2024
7bfcd0a
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Dec 14, 2024
7e3feff
tool-call: stabilize server tests
ochafik Dec 15, 2024
e70ce3f
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Dec 26, 2024
f0bd693
Update test-tool-call.cpp
ochafik Dec 26, 2024
f645887
Update minja.hpp https://github.com/google/minja/commit/202aa2f3de21b…
ochafik Dec 26, 2024
0e87ae2
rm trailing spaces
ochafik Dec 27, 2024
0a5d527
Update fetch_server_test_models.py
ochafik Dec 27, 2024
a2fe8a4
Fix tool-call server tests
ochafik Dec 27, 2024
523ebf8
Simplify tool call grammars when there's only 1 tool
ochafik Dec 27, 2024
abd274a
Copy minja from https://github.com/google/minja/commit/58f0ca6dd74bcb…
ochafik Dec 30, 2024
e5113e8
Add --jinja and --chat-template-file flags
ochafik Dec 30, 2024
80138d9
Add missing <optional> include
ochafik Dec 30, 2024
06b5159
Avoid print in get_hf_chat_template.py
ochafik Dec 30, 2024
ce48584
No designated initializers yet
ochafik Dec 30, 2024
389d79b
Try and work around msvc++ non-macro max resolution quirk
ochafik Dec 30, 2024
238b968
Update test_chat_completion.py
ochafik Dec 30, 2024
cb72cf1
Merge remote-tracking branch 'origin/master' into jinja
ochafik Jan 13, 2025
78861a3
Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template
ochafik Jan 13, 2025
1aac99a
Refactor test-chat-template
ochafik Jan 13, 2025
7c84ebc
Test templates w/ minja
ochafik Jan 13, 2025
18f257b
Fix deprecation
ochafik Jan 13, 2025
8dd4f33
Add --jinja to llama-run
ochafik Jan 13, 2025
c04c50e
Merge remote-tracking branch 'origin/master' into jinja
ochafik Jan 13, 2025
a6afb27
Update common_chat_format_example to use minja template wrapper
ochafik Jan 13, 2025
b4083e4
Test chat_template in e2e test
ochafik Jan 13, 2025
b7e2171
Update utils.py
ochafik Jan 13, 2025
a57bb94
Update test_chat_completion.py
ochafik Jan 13, 2025
4daae0b
Update run.cpp
ochafik Jan 13, 2025
1b3bb7e
Update arg.cpp
ochafik Jan 14, 2025
e7ff6ec
Merge branch 'jinja' into tool-call
ochafik Jan 14, 2025
7a7d6f6
Fix merge
ochafik Jan 14, 2025
e183fa9
Update test-chat-template.cpp
ochafik Jan 14, 2025
010726c
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 14, 2025
d47f40c
Update test-chat-template.cpp
ochafik Jan 14, 2025
3ed670b
Merge remote-tracking branch 'origin/master' into jinja
ochafik Jan 14, 2025
3c7784c
Refactor common_chat_* functions to accept minja template + use_jinja…
ochafik Jan 18, 2025
b75d062
Refactor common_chat_* functions to accept minja template + use_jinja…
ochafik Jan 18, 2025
40db789
Merge remote-tracking branch 'origin/master' into jinja
ochafik Jan 18, 2025
81c0d43
Attempt to fix linkage of LLAMA_CHATML_TEMPLATE
ochafik Jan 18, 2025
138a4ba
Merge branch 'jinja' into tool-call
ochafik Jan 18, 2025
d5fa351
Revert LLAMA_CHATML_TEMPLATE refactor
ochafik Jan 18, 2025
045edd1
Merge branch 'jinja' into tool-call
ochafik Jan 18, 2025
2ceabee
Fix fetch_server_test_models.py (avoid conv trap)
ochafik Jan 18, 2025
259d9e4
tools: greedy sampling in tests
ochafik Jan 18, 2025
acf7c24
tools: run tool call slow tests when SLOW_TESTS=1 (+ prefetch models)
ochafik Jan 18, 2025
ee1e10e
Normalize newlines in test-chat-templates for windows tests
ochafik Jan 18, 2025
e63520f
Forward decl minja::chat_template to avoid eager json dep
ochafik Jan 18, 2025
33322e8
Flush stdout in chat template before potential crash
ochafik Jan 18, 2025
5074e6f
Fix copy elision warning
ochafik Jan 18, 2025
76893f5
Merge branch 'jinja' into tool-call
ochafik Jan 18, 2025
fc60802
Rm unused optional include
ochafik Jan 18, 2025
0e74c9d
Add missing optional include to server.cpp
ochafik Jan 18, 2025
d6f058d
Merge branch 'jinja' into tool-call
ochafik Jan 18, 2025
e3c475c
Disable jinja test that has a cryptic windows failure
ochafik Jan 18, 2025
cc50356
minja: fix vigogne (https://github.com/google/minja/pull/22)
ochafik Jan 18, 2025
c207fdc
Merge branch 'jinja' into tool-call
ochafik Jan 18, 2025
0401a83
agent: add --greedy, --top-p, --top-k options
ochafik Jan 19, 2025
153e852
Apply suggestions from code review
ochafik Jan 20, 2025
db9dd0c
Finish suggested renamings
ochafik Jan 20, 2025
c9e8fdd
Move chat_templates inside server_context + remove mutex
ochafik Jan 20, 2025
8c84aef
Update --chat-template-file w/ recent change to --chat-template
ochafik Jan 20, 2025
154bfaa
Refactor chat template validation
ochafik Jan 20, 2025
099f983
Merge remote-tracking branch 'origin/master' into jinja
ochafik Jan 20, 2025
54a669e
Guard against missing eos/bos tokens (null token otherwise throws in …
ochafik Jan 20, 2025
8348c60
Warn against missing eos / bos tokens when jinja template references …
ochafik Jan 20, 2025
ee475d2
rename: common_chat_template[s]
ochafik Jan 20, 2025
8a7c89e
reinstate assert on chat_templates.template_default
ochafik Jan 20, 2025
9bab693
Merge branch 'jinja' into tool-call
ochafik Jan 20, 2025
b110374
apply renames from jinja branch
ochafik Jan 20, 2025
8347da9
Update minja to https://github.com/google/minja/commit/b8437df626ac6c…
ochafik Jan 20, 2025
7ea6a06
Merge branch 'jinja' into tool-call
ochafik Jan 20, 2025
56aa93c
fix std imports for gcc build
ochafik Jan 21, 2025
ff2cce5
Update minja to https://github.com/google/minja/pull/25
ochafik Jan 21, 2025
ba8dd66
Merge branch 'jinja' into tool-call
ochafik Jan 21, 2025
9d8ebd6
Update minja from https://github.com/google/minja/pull/27
ochafik Jan 21, 2025
c606255
Merge branch 'jinja' into tool-call
ochafik Jan 21, 2025
fec0260
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 21, 2025
b49d052
rm tests/test-minja from makefile
ochafik Jan 21, 2025
f6e73da
Remove examples/agent (moved to https://gist.github.com/ochafik/9246d…
ochafik Jan 21, 2025
77f4098
Delete update_jinja_goldens.py
ochafik Jan 21, 2025
dbf841b
Push laziness down to grammar impl
ochafik Jan 22, 2025
ef61a4c
minimize diffs
ochafik Jan 22, 2025
3972945
common_tool_call rename
ochafik Jan 22, 2025
d77fecc
shrink diff in json conversion code
ochafik Jan 22, 2025
5268ec8
Refactor string helpers into common
ochafik Jan 22, 2025
9e8b43f
follow enum naming style for tool call styles
ochafik Jan 22, 2025
9a5acbb
Factor string_join, string_split, string_repeat into common
ochafik Jan 22, 2025
4de5cf8
json: refactor to surface a versatile builder
ochafik Jan 22, 2025
03fe80f
drop unused fs_list_files
ochafik Jan 22, 2025
41a613b
Merge branch 'string_utils' into tool-call
ochafik Jan 22, 2025
5140d7a
Update common.cpp
ochafik Jan 22, 2025
e211629
Merge branch 'string_utils' into tool-call
ochafik Jan 22, 2025
28cac49
drop llama_sampler_accept_str
ochafik Jan 22, 2025
2dd09c7
more cleanups
ochafik Jan 22, 2025
01b345b
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 22, 2025
82b6e9a
merge common_tool_calls into common_chat_msg
ochafik Jan 22, 2025
63387c6
smaller diff
ochafik Jan 22, 2025
a422636
nits
ochafik Jan 22, 2025
cce1166
Update tool-call.cpp
ochafik Jan 22, 2025
c6a22ed
Greedy sampling in tool call tests
ochafik Jan 22, 2025
30d33d9
Update test_chat_completion.py
ochafik Jan 22, 2025
9ccc62b
Sync minja after https://github.com/google/minja/pull/29
ochafik Jan 22, 2025
d186721
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 22, 2025
f0231a5
fix common_chat_msg invocations
ochafik Jan 22, 2025
5e358ad
fix msg init warning
ochafik Jan 22, 2025
cdfa8b9
Update chat-template.hpp
ochafik Jan 22, 2025
a46de6a
Add grammar options + rename builder to common_grammar_builder
ochafik Jan 22, 2025
c2d836f
Update real tool call tests (use less models)
ochafik Jan 22, 2025
46415d7
Fix lazy trigger handling
ochafik Jan 22, 2025
36ed106
WIP chat handlers
ochafik Jan 24, 2025
c479d39
tool-call: allow special tokens that are grammar triggers
ochafik Jan 25, 2025
0208b20
Update test_chat_completion.py
ochafik Jan 25, 2025
a6463c1
jinja: don't add bos when jinja enabled
ochafik Jan 25, 2025
51b7aab
Update test_chat_completion.py
ochafik Jan 25, 2025
3f3fc03
nit: trailing spaces
ochafik Jan 26, 2025
1159455
Merge branch 'tool-call' into tool-call-handler
ochafik Jan 26, 2025
43385b2
sync: minja
ochafik Jan 26, 2025
5ec4c5e
reshuffle chat handlers
ochafik Jan 26, 2025
f7078ca
tool-call: fix functionary v3.1 required test
ochafik Jan 26, 2025
ca0c837
nits
ochafik Jan 27, 2025
bddc1be
tool-call: fix special handling of special trigger tokens (Nemo)
ochafik Jan 27, 2025
da606d8
tool-call: remove nonsensical code_interpreter code
ochafik Jan 27, 2025
15ec01e
jinja: only add special tokens if template doesn't seem to handle them
ochafik Jan 27, 2025
2efa0c2
tool-call: add weather tool e2e tests
ochafik Jan 27, 2025
57f40e3
tool-call: fix lazy grammar & mixed content + tool calls parsing
ochafik Jan 27, 2025
6770955
tool-call: compact json output to cap # tokens generated
ochafik Jan 27, 2025
09971e6
Update test_chat_completion.py
ochafik Jan 27, 2025
92ac336
Prepare DeepSeek-R1-Distill-Llama-8B support
ochafik Jan 27, 2025
118f799
DeepSeek-R1: implement grammar constraints
ochafik Jan 27, 2025
add9124
fix test-chat-handler grammar tests
ochafik Jan 27, 2025
fa065eb
Rehabilitate test_format_detection
ochafik Jan 27, 2025
ad22978
updated tool call example to be less ambiguous (deepseek likes to ran…
ochafik Jan 27, 2025
90effb8
Pass grammar laziness all the way down to sampler (need to print spec…
ochafik Jan 27, 2025
cafea60
Split e2e test_tool_call from test_chat_completion
ochafik Jan 27, 2025
b565ab2
comment out broken tests in test_tool_call.py
ochafik Jan 27, 2025
2d607f1
Update test-chat-handler.cpp
ochafik Jan 27, 2025
ef9efc9
Fix Llama 3.1 (incl. constrained builtin tools e.g. `<|python_tag|>fo…
ochafik Jan 28, 2025
6271714
Allow tool use + streaming
ochafik Jan 28, 2025
6d56829
Cleanup dead code in llama_3_1 tool call code
ochafik Jan 28, 2025
2f99236
Tool-call: do last partial parse upon limit stop
ochafik Jan 28, 2025
0a51e51
Update test-chat-handler.cpp
ochafik Jan 28, 2025
d274ffc
build: Add missing optional include for gcc
ochafik Jan 28, 2025
62d45a5
Disable slow tests where appropriate, + nits
ochafik Jan 28, 2025
ec4aeaf
Revert "Allow tool use + streaming"
ochafik Jan 28, 2025
b5a74d1
Simplify parser defs (incremental parsing for streaming will need mor…
ochafik Jan 28, 2025
ba10b47
Add missing link dep for windows build
ochafik Jan 28, 2025
cd63ba4
beef up test-chat-handler w/ delta expectations
ochafik Jan 28, 2025
cad1448
Disable test-chat-handler on win32 like the other grammar-related tests
ochafik Jan 28, 2025
4f25755
minja: sync on https://github.com/google/minja/pull/33
ochafik Jan 28, 2025
d603d06
sync: minja
ochafik Jan 28, 2025
6426391
Fix firefunction w/ jinja: requires two variables, use the chat handl…
ochafik Jan 29, 2025
4cdbb8c
Revert breaking minja change
ochafik Jan 29, 2025
47be437
Text fireworks v2 template
ochafik Jan 29, 2025
18d5a1b
nits
ochafik Jan 29, 2025
4a1e8e9
refactor test-chat-handler
ochafik Jan 29, 2025
923c805
rm dead code + nits
ochafik Jan 29, 2025
384f54a
Split bulk of tool call tests to slow lane
ochafik Jan 29, 2025
40cc3f2
Merge branch 'tool-call' of github.com:ochafik/llama.cpp into tool-call
ochafik Jan 29, 2025
41eec46
rm unused templates, rename one
ochafik Jan 29, 2025
76f6ab1
Update test_tool_call.py
ochafik Jan 29, 2025
77dd67c
tool-calls: disable crashing tests
ochafik Jan 29, 2025
0f8af53
nits
ochafik Jan 29, 2025
babdefc
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 29, 2025
682026f
Create meta-llama-Llama-3.1-8B-Instruct.jinja
ochafik Jan 29, 2025
7b5e080
Move templates/ under models/
ochafik Jan 29, 2025
ba27e98
Unify llama 3.x chat handling again (allow `{"type": "function", "nam…
ochafik Jan 29, 2025
6e676c8
sync: minja
ochafik Jan 29, 2025
ed7c622
Rename: common/chat.*, common_chat_{inputs -> params}
ochafik Jan 29, 2025
36c776f
Finish renaming of chat inputs vs. params [skip ci]
ochafik Jan 29, 2025
bc8a611
nits
ochafik Jan 29, 2025
84bc083
Remove server tests LLAMA_CACHE override (tests are serial, and the c…
ochafik Jan 29, 2025
2b24569
Add cli mode to test-chat to generate template summaries markdown
ochafik Jan 29, 2025
64545ac
Somehow /* bad inside block comments, ok fine.
ochafik Jan 29, 2025
cbecb35
Add tool call to hot topics
ochafik Jan 29, 2025
a810c37
Partial revert of LLAMA_CACHE=tmp (unless set explicitly in env)
ochafik Jan 29, 2025
77c60e6
Avoid passing tools twice in generic handler (now that minja passes t…
ochafik Jan 30, 2025
d86a1ae
Unify content + message in server_task_result_cmpl_final (+ avoid str…
ochafik Jan 30, 2025
774557c
llama 3.1: allow `{name:` & `{function:` syntax even w/ builtin tools…
ochafik Jan 30, 2025
590c979
Update tests readme + add raw output to verbose log
ochafik Jan 30, 2025
f8e14bf
split chat handler vs. parser around enum again
ochafik Jan 30, 2025
81547e6
nits
ochafik Jan 30, 2025
18450e6
debug logs are back
ochafik Jan 30, 2025
b831a6e
rm unused llama_param
ochafik Jan 30, 2025
7635912
llama 3.2 1b now fails the weather tool call?
ochafik Jan 30, 2025
9591af1
increase http timeout to 12
ochafik Jan 30, 2025
8ef37a3
Merge remote-tracking branch 'origin/master' into tool-call
ochafik Jan 30, 2025
2d51c45
code style changes on test
ngxson Jan 30, 2025
c88f4a7
simplify handle_apply_template
ngxson Jan 30, 2025
3dcde9e
Fix debug + verbose
ochafik Jan 30, 2025
06c4ca5
Update test_chat_completion.py
ochafik Jan 30, 2025
0c171f5
Update test_chat_completion.py
ochafik Jan 30, 2025
9685043
Update scripts/fetch_server_test_models.py to new compact hf_repo syn…
ochafik Jan 30, 2025
2bb3fed
nit: fix py import
ochafik Jan 30, 2025
7d59bf4
deprecate llama_sampler_init_grammar -> llama_sampler_grammar_init
ochafik Jan 30, 2025
5a64af6
add llama_sampler_init_grammar_lazy instead of renaming the non-lazy
ochafik Jan 30, 2025
f223df0
Format test-chat.cpp
ochafik Jan 30, 2025
8205246
log prompt + nits
ochafik Jan 30, 2025
5add261
test: leave model_hf_file blank
ngxson Jan 30, 2025
1029ff9
force printing </tool_call> on hermes 2 model if/as it's a special token
ochafik Jan 30, 2025
3bd6abe
try and avoid weird server test failure (spillage / parallelism betwe…
ochafik Jan 30, 2025
729d2d3
Disable chat_completion tests of non-tool jinja mode
ochafik Jan 30, 2025
34f54dd
Fix typo
ochafik Jan 30, 2025
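
Several commits above refer to lazy grammars and `grammar_triggers`. The sketch below illustrates the general idea only, under the assumption that "lazy" means output is unconstrained until a trigger string appears, after which every further piece must keep the text a valid prefix of the grammar. The class `LazyConstraint`, the `ALLOWED` list, and the prefix check are hypothetical stand-ins, not the PR's C++ implementation.

```python
class LazyConstraint:
    """Toy model of a lazily applied constraint (hypothetical, for illustration)."""

    def __init__(self, triggers, is_valid_prefix):
        self.triggers = triggers                # strings that switch the constraint on
        self.is_valid_prefix = is_valid_prefix  # stand-in for a real grammar check
        self.free_text = ""                     # unconstrained text seen so far
        self.constrained = None                 # text from the trigger onward, once active

    def accept(self, piece):
        """Return True if `piece` may be appended to the output."""
        if self.constrained is None:
            text = self.free_text + piece
            hits = [text.find(t) for t in self.triggers if t in text]
            if not hits:
                self.free_text = text           # still free-form: anything goes
                return True
            start = min(hits)                   # earliest trigger wins
            tail = text[start:]
            if not self.is_valid_prefix(tail):
                return False
            self.free_text, self.constrained = text[:start], tail
            return True
        candidate = self.constrained + piece
        if not self.is_valid_prefix(candidate):
            return False                        # reject pieces that break the grammar
        self.constrained = candidate
        return True


# Hypothetical "grammar": the only valid constrained output is this exact string.
ALLOWED = ['<tool_call>{"name":"get_weather"}</tool_call>']
sampler = LazyConstraint(
    ["<tool_call>"],
    lambda s: any(a.startswith(s) for a in ALLOWED),
)
pieces = ["Let me check. ", "<tool_", "call>", "oops", '{"name":']
results = [sampler.accept(p) for p in pieces]
```

In this toy run the trigger is split across two pieces, so the constraint only becomes active once the buffered text contains the whole trigger; the piece `"oops"` is then rejected while the JSON prefix is accepted.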
8 changes: 8 additions & 0 deletions .editorconfig
@@ -40,3 +40,11 @@ indent_style = tab
 [examples/cvector-generator/*.txt]
 trim_trailing_whitespace = unset
 insert_final_newline = unset
+
+[models/templates/*.jinja]
+indent_style = unset
+indent_size = unset
+end_of_line = unset
+charset = unset
+trim_trailing_whitespace = unset
+insert_final_newline = unset
2 changes: 1 addition & 1 deletion .github/workflows/server.yml
@@ -205,7 +205,7 @@ jobs:
 run: |
 cd examples/server/tests
 $env:PYTHONIOENCODING = ":replace"
-pytest -v -x
+pytest -v -x -m "not slow"
 - name: Slow tests
 id: server_integration_tests_slow
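
The `-m "not slow"` filter in the workflow step above deselects tests carrying a pytest marker named `slow`. A minimal sketch of how such a marker is applied, assuming the slow tests are tagged this way; both test names are hypothetical and not taken from the PR:

```python
import pytest


@pytest.mark.slow  # assumed marker name, matching the `-m "not slow"` filter
def test_tool_call_with_real_model():  # hypothetical slow test
    assert True


def test_health_endpoint():  # hypothetical fast test, still selected by the filter
    assert True
```

With this layout, `pytest -m "not slow"` would run only `test_health_endpoint`, and `pytest -m slow` only the marked test. Registering the marker in the pytest configuration avoids unknown-marker warnings.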
9 changes: 9 additions & 0 deletions Makefile
@@ -52,6 +52,7 @@ TEST_TARGETS = \
 tests/test-arg-parser \
 tests/test-autorelease \
 tests/test-backend-ops \
+tests/test-chat \
 tests/test-chat-template \
 tests/test-double-float \
 tests/test-grammar-integration \
@@ -983,6 +984,7 @@ OBJ_COMMON = \
 $(DIR_COMMON)/ngram-cache.o \
 $(DIR_COMMON)/sampling.o \
 $(DIR_COMMON)/speculative.o \
+$(DIR_COMMON)/chat.o \
 $(DIR_COMMON)/build-info.o \
 $(DIR_COMMON)/json-schema-to-grammar.o

@@ -1361,6 +1363,8 @@ llama-server: \
 examples/server/httplib.h \
 examples/server/index.html.hpp \
 examples/server/loading.html.hpp \
+common/chat.cpp \
+common/chat.hpp \
 common/chat-template.hpp \
 common/json.hpp \
 common/minja.hpp \
@@ -1471,6 +1475,11 @@ tests/test-json-schema-to-grammar: tests/test-json-schema-to-grammar.cpp \
 $(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
 $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

+tests/test-chat: tests/test-chat.cpp \
+$(OBJ_ALL)
+$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
+$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
+
 tests/test-opt: tests/test-opt.cpp \
 $(OBJ_GGML)
 $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1 change: 1 addition & 0 deletions README.md
@@ -18,6 +18,7 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)

 - **How to use [MTLResidencySet](https://developer.apple.com/documentation/metal/mtlresidencyset?language=objc) to keep the GPU memory active?** https://github.com/ggerganov/llama.cpp/pull/11427
 - **VS Code extension for FIM completions:** https://github.com/ggml-org/llama.vscode
+- Universal tool call support in `llama-server`: https://github.com/ggerganov/llama.cpp/pull/9639
 - Vim/Neovim plugin for FIM completions: https://github.com/ggml-org/llama.vim
 - Introducing GGUF-my-LoRA https://github.com/ggerganov/llama.cpp/discussions/10123
 - Hugging Face Inference Endpoints now support GGUF out of the box! https://github.com/ggerganov/llama.cpp/discussions/9669
2 changes: 2 additions & 0 deletions common/CMakeLists.txt
@@ -56,6 +56,8 @@ add_library(${TARGET} STATIC
 arg.cpp
 arg.h
 base64.hpp
+chat.cpp
+chat.hpp
 chat-template.hpp
 common.cpp
 common.h