
merge from upstream #22

Merged: 84 commits, Jun 5, 2024

Commits (84):
1d8fca7  metal : add GGML_OP_REPEAT kernels (#7557)  [ggerganov, May 27, 2024]
5487593  Add freq factors (#7495)  [AidanBeltonS, May 27, 2024]
95f84d5  Fix q_xxs using mul_mat_q (#7459)  [AidanBeltonS, May 27, 2024]
197c006  Allow multiple copy function pointers for CUDA graph kernel param upd…  [agray3, May 27, 2024]
10b1e45  make: add --device-debug to NVCC debug flags (#7542)  [JohannesGaessler, May 27, 2024]
0136966  adding in x64 targets to cmake presets (#7574)  [kunnis, May 27, 2024]
852aafb  update HIP_UMA #7399 (#7414)  [Djip007, May 27, 2024]
74b239b  llava : update clip.h (#7580)  [eltociear, May 28, 2024]
c417671  Markdownish code block fix (#7571)  [nathan-sixnines, May 28, 2024]
9335b96  server: do not remove whitespace at the start of a completion chunk (…  [mgroeber9110, May 28, 2024]
0548a41  ggml : generalize GGML_OP_CONCAT (#7563)  [ggerganov, May 28, 2024]
e2b0650  [SYCL]fix ggml_sycl_mul_mat_id() to match the change of api (#7436)  [arthw, May 28, 2024]
271ff3f  github: add refactor to issue template (#7561)  [mofosyne, May 28, 2024]
8b99e2a  llama : handle unknown utf8 bytes (#7588)  [ggerganov, May 28, 2024]
edc2943  tests : fix test-tokenizer-0.sh  [ggerganov, May 28, 2024]
ee3dff6  Add support for DeepseekV2ForCausalLM (#7519)  [fairydreaming, May 28, 2024]
2b737ca  rpc : resource management rework (#7562)  [rgerganov, May 28, 2024]
56411a9  vulkan: properly initialize vulkan devices for LLAMA_SPLIT_MODE_NONE …  [Adriankhl, May 28, 2024]
5442939  llama : support small Granite models (#7481)  [giuseppe, May 28, 2024]
6bd12ce  sycl : fix assert (#7563)  [ggerganov, May 28, 2024]
02c1eca  Tokenizer WPM fixes (#7500)  [jaime-m-p, May 28, 2024]
b864b50  [SYCL] Align GEMM dispatch (#7566)  [airMeng, May 28, 2024]
504f0c3  ggml : fix typo in ggml.c (#7603)  [zhouwg, May 29, 2024]
0e8d8bf  Add Arc A750 and Arch linux to readme-sycl.md as verified GPU model a…  [May 29, 2024]
72de268  ggml : restore ggml_rope_xpos_inplace (ggml/0)  [ggerganov, May 26, 2024]
2ab9772  sync : ggml  [ggerganov, May 29, 2024]
00281b7  scripts : remove mpi remnants  [ggerganov, May 29, 2024]
87bdf2a  ggml : use atomic_flag for critical section (#7598)  [slaren, May 29, 2024]
210d991  llama-bench : add support for the RPC backend (#7435)  [rgerganov, May 29, 2024]
cce3dcf  cuda : non-cont concat support (#7610)  [ggerganov, May 29, 2024]
fb76ec3  ggml : fix YARN + add tests + add asserts (#7617)  [ggerganov, May 29, 2024]
975ec63  metal : add missing asserts (#7617)  [ggerganov, May 29, 2024]
55d6226  metal : remove invalid asserts (#7617)  [ggerganov, May 29, 2024]
eb57fee  gguf-py : Add tokenizer.ggml.pre to gguf-new-metadata.py (#7627)  [Galunid, May 30, 2024]
3854c9d  [SYCL] fix intel docker (#7630)  [airMeng, May 30, 2024]
972b555  README: explain parallel build [no ci] (#7618)  [JohannesGaessler, May 30, 2024]
d5c0582  ggml : fix loongarch build (O2 issue) (#7636)  [junchao-loongson, May 30, 2024]
59b0d07  faster avx512 exp implementation (#7551)  [chriselrod, May 30, 2024]
9c4c9cc  Move convert.py to examples/convert-legacy-llama.py (#7430)  [Galunid, May 30, 2024]
e6157f9  github: add contact links to issues and convert question into researc…  [mofosyne, May 30, 2024]
7846540  readme : add Conan badge (#7638)  [MartinDelille, May 30, 2024]
2e2340d  Add brew installation instruction to README [no ci] (#7616)  [makuche, May 30, 2024]
5dcdf94  Fix conan badge display [no ci] (#7645)  [MartinDelille, May 30, 2024]
5921b8f  llama : cache llama_token_to_piece (#7587)  [ggerganov, May 30, 2024]
9022c33  Fixed painfully slow single process builds. (#7326)  [jboero, May 30, 2024]
0541f06  [no ci] docs: add aikit to readme (#7650)  [sozercan, May 30, 2024]
1af511f  Add convert.py removal to hot topics (#7662)  [Galunid, May 31, 2024]
2e32f87  Somehow '**' got lost (#7663)  [Galunid, May 31, 2024]
0c27e6f  ggml : fix loongson compile warnings (#7537)  [ggerganov, May 31, 2024]
16926df  readme : link homebrew discussion  [ggerganov, May 31, 2024]
30e238b  Improve HIP compatibility (#7672)  [daniandtheweb, May 31, 2024]
c8047d5  scripts: update compare_llama_bench.py [no ci] (#7673)  [JohannesGaessler, May 31, 2024]
0515ad9  convert-hf : Handle NotImplementedError in convert-hf-to-gguf (#7660)  [Galunid, May 31, 2024]
a323ec6  server : update js (#7670)  [ggerganov, May 31, 2024]
9b59641  CUDA: quantized KV support for FA vec (#7527)  [JohannesGaessler, Jun 1, 2024]
750f60c  CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)  [JohannesGaessler, Jun 1, 2024]
2ac95c9  SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, S…  [hanishkvc, Jun 1, 2024]
2e66683  server : new UI (#7633)  [mounta11n, Jun 1, 2024]
e141ce6  Fix FlashAttention debug test, FP32 assert (#7684)  [JohannesGaessler, Jun 1, 2024]
9422c5e  [SYCL] Update rpc-server.cpp to include SYCL backend (#7682)  [nickp27, Jun 2, 2024]
7c4e5b7  chore : add ignore rule for generated server themes (#7689)  [teleprint-me, Jun 2, 2024]
1669810  flake.lock: Update (#7686)  [ggerganov, Jun 2, 2024]
3413ae2  fix bug introduced in using calloc (#7701)  [airlied, Jun 2, 2024]
9e405b6  kompute : implement op_getrows_f32 (#6403)  [woachk, Jun 3, 2024]
549279d  llama : avoid double token-to-piece cache (#7654)  [ggerganov, Jun 3, 2024]
6f28a33  llama : MiniCPM support tied embeddings (#7664)  [zkh2016, Jun 3, 2024]
a10cda5  cmake : add pkg-config spec file for llama.cpp (#7702)  [andy-tai, Jun 3, 2024]
3d7ebf6  Vulkan Mixture of Experts (MoE) support (#7628)  [0cc4m, Jun 3, 2024]
0b832d5  make: fix debug options not being applied to NVCC (#7714)  [JohannesGaessler, Jun 3, 2024]
a5735e4  ggml : use OpenMP as a thread pool (#7606)  [msy-kato, Jun 3, 2024]
bde7cd3  llama : offload to RPC in addition to other backends (#7640)  [rgerganov, Jun 3, 2024]
6d16169  ggml : prevent builds with -ffinite-math-only (#7726)  [ggerganov, Jun 4, 2024]
3b38d48  Per token attributes (#7685)  [jaime-m-p, Jun 4, 2024]
b226c12  refine .gitignore (#7688)  [zhouwg, Jun 4, 2024]
987d743  Improve hipBLAS support in CMake (#7696)  [daniandtheweb, Jun 4, 2024]
adc9ff3  llama-bench : allow using a different printer for stderr with -oe (#7…  [slaren, Jun 4, 2024]
5ca0944  readme : remove obsolete Zig instructions (#7471)  [ggerganov, Jun 4, 2024]
0cd6bd3  llama : remove beam search (#7736)  [ggerganov, Jun 4, 2024]
554c247  ggml : remove OpenCL (#7735)  [ggerganov, Jun 4, 2024]
1442677  common : refactor cli arg parsing (#7675)  [ggerganov, Jun 4, 2024]
b90dc56  Allow number of nodes in CUDA graph to change (#7738)  [agray3, Jun 4, 2024]
c90dbe0  Fix per token atrributes bits (#7749)  [jaime-m-p, Jun 4, 2024]
9973e81  readme : remove -ins (#7759)  [arch-btw, Jun 5, 2024]
0ac83d0  Merge branch 'layla-build' into merge  [l3utterfly, Jun 5, 2024]
Commit 2ab977282b02ccd6783fbbaec393c96886cf33b1
sync : ggml
ggerganov committed May 29, 2024 (verified: created on GitHub.com and signed with GitHub's verified signature)
scripts/sync-ggml.last (2 changes: 1 addition & 1 deletion)
@@ -1 +1 @@
-126d34985705a5a2222723c145cb4e125ac689f3
+2aae01fd9b8f9399f343cf18f46f38996ef52e2c
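For context, `scripts/sync-ggml.last` records the hash of the ggml commit that the tree was last synchronized to, so a sync run can enumerate ggml commits newer than that hash. A minimal sketch of the idea, using a throwaway toy repository rather than the actual sync-ggml.sh logic (all paths and commit messages here are made up for illustration):

```shell
#!/bin/sh
# Sketch: how a "last synced commit" file can drive an incremental sync.
# The temp repo below stands in for a ggml checkout; this is NOT the
# real sync-ggml.sh, just the git mechanism it relies on.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.email=x@example.com -c user.name=x \
    commit -q --allow-empty -m "already synced work"
last=$(git rev-parse HEAD)          # hash recorded at the previous sync
git -c user.email=x@example.com -c user.name=x \
    commit -q --allow-empty -m "new upstream work"

echo "$last" > sync-ggml.last       # analogous to scripts/sync-ggml.last

# List only the commits that landed since the recorded hash:
git log --oneline "$(cat sync-ggml.last)..HEAD"
```

After a successful sync, the file is rewritten with the new HEAD hash, which is exactly what the one-line diff above shows.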