[auto] Sync version 2403021812.0.0+llamacpp-release.b2316 · KerfuffleV2/ggml-sys-bleedingedge@d43d8b4

Commit

[auto] Sync version 2403021812.0.0+llamacpp-release.b2316

== Relevant log messages from source repo:

commit bbde6eb2561153aabbdfac5001c690fe00cad639
Author: Kawrakow <48489457+ikawrakow@users.noreply.github.com>
Date:   Sat Mar 2 17:00:51 2024 +0200

    ggml : IQ3_S improvements (#5829)

    * iq3_s: somewhat faster AVX2 dot product

    On a Ryzen 7950X, TG-128 increases to 16 t/s from 15.5 t/s using
    16 threads. For 8 threads it is 13.85 t/s vs 11.75 t/s.
    PP-512 increases to 28.5 t/s from 23.8 t/s.

    * iq3_s: somewhat faster ARM_NEON dot product

    Still dog slow - 10.7 t/s up from 9.9 t/s.

    * iq3_s: another small ARM_NEON improvement

    10.7 -> 11.0 t/s. Using vmulq_s8 is faster than the xor - sub trick
    that works best on AVX2.

    * iq3_s: minor improvement on Metal

    49.4 t/s -> 50.3 t/s

    * iq3_s: PPL improvement

    E.g., for a context of 4096 LLaMA-v2-7B goes to 5.1340 from 5.1653.

    * iq3_s: use new grid everywhere

    * Fix ARM_NEON

    ---------

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
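
The "xor - sub trick" mentioned in the log above can be illustrated in scalar form. This is only a sketch of the arithmetic identity, not the SIMD kernels from the commit: the function names are hypothetical, and the real code operates on whole vectors (the log names `vmulq_s8` for the ARM_NEON multiply variant). With a mask of 0 or -1, `(x ^ mask) - mask` leaves `x` unchanged or negates it; multiplying by +1 or -1 gives the same result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: negate x where mask == -1 (all bits set),
// leave it unchanged where mask == 0.
// Two's complement identity: (x ^ -1) - (-1) == ~x + 1 == -x.
inline std::int8_t apply_sign_xor_sub(std::int8_t x, std::int8_t mask) {
    return static_cast<std::int8_t>((x ^ mask) - mask);
}

// Hypothetical helper: same effect using an explicit sign of +1 or -1,
// which is the scalar analogue of a lane-wise multiply.
inline std::int8_t apply_sign_mul(std::int8_t x, std::int8_t sign) {
    return static_cast<std::int8_t>(x * sign);
}

// Checks that both variants agree for every value in [-127, 127].
// -128 is skipped because its negation does not fit in int8_t.
inline bool sign_variants_agree() {
    for (int v = -127; v <= 127; ++v) {
        const std::int8_t x = static_cast<std::int8_t>(v);
        if (apply_sign_xor_sub(x, 0)  != apply_sign_mul(x, 1))  return false;
        if (apply_sign_xor_sub(x, -1) != apply_sign_mul(x, -1)) return false;
    }
    return true;
}
```

Which variant is faster depends on the instruction set, which is the point the log makes: the xor and subtract pair works best on AVX2, while the multiply was measured as faster on ARM_NEON.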

commit 6c32d8c7ad8ba7b6ad2a162e929a21dd04fcdca0
Author: Xuan Son Nguyen <thichthat@gmail.com>
Date:   Sat Mar 2 15:19:09 2024 +0100

    llama : refactor internal quantization functions (#5830)

commit 802da0091ba646ecf02e1a8fae2da0b8e76409bd
Author: compilade <113953597+compilade@users.noreply.github.com>
Date:   Sat Mar 2 08:42:56 2024 -0500

    llama : fix segfault from unknown model arch name (#5820)

    * llama : fix segfault from unknown model arch name

    * llama : make all LLM maps const

    This also requires using `std::map::at` instead of its `operator[]`
    which does not exist for const maps.

    * llama : name LLM_ARCH_UNKNOWN to "(unknown)"

    This avoids errors from `std::map::at` when
    getting the general name of the model architecture.
    Using "(unknown)" instead of an empty string as per suggestion
    ggerganov/llama.cpp#5820 (comment)

    * llama : remove redundant inner const for LLM_TENSOR_NAMES

    The extra const won't do anything here as const maps
    return const references to values.

    Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

    * llama : remove redundant nullptr check in llm_arch_from_string

    Since LLM_ARCH_NAMES is a const map, no spurious elements
    with a NULL name are inserted anymore, so this check is dead code.

    ---------

    Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
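
The `std::map::at` versus `operator[]` point in the log above can be sketched as follows. The enum, map, and function names here are hypothetical stand-ins, not the actual llama.cpp definitions. On a non-const map, `operator[]` inserts a default-constructed value (a null pointer here) for a missing key, which is how spurious entries can appear; a const map has no `operator[]`, so lookups go through `at`, which throws `std::out_of_range` for a missing key instead of inserting.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical architecture enum for illustration only.
enum example_arch { EXAMPLE_ARCH_LLAMA, EXAMPLE_ARCH_UNKNOWN };

// A const map: operator[] is unavailable, so no entry can be
// inserted by accident during a lookup.
static const std::map<example_arch, const char *> EXAMPLE_ARCH_NAMES = {
    { EXAMPLE_ARCH_LLAMA,   "llama"     },
    { EXAMPLE_ARCH_UNKNOWN, "(unknown)" },
};

// Forward lookup. Because every enum value, including the unknown one,
// has a name in the map, at() does not throw here.
inline const char * example_arch_name(example_arch arch) {
    return EXAMPLE_ARCH_NAMES.at(arch);
}

// Reverse lookup. Unrecognized names map to EXAMPLE_ARCH_UNKNOWN.
// No null check on kv.second is needed: the map is const, so it only
// ever holds the entries listed above.
inline example_arch example_arch_from_string(const std::string & name) {
    for (const auto & kv : EXAMPLE_ARCH_NAMES) {
        if (kv.second == name) {
            return kv.first;
        }
    }
    return EXAMPLE_ARCH_UNKNOWN;
}
```

Giving the unknown value an explicit name is what keeps `at` from throwing when the general architecture name is requested for an unrecognized model.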

commit 715641391dda1ff9762dc5d99d9a30acce99f2c6
Author: Neo Zhang Jianyu <jianyu.zhang@intel.com>
Date:   Sat Mar 2 19:49:30 2024 +0800

    Support multiple GPUs (split mode) on SYCL backend (#5806)

    * support multiple cards: split-mode - layer|row

    * rm warning

    * rebase with master, support two new OPs, close feature for -sm=row, fix for unit test

    * update news

    * fix merge error

    * update according to review comments

github-actions committed Mar 2, 2024

1 parent 7a6249a commit d43d8b4

Cargo.toml

            
    @@ -1,6 +1,6 @@
     [package]
     name = "ggml-sys-bleedingedge"
    -version = "2403020043.0.0+llamacpp-release.b2308"
    +version = "2403021812.0.0+llamacpp-release.b2316"
     description = "Bleeding edge low-level bindings to GGML. "
     repository = "https://github.com/KerfuffleV2/ggml-sys-bleedingedge"
     keywords = ["deep-learning", "machine-learning", "tensors", "ggml", "ml"]

VERSION.txt

            
    @@ -1 +1 @@
    -2403020043.0.0+llamacpp-release.b2308
    +2403021812.0.0+llamacpp-release.b2316

ggml-tag-current.txt

    @@ -1 +1 @@
    -b2308
    +b2316

ggml-tag-previous.txt

    @@ -1 +1 @@
    -b2303
    +b2308
