Releases: ggml-org/llama.cpp

b6980

08 Nov 00:10
299f5d7

CUDA: properly handle nb00=nb02 case for cpy (#17081)

b6979

07 Nov 22:57
ac76d36

vulkan : refactor buffer handling in vk_op_f32 (#16840)

* vulkan : refactor/simplify buffer handling in vk_op_* functions

* Combine UMA handling into ggml_vk_tensor_subbuffer

b6978

07 Nov 22:50
6515610

CUDA: fix should_use_mmvf for ne11 == 1 (#17085)

* CUDA: fix should_use_mmvf for ne11 == 1

* Apply suggestion from @am17an

Co-authored-by: Aman Gupta <[email protected]>

b6977

07 Nov 22:35
7956bb4

bench : cache the llama_context state at computed depth (#16944)

* bench : cache llama_context state at depth

* cont : handle failures to restore the old state

* cont : print information when the state is being reused

b6976

07 Nov 22:25
9008027

hparams : add n_embd_inp() to support extended embed (#16928)

* add n_embd_full to support extended embed

* don't change output

* rename to n_embd_inp

* restore n_embd where applicable

b6975

07 Nov 21:13
16bcc12

kv-cache : pad the cache size to 256 for performance (#17046)

* kv-cache : pad the size of the small SWA cache for performance

* context : pad the total context to 256

* cont : future-proof the swa pad

* server : adjust test params to new logic
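
As a rough illustration of what padding a size to 256 means, the sketch below rounds a requested size up to the next multiple of a pad value. The helper name pad_to_multiple is hypothetical and is not taken from the llama.cpp sources; the actual change may compute the padded size differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not llama.cpp code):
// round `size` up to the nearest multiple of `pad`. `pad` must be non-zero.
inline uint32_t pad_to_multiple(uint32_t size, uint32_t pad) {
    assert(pad > 0);
    // Adding pad - 1 before the integer division rounds up instead of down.
    return ((size + pad - 1) / pad) * pad;
}
```

Under this sketch, a requested size of 1000 cells becomes 1024, while a size that is already a multiple of 256 is left unchanged.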

b6974

07 Nov 19:00
9eb9a13

Revert "ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239…

b6973

07 Nov 18:31
7c23f3f

ggml-cpu: detect correct cpu flags for arm64 (#16229) (#16239)

With GCC 9 and GCC 12 on arm64 Ubuntu 20.04, the command
"gcc -mcpu=native -E -v -" fails to detect the correct CPU flags,
which causes compilation failures for certain extended instructions.
The correct CPU flags can be obtained by using gcc -march instead.

Signed-off-by: lizhenneng <[email protected]>
Co-authored-by: lizhenneng <[email protected]>

b6972

07 Nov 11:13
8c0d6bb

server : print the samplers chain for each request (#17070)

b6971

07 Nov 11:11
5c9a18e

common: move download functions to download.(cpp|h) (#17059)

* common: move download functions to download.(cpp|h)

* rm unused includes

* minor cleanup

---------

Co-authored-by: Georgi Gerganov <[email protected]>