
Releases: NeoZhangJianyu/llama.cpp

b2437

15 Mar 11:54
46acb36
fix error when setting the main GPU (#6073)

b2431

15 Mar 06:37
4755afd
llama : fix integer overflow during quantization (#6063)

b2409

13 Mar 03:03
306d34b
ci : remove tidy-review (#6021)

b2408

12 Mar 13:07
8030da7
ggml : reuse quantum structs across backends (#5943)

* ggml : reuse quant blocks across backends

ggml-ci

* ggml : define helper constants only for CUDA and SYCL

ggml-ci

* ggml : define helper quantum constants for SYCL

ggml-ci

b2407

12 Mar 12:19
184215e
ggml : fix UB in IQ2_S and IQ3_S (#6012)

b2405

12 Mar 04:12
5cdb371
grammar : fix unnecessarily retained pointer to rules (#6003)

b2351

06 Mar 02:44
652ca2b
compare-llama-bench.py : remove mul_mat_q (#5892)

b2343

05 Mar 06:25
29eee40
fix speculative decoding build on Windows (#5874)

b2312

02 Mar 12:14
7156413
Support multiple GPUs (split mode) on SYCL backend (#5806)

* support multiple cards: split-mode = layer|row

* rm warning

* rebase onto master, support two new ops, disable the -sm=row feature, fix unit test

* update news

* fix merge error

* update according to review comments

b2282

28 Feb 03:50
cb49e0f
Attempt to fix Android build (#5752)

Co-authored-by: Iwan Kawrakow <[email protected]>