
Releases: NeoZhangJianyu/llama.cpp

b2437

15 Mar 11:54
46acb36
fix error when setting the main GPU (#6073)

b2431

15 Mar 06:37
4755afd
llama : fix integer overflow during quantization (#6063)

b2409

13 Mar 03:03
306d34b
ci : remove tidy-review (#6021)

b2408

12 Mar 13:07
8030da7
ggml : reuse quantum structs across backends (#5943)

* ggml : reuse quant blocks across backends

ggml-ci

* ggml : define helper constants only for CUDA and SYCL

ggml-ci

* ggml : define helper quantum constants for SYCL

ggml-ci

b2407

12 Mar 12:19
184215e
ggml : fix UB in IQ2_S and IQ3_S (#6012)

b2405

12 Mar 04:12
5cdb371
grammar : fix unnecessarily retained pointer to rules (#6003)

b2351

06 Mar 02:44
652ca2b
compare-llama-bench.py : remove mul_mat_q (#5892)

b2343

05 Mar 06:25
29eee40
fix speculative decoding build on Windows (#5874)

b2312

02 Mar 12:14
7156413
Support multiple GPUs (split mode) on SYCL backend (#5806)

* support multiple cards: split-mode = layer|row

* rm warning

* rebase onto master, support two new ops, disable the -sm=row feature, fix unit test

* update news

* fix merge error

* update according to review comments

b2282

28 Feb 03:50
cb49e0f
Attempt to fix Android build (#5752)

Co-authored-by: Iwan Kawrakow <[email protected]>