Releases · NeoZhangJianyu/llama.cpp
b2437
Fix error when setting the main GPU (#6073)
b2431
llama : fix integer overflow during quantization (#6063)
b2409
ci : remove tidy-review (#6021)
b2408
ggml : reuse quantum structs across backends (#5943)
* ggml : reuse quant blocks across backends
* ggml : define helper constants only for CUDA and SYCL
* ggml : define helper quantum constants for SYCL
b2407
ggml : fix UB in IQ2_S and IQ3_S (#6012)
b2405
grammar : fix unnecessarily retained pointer to rules (#6003)
b2351
compare-llama-bench.py : remove mul_mat_q (#5892)
b2343
Fix speculative decoding build on Windows (#5874)
b2312
Support multiple GPUs (split mode) on SYCL backend (#5806)
* support multiple cards: split-mode - layer|row
* remove warning
* rebase with master, support two new OPs, close feature for -sm=row, fix unit test
* update news
* fix merge error
* update according to review comments
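A minimal usage sketch for the split-mode feature above, assuming the stock llama.cpp `main` example of that period and its `-sm`/`--split-mode`, `-mg`/`--main-gpu`, and `-ngl` flags; the model path and prompt are placeholders, not part of this release.

```
# Split model layers across all visible GPUs (layer split mode).
./main -m ./models/model.gguf -ngl 99 -sm layer -p "Hello" -n 32

# Keep everything on a single GPU, selected with --main-gpu.
./main -m ./models/model.gguf -ngl 99 -sm none -mg 0 -p "Hello" -n 32
```

Note that, per the commit notes above, `-sm row` was closed (not yet supported) on the SYCL backend at this tag, so only the layer and none split modes apply here.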
b2282
Attempt to fix Android build (#5752)
Co-authored-by: Iwan Kawrakow <[email protected]>