Releases: NeoZhangJianyu/llama.cpp

b4393

27 Dec 02:24
d79d8f3
vulkan: multi-row k quants (#10846)

* multi row k quant shaders!

* better row selection

* more row choices

* readjust row selection

* rm_kq=2 by default
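
A minimal sketch of the idea behind these bullets: the k-quant mat-vec shaders can process several matrix rows per workgroup, with the row count chosen from the problem shape. The function name, thresholds, and selection rules below are hypothetical; only the rm_kq=2 default comes from the last bullet.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical row-count selection for a multi-row k-quant mat-vec
// dispatch. More rows per workgroup amortizes per-row setup; fewer
// rows keeps occupancy up on wide matrices. The thresholds here are
// illustrative, not the values shipped in the PR.
static uint32_t pick_rows_per_workgroup(uint32_t nrows, uint32_t ncols) {
    uint32_t rm_kq = 2;        // default, per the last bullet above
    if (ncols >= 8192) {
        rm_kq = 1;             // very wide rows: one row per group
    } else if (nrows >= 4096) {
        rm_kq = 4;             // many short rows: batch more of them
    }
    return rm_kq;
}

int main() {
    printf("rows per workgroup: %u\n", pick_rows_per_workgroup(4096, 4096));
}
```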

b4367

20 Dec 08:33
d408bb9
clip : disable GPU support (#10896)

ggml-ci

b4176

26 Nov 05:29
05f9de9
Merge pull request #4 from NeoZhangJianyu/correct_win_release

restore the condition to build & update the package on merge

b4174

26 Nov 03:50
0eb4e12
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)

vulkan-shaders-gen was not parsing the --no-clean argument correctly: the
previous code only handled arguments that take a value, and --no-clean takes
none, so it was skipped. This commit adds correct parsing for value-less
arguments.
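
A self-contained sketch of the bug class and the fix, assuming a simple hand-rolled parser. The names below are illustrative, not the actual vulkan-shaders-gen code.

```cpp
#include <map>
#include <string>

// Before the fix, a loop like this assumed every "--name" was
// followed by a value, so a bare flag such as --no-clean either
// swallowed the next argument or was dropped. Treating known
// flags separately parses both forms correctly.
static std::map<std::string, std::string> parse_args(int argc, char ** argv) {
    std::map<std::string, std::string> args;
    for (int i = 1; i < argc; i++) {
        std::string name = argv[i];
        if (name == "--no-clean") {
            args[name] = "";         // value-less flag: nothing to consume
        } else if (i + 1 < argc) {
            args[name] = argv[++i];  // valued option: consume its value
        }
    }
    return args;
}
```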

b4164

26 Nov 03:44
8965d05
Merge pull request #3 from NeoZhangJianyu/fix_win_package

fix build package for 2025.0

b4158

25 Nov 04:08
cce5a90
flake.lock: Update (#10470)

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/5e4fbfb6b3de1aa2872b76d49fafc942626e2add?narHash=sha256-OZiZ3m8SCMfh3B6bfGC/Bm4x3qc1m2SVEAlkV6iY7Yg%3D' (2024-11-15)
  → 'github:NixOS/nixpkgs/23e89b7da85c3640bbc2173fe04f4bd114342367?narHash=sha256-y/MEyuJ5oBWrWAic/14LaIr/u5E0wRVzyYsouYY3W6w%3D' (2024-11-19)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

b4145

21 Nov 04:33
9abe9ee
vulkan: predicate max operation in soft_max shaders/soft_max (#10437)

Fixes #10434

b4127

19 Nov 01:49
557924f
sycl: Revert MUL_MAT_OP support changes (#10385)

b3943

20 Oct 03:28
cda0e4b
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr
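
A minimal sketch of the batching pattern these bullets describe: allocate one llama_batch up front, fill it with common_batch_add (the common-library helper the bullets name), and free it before returning. Context setup and error handling are omitted.

```cpp
#include "common.h"
#include "llama.h"

#include <vector>

// Reuse one batch instead of rebuilding it per token, as the
// bullets describe; request logits only for the last token.
static void decode_prompt(llama_context * ctx, const std::vector<llama_token> & tokens) {
    llama_batch batch = llama_batch_init((int32_t) tokens.size(), /*embd*/ 0, /*n_seq_max*/ 1);

    common_batch_clear(batch);
    for (size_t i = 0; i < tokens.size(); i++) {
        common_batch_add(batch, tokens[i], (llama_pos) i, { 0 }, i + 1 == tokens.size());
    }

    llama_decode(ctx, batch);
    llama_batch_free(batch); // "free batch before return"
}
```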

b3942

18 Oct 14:25
afd9909
rpc : backend refactoring (#9912)

* rpc : refactor backend

Use structs for RPC request/response messages

* rpc : refactor server
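
A sketch of the direction this refactor describes: fixed-layout structs for request/response messages instead of hand-packed byte buffers. The message and field names below are illustrative, not the actual ggml RPC wire format.

```cpp
#include <cstdint>

// Illustrative request/response pair for a remote buffer allocation.
// A struct per message makes (de)serialization a single copy of a
// known size instead of ad-hoc offset arithmetic.
struct rpc_msg_alloc_buffer_req {
    uint64_t size;        // bytes requested on the remote backend
};

struct rpc_msg_alloc_buffer_rsp {
    uint64_t remote_ptr;  // server-side handle to the new buffer
    uint64_t remote_size; // actual size allocated
};
```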