Releases: NeoZhangJianyu/llama.cpp

b4393

27 Dec 02:24
d79d8f3
vulkan: multi-row k quants (#10846)

* multi row k quant shaders!

* better row selection

* more row choices

* readjust row selection

* rm_kq=2 by default
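
A minimal sketch of the idea behind these bullets: the k-quant mat-vec shaders can process several matrix rows per workgroup, with the row count chosen from the problem shape. The function name, thresholds, and selection rules below are hypothetical; only the rm_kq=2 default comes from the last bullet.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical row-count selection for a multi-row k-quant mat-vec
// dispatch. More rows per workgroup amortizes per-row setup; fewer
// rows keeps occupancy up on wide matrices. The thresholds here are
// illustrative, not the values shipped in the PR.
static uint32_t pick_rows_per_workgroup(uint32_t nrows, uint32_t ncols) {
    uint32_t rm_kq = 2;        // default, per the last bullet above
    if (ncols >= 8192) {
        rm_kq = 1;             // very wide rows: one row per group
    } else if (nrows >= 4096) {
        rm_kq = 4;             // many short rows: batch more of them
    }
    return rm_kq;
}

int main() {
    printf("rows per workgroup: %u\n", pick_rows_per_workgroup(4096, 4096));
}
```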

b4367

20 Dec 08:33
d408bb9
clip : disable GPU support (#10896)

ggml-ci

b4176

26 Nov 05:29
05f9de9
Merge pull request #4 from NeoZhangJianyu/correct_win_release

restore the condition to build & update the package on merge

b4174

26 Nov 03:50
0eb4e12
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)

vulkan-shaders-gen was not parsing the --no-clean argument correctly: the
previous code only handled arguments that take a value, and --no-clean takes
none, so it was skipped. This commit adds correct parsing for value-less
arguments.
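
A self-contained sketch of the bug class and the fix, assuming a simple hand-rolled parser. The names below are illustrative, not the actual vulkan-shaders-gen code.

```cpp
#include <map>
#include <string>

// Before the fix, a loop like this assumed every "--name" was
// followed by a value, so a bare flag such as --no-clean either
// swallowed the next argument or was dropped. Treating known
// flags separately parses both forms correctly.
static std::map<std::string, std::string> parse_args(int argc, char ** argv) {
    std::map<std::string, std::string> args;
    for (int i = 1; i < argc; i++) {
        std::string name = argv[i];
        if (name == "--no-clean") {
            args[name] = "";         // value-less flag: nothing to consume
        } else if (i + 1 < argc) {
            args[name] = argv[++i];  // valued option: consume its value
        }
    }
    return args;
}
```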

b4164

26 Nov 03:44
8965d05
Merge pull request #3 from NeoZhangJianyu/fix_win_package

fix build package for 2025.0

b4158

25 Nov 04:08
cce5a90
flake.lock: Update (#10470)

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/5e4fbfb6b3de1aa2872b76d49fafc942626e2add?narHash=sha256-OZiZ3m8SCMfh3B6bfGC/Bm4x3qc1m2SVEAlkV6iY7Yg%3D' (2024-11-15)
  → 'github:NixOS/nixpkgs/23e89b7da85c3640bbc2173fe04f4bd114342367?narHash=sha256-y/MEyuJ5oBWrWAic/14LaIr/u5E0wRVzyYsouYY3W6w%3D' (2024-11-19)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

b4145

21 Nov 04:33
9abe9ee
vulkan: predicate max operation in soft_max shaders/soft_max (#10437)

Fixes #10434

b4127

19 Nov 01:49
557924f
sycl: Revert MUL_MAT_OP support changes (#10385)

b3943

20 Oct 03:28
cda0e4b
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)

* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr
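
A minimal sketch of the batching pattern these bullets describe: allocate one llama_batch up front, fill it with common_batch_add (the common-library helper the bullets name), and free it before returning. Context setup and error handling are omitted.

```cpp
#include "common.h"
#include "llama.h"

#include <vector>

// Reuse one batch instead of rebuilding it per token, as the
// bullets describe; request logits only for the last token.
static void decode_prompt(llama_context * ctx, const std::vector<llama_token> & tokens) {
    llama_batch batch = llama_batch_init((int32_t) tokens.size(), /*embd*/ 0, /*n_seq_max*/ 1);

    common_batch_clear(batch);
    for (size_t i = 0; i < tokens.size(); i++) {
        common_batch_add(batch, tokens[i], (llama_pos) i, { 0 }, i + 1 == tokens.size());
    }

    llama_decode(ctx, batch);
    llama_batch_free(batch); // "free batch before return"
}
```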

b3942

18 Oct 14:25
afd9909
rpc : backend refactoring (#9912)

* rpc : refactor backend

Use structs for RPC request/response messages

* rpc : refactor server
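
A sketch of the direction this refactor describes: fixed-layout structs for request/response messages instead of hand-packed byte buffers. The message and field names below are illustrative, not the actual ggml RPC wire format.

```cpp
#include <cstdint>

// Illustrative request/response pair for a remote buffer allocation.
// A struct per message makes (de)serialization a single copy of a
// known size instead of ad-hoc offset arithmetic.
struct rpc_msg_alloc_buffer_req {
    uint64_t size;        // bytes requested on the remote backend
};

struct rpc_msg_alloc_buffer_rsp {
    uint64_t remote_ptr;  // server-side handle to the new buffer
    uint64_t remote_size; // actual size allocated
};
```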