Actions: ZX-ModelCloud/GPTQModel

Ruff Check

289 workflow runs

Run | Commit message | Commit | Pushed by | Branch | Started | Duration
#39 | Update mem.h | 3788c93 | Qubitium | zx_add_marlin_2_4 | July 25, 2024 13:56 | 21s
#38 | Update base.h | 56a4c53 | Qubitium | zx_add_marlin_2_4 | July 25, 2024 13:56 | 17s
#37 | Update LICENSE | acf6c74 | Qubitium | zx_add_marlin_2_4 | July 25, 2024 13:56 | 25s
#36 | add "gptqmodel_marlin_cuda_24" CUDAExtension | 9d297be | ZX-ModelCloud | zx_add_marlin_2_4 | July 25, 2024 12:28 | 22s
#35 | fix missing self.quantize_config.seqlen (#298) | 01e4c96 | ZX-ModelCloud | main | July 25, 2024 10:24 | 16s
#34 | QuantizeConfig add "runtime_format" field | fc406b1 | ZX-ModelCloud | zx_fix_save_quantized | July 25, 2024 02:14 | 16s
#33 | cleanup | edd7f27 | ZX-ModelCloud | zx_fix_save_quantized | July 25, 2024 01:27 | 16s
#31 | set to 40m (#295) | 19b52df | ZX-ModelCloud | main | July 25, 2024 01:16 | 17s
#30 | add unit test | c552d25 | ZX-ModelCloud | zx_fix_save_quantized | July 25, 2024 01:14 | 19s
#29 | [FIX] allow auto_round lm_head quantization (#282) | 015a76f | ZX-ModelCloud | main | July 24, 2024 01:32 | 14s
#27 | revert changes (#274) | f50b228 | ZX-ModelCloud | main | July 23, 2024 15:47 | 20s
#25 | [FEATURE] Add GPTQModel.shard_quantized() api (#271) | 88392c7 | ZX-ModelCloud | main | July 23, 2024 09:04 | 26s
#24 | Update base.py | 8810bbb | Qubitium | zx_add_shard_quantized_function | July 23, 2024 07:26 | 39s
#23 | check bitblas | 2c20b76 | ZX-ModelCloud | zx_add_shard_quantized_function | July 23, 2024 07:10 | 19s
#22 | cleanup | df8cb18 | ZX-ModelCloud | zx_add_shard_quantized_function | July 23, 2024 07:09 | 15s
#21 | cleanup | 0ed0d49 | ZX-ModelCloud | zx_add_shard_quantized_function | July 23, 2024 05:47 | 18s
#20 | cleanup | 3d45fd8 | ZX-ModelCloud | zx_add_shard_quantized_function | July 23, 2024 05:27 | 20s
#19 | add shard_quantized() | 21c5712 | ZX-ModelCloud | zx_add_shard_quantized_function | July 23, 2024 05:24 | 20s
#18 | add vllm sharded test (#267) | 5f5eae6 | ZX-ModelCloud | main | July 23, 2024 05:22 | 24s
#17 | [CI] Print Python & Cuda env (#263) | 7f84ba8 | ZX-ModelCloud | main | July 23, 2024 02:44 | 21s
#16 | fix flashinfer is optional depend (#262) | 1b95374 | ZX-ModelCloud | main | July 23, 2024 01:13 | 22s
#15 | [FIX] [MISC] Update test (#177) | 399691e | ZX-ModelCloud | main | July 7, 2024 01:34 | 18s
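
All of the runs above belong to a single lint workflow named "Ruff Check", triggered by pushes to main and the feature branches listed. The repository's actual workflow file is not shown on this page, so the following is only a minimal sketch of what a push-triggered Ruff lint job commonly looks like; the file contents, action versions, Python version, and steps are assumptions, not the project's real configuration.

```yaml
# Hypothetical sketch of a push-triggered Ruff lint workflow.
# NOT the actual .github/workflows file of ZX-ModelCloud/GPTQModel;
# triggers, versions, and steps are illustrative assumptions.
name: Ruff Check

on:
  push:          # the runs listed on this page were triggered by pushes
  pull_request:

jobs:
  ruff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4       # fetch the pushed commit
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install ruff           # install the Ruff linter
      - run: ruff check .               # lint the whole repository
```

The short durations in the list (roughly 14-39 seconds) are consistent with a lightweight lint-only job of this kind.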