Model | Inference architecture | Size | Quantized link |
---|---|---|---|
Llama2 | LLaMA | 7B | huggingface |
OpenLLaMA v2 | LLaMA | 3B | huggingface |
ORCA | LLaMA | 3B | huggingface |
ORCA v3 | LLaMA | 7B | huggingface |
Marx v2 | LLaMA | 3B | huggingface |
RWKV 4 Raven | RWKV | 3B | huggingface |
StableLM-3B-4E1T | GPT-NeoX | 3B | huggingface |
Pythia | GPT-NeoX | 2.8B | huggingface |
Cerebras | GPT-2 | 1.3B | huggingface |
MagicPrompt Stable Diffusion | GPT-2 | 111M | huggingface |
Replit | Replit | 3B | huggingface |
Santacoder | Starcoder | 1B | huggingface |
MPT-7B-StoryWriter-65k+ | MPT | 7B | huggingface |
Bloomz | Bloom | 1.7B | huggingface |