v0.5.2

github-actions released this 16 Mar 22:50

ed225f5

What's Changed

A few fixes and new additions:

Support for CohereAI's command-r model. GGUF is currently unsupported for this model. You can load the base model with --load-in-4bit or --load-in-smooth if you have an RTX 20xx series GPU (or another sm_75 card).
Fix an issue where some GPU blocks were missing. This should significantly increase how much context you can use.
Fix logprobs when values are -inf with some models.
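
To give a rough sense of why recovering missing GPU blocks increases usable context: assuming a paged KV cache in which each GPU block holds a fixed number of tokens, the cache capacity is the block count multiplied by the block size. The sketch below uses made-up numbers. The block size, both block counts, and the helper name max_cached_tokens are illustrative assumptions, not values or functions from this release.

```python
# Illustrative arithmetic only. All numbers below are hypothetical;
# this release does not state real block counts or block sizes.

BLOCK_SIZE = 16  # tokens stored per KV-cache block (assumed value)


def max_cached_tokens(num_gpu_blocks: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how many tokens a paged KV cache can hold in total."""
    return num_gpu_blocks * block_size


# Hypothetical block counts before and after the missing blocks are recovered.
before = max_cached_tokens(2000)
after = max_cached_tokens(2500)

print(f"before: {before} tokens, after: {after} tokens, gained: {after - before}")
```

Under these assumptions, every recovered block adds one block's worth of tokens to the total that can be kept in the cache, so a fixed number of extra blocks translates directly into extra context.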

Full Changelog: v0.5.1...v0.5.2
