v0.5.2
What's Changed
A few fixes and new additions:
- Support for CohereAI's command-r model: Currently, GGUF is unsupported. You can load the base model with --load-in-4bit, or with --load-in-smooth if you have an RTX 20xx series (or sm_75).
- Fix an issue where some GPU blocks were missing. This should give a significant boost to how much context you can use.
- Fix logprobs when they are -inf with some models.
Full Changelog: v0.5.1...v0.5.2