v2.2

@DefTruth released this 04 Sep 06:22
· 39 commits to main since this release
6d7e9f8

What's Changed

  • Add NanoFlow code link by @DefTruth in #51
  • 🔥[ACTIVATION SPARSITY] TRAINING-FREE ACTIVATION SPARSITY IN LARGE LANGUAGE MODELS by @DefTruth in #52
  • 🔥[Decentralized LLM] Decentralized LLM Inference over Edge Networks with Energy Harvesting by @DefTruth in #53
  • 🔥[SJF Scheduling] Efficient LLM Scheduling by Learning to Rank by @DefTruth in #54
  • 🔥[Speculative Decoding] Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation by @DefTruth in #55
  • 🔥🔥[Prompt Compression] Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference by @DefTruth in #56
  • 🔥🔥[Context Distillation] Efficient LLM Context Distillation by @DefTruth in #57
  • Bump up to v2.2 by @DefTruth in #58

Full Changelog: v2.1...v2.2