v2.2
What's Changed
- Add NanoFlow code link by @DefTruth in #51
- 🔥[Activation Sparsity] Training-Free Activation Sparsity in Large Language Models by @DefTruth in #52
- 🔥[Decentralized LLM] Decentralized LLM Inference over Edge Networks with Energy Harvesting by @DefTruth in #53
- 🔥[SJF Scheduling] Efficient LLM Scheduling by Learning to Rank by @DefTruth in #54
- 🔥[Speculative Decoding] Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation by @DefTruth in #55
- 🔥🔥[Prompt Compression] Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference by @DefTruth in #56
- 🔥🔥[Context Distillation] Efficient LLM Context Distillation by @DefTruth in #57
- Bump up to v2.2 by @DefTruth in #58
Full Changelog: v2.1...v2.2