
Commit

🔥[RetrievalAttention] Accelerating Long-Context LLM Inference via Vector Retrieval (#62)
DefTruth authored Sep 17, 2024
1 parent f0860e8 commit efb983b
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -89,6 +89,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
|2024.01|[Understanding LLMs] Understanding LLMs: A Comprehensive Overview from Training to Inference(@Shaanxi Normal University etc)| [[pdf]](https://arxiv.org/pdf/2401.02038.pdf) | ⚠️|⭐️⭐️ |
|2024.02|[LLM-Viewer] LLM Inference Unveiled: Survey and Roofline Model Insights(@Zhihang Yuan etc)|[[pdf]](https://arxiv.org/pdf/2402.16363.pdf)|[[LLM-Viewer]](https://github.com/hahnyuan/LLM-Viewer) ![](https://img.shields.io/github/stars/hahnyuan/LLM-Viewer.svg?style=social) |⭐️⭐️ |
|2024.07|[**Internal Consistency & Self-Feedback**] Internal Consistency and Self-Feedback in Large Language Models: A Survey|[[pdf]](https://arxiv.org/pdf/2407.14507)| [[ICSF-Survey]](https://github.com/IAAR-Shanghai/ICSFSurvey) ![](https://img.shields.io/github/stars/IAAR-Shanghai/ICSFSurvey.svg?style=social) | ⭐️⭐️ |
|2024.09|🔥[**RetrievalAttention**] RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval(@microsoft.com)|[[pdf]](https://arxiv.org/pdf/2409.10516)|⚠️|⭐️⭐️ |

### 📖LLM Train/Inference Framework/Design ([©️back👆🏻](#paperlist))
<div id="LLM-Train-Inference-Framework"></div>