-
-
Notifications
You must be signed in to change notification settings - Fork 378
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* reconstruct code * show aff;fix latex replace bug; * remove pdb * checkout to specific ref * update readme * fix invalid tar * Retreive arxiv paper from Atom feed (#31) * retrieve from rss * fix bug * fix bug * fix bug * clean code * Release v0.3.0 (#32) * bump version to 0.3.0 * update readme * update uv.lock
- Loading branch information
Showing
10 changed files
with
444 additions
and
334 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,7 @@ dist/ | |
wheels/ | ||
.vscode/ | ||
*.egg-info | ||
.env | ||
|
||
# Virtual environments | ||
.venv | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -33,6 +33,7 @@ | |
## ✨ Features | ||
- Totally free! All the calculation can be done in the Github Action runner locally within its quota (for public repo). | ||
- AI-generated TL;DR for you to quickly pick up target papers. | ||
- Affiliations of the paper are resolved and presented. | ||
- Links of PDF and code implementation (if any) presented in the e-mail. | ||
- List of papers sorted by relevance with your recent research interest. | ||
- Fast deployment via fork this repo and set environment variables in the Github Action Page. | ||
|
@@ -56,7 +57,7 @@ Below are all the secrets you need to set. They are invisible to anyone includin | |
| :--- | :---: | :--- | :--- | :--- | | ||
| ZOTERO_ID | ✅ | str | User ID of your Zotero account. Get your ID from [here](https://www.zotero.org/settings/security). | 12345678 | | ||
| ZOTERO_KEY | ✅ | str | An Zotero API key with read access. Get a key from [here](https://www.zotero.org/settings/security). | AB5tZ877P2j7Sm2Mragq041H | | ||
| ARXIV_QUERY | ✅ | str | The search query for retrieving arxiv papers. Refer to the [official document](https://info.arxiv.org/help/api/user-manual.html#query_details) for details. The example queries papers about AI, CV, NLP, ML. Find the abbr of your research area from [here](https://arxiv.org/category_taxonomy). | cat:cs.AI OR cat:cs.CV OR cat:cs.LG OR cat:cs.CL | | ||
| ARXIV_QUERY | ✅ | str | The categories of target arxiv papers. Use `+` to concatenate multiple categories. The example retrieves papers about AI, CV, NLP, ML. Find the abbr of your research area from [here](https://arxiv.org/category_taxonomy). | cs.AI+cs.CV+cs.LG+cs.CL | | ||
| SMTP_SERVER | ✅ | str | The SMTP server that sends the email. I recommend to utilize a seldom-used email for this. Ask your email provider (Gmail, QQ, Outlook, ...) for its SMTP server| smtp.qq.com | | ||
| SMTP_PORT | ✅ | int | The port of SMTP server. | 465 | | ||
| SENDER | ✅ | str | The email account of the SMTP server that sends you email. | [email protected] | | ||
|
@@ -118,6 +119,9 @@ The TLDR of each paper is generated by a lightweight LLM (Qwen2.5-3b-instruct-q4 | |
- The recommendation algorithm is very simple, it may not accurately reflect your interest. Welcome better ideas for improving the algorithm! | ||
- This workflow deploys an LLM on the cpu of Github Action runner, and it takes about 70s to generate a TLDR for one paper. High `MAX_PAPER_NUM` can lead the execution time exceed the limitation of Github Action runner (6h per execution for public repo, and 2000 mins per month for private repo). Commonly, the quota given to public repo is definitely enough for individual use. If you have special requirements, you can deploy the workflow in your own server, or use a self-hosted Github Action runner, or pay for the exceeded execution time. | ||
|
||
## 👯♂️ Contribution | ||
Any issue and PR are welcomed! But remember that **each PR should merge to the `dev` branch**. | ||
|
||
## 📃 License | ||
Distributed under the AGPLv3 License. See `LICENSE` for detail. | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
from llama_cpp import Llama | ||
from openai import OpenAI | ||
from loguru import logger | ||
|
||
GLOBAL_LLM = None | ||
|
||
class LLM: | ||
def __init__(self, api_key: str = None, base_url: str = None, model: str = None): | ||
if api_key: | ||
self.llm = OpenAI(api_key=api_key, base_url=base_url) | ||
else: | ||
self.llm = Llama.from_pretrained( | ||
repo_id="Qwen/Qwen2.5-3B-Instruct-GGUF", | ||
filename="qwen2.5-3b-instruct-q4_k_m.gguf", | ||
n_ctx=32_000, | ||
n_threads=4, | ||
verbose=False, | ||
) | ||
self.model = model | ||
|
||
def generate(self, messages: list[dict]) -> str: | ||
if isinstance(self.llm, OpenAI): | ||
response = self.llm.chat.completions.create(messages=messages,temperature=0,model=self.model) | ||
return response.choices[0].message.content | ||
else: | ||
response = self.llm.create_chat_completion(messages=messages,temperature=0) | ||
return response["choices"][0]["message"]["content"] | ||
|
||
def set_global_llm(api_key: str = None, base_url: str = None, model: str = None): | ||
global GLOBAL_LLM | ||
GLOBAL_LLM = LLM(api_key=api_key, base_url=base_url, model=model) | ||
|
||
def get_llm() -> LLM: | ||
if GLOBAL_LLM is None: | ||
logger.info("No global LLM found, creating a default one. Use `set_global_llm` to set a custom one.") | ||
set_global_llm() | ||
return GLOBAL_LLM |
Oops, something went wrong.