RNA-GPT/RNA-GPT.github.io

RNA-GPT: Multimodal Generative System for RNA Sequence Understanding

MLSB Workshop, NeurIPS 2024

Homepage: https://rna-gpt.github.io/

RNAs are vital molecules that carry genetic information essential for life and have significant implications for drug development and biotechnology. However, RNA research is often slowed by the sheer volume of literature researchers must navigate. To address this challenge, we introduce RNA-GPT, a multimodal RNA chat model that simplifies RNA discovery by drawing on extensive RNA literature. RNA-GPT combines RNA sequence encoders with linear projection layers and state-of-the-art large language models (LLMs) to align the RNA and text representations. This enables it to process user-uploaded RNA sequences and provide concise, accurate responses. Our scalable training pipeline, which we use to build RNA-QA, automatically gathers RNA annotations from RNAcentral, using a divide-and-conquer approach with GPT-4 and latent Dirichlet allocation (LDA) to handle large datasets and generate instruction-tuning samples. Experiments show that RNA-GPT effectively handles complex RNA queries, streamlining RNA research. We also introduce RNA-QA, a dataset of 407,616 RNA sequences for modality alignment and instruction tuning.
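To illustrate the alignment idea described above, here is a minimal sketch of a linear projection that maps RNA encoder embeddings into an LLM's embedding space. It is not the project's implementation. The dimensions, the random weights, the function names (`make_projection`, `project`, `build_llm_input`), and the choice to prepend the projected RNA embeddings to the prompt embeddings are all assumptions made for illustration.

```python
import random

def make_projection(d_in, d_out, seed=0):
    """Return (weights, bias) for a linear layer mapping d_in -> d_out.

    Weights are random here; in a real system they would be learned
    during modality alignment.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    bias = [0.0] * d_out
    return weights, bias

def project(embeddings, weights, bias):
    """Apply y = W x + b to every embedding vector in the sequence."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]
        for vec in embeddings
    ]

def build_llm_input(rna_embeddings, prompt_embeddings, weights, bias):
    """Prepend projected RNA embeddings to the prompt token embeddings.

    Prepending is an assumed layout, chosen only to make the sketch concrete.
    """
    return project(rna_embeddings, weights, bias) + prompt_embeddings

# Hypothetical sizes, far smaller than any real encoder or LLM.
D_ENCODER, D_LLM = 4, 6
weights, bias = make_projection(D_ENCODER, D_LLM)

# Stand-ins for encoder output (5 positions) and prompt embeddings (3 tokens).
rna_embeddings = [[0.1 * (i + j) for j in range(D_ENCODER)] for i in range(5)]
prompt_embeddings = [[0.0] * D_LLM for _ in range(3)]

llm_input = build_llm_input(rna_embeddings, prompt_embeddings, weights, bias)
print(len(llm_input), len(llm_input[0]))  # prints "8 6"
```

The point of the sketch is only that the projection changes the width of each RNA embedding to match the LLM's, so both modalities can sit in one input sequence.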
