plmbind

Description

This project explores whether adding protein side information improves the prediction of transcription factor binding sites (TFBSs). The repository contains a baseline model that uses no protein side information, and two-branch models that process the DNA information and the transcription factor (TF) information in dedicated branches. The TF side information can be provided either as k-mers or as embeddings created by protein language models (PLMs). ESM-2 was used in this case, but the model can easily be extended to accept other embeddings. The code for generating the dataset and data splits is also included.
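As a rough illustration of the k-mer option, the sketch below turns a protein sequence into a fixed-length k-mer frequency vector. It is not taken from this repository: the function names `kmer_counts` and `kmer_vector`, the default `k=2`, and the choice to normalize counts to frequencies are all assumptions made for the example, and the actual featurization used here may differ.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; k-mers with other symbols (e.g. "X") are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_counts(sequence: str, k: int = 2) -> Counter:
    """Count overlapping k-mers in a protein sequence (hypothetical helper)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_vector(sequence: str, k: int = 2) -> list[float]:
    """Return a fixed-length k-mer frequency vector of size 20**k.

    Assumption: counts are normalized so the vector sums to 1 whenever the
    sequence contains at least one k-mer made of standard residues.
    """
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = kmer_counts(sequence, k)
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


if __name__ == "__main__":
    vec = kmer_vector("MKKLLPT", k=2)
    print(len(vec))  # one entry per possible 2-mer
```

A vector like this has the same length for every TF regardless of sequence length, which is what lets it stand in for a PLM embedding as input to the TF branch.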

Usage

The main models and the dataloader are found in the "plmbind" folder, along with a number of training scripts. The code for generating the dataset and data splits is found under "utils".

NatanTourne/plmbind_OLD
