A toolkit for training deep learning models on genotype, tabular, sequence, image, array and binary data.

RasmussenLab/EIR

 
 

Supervised modelling, sequence generation, and array output on genotype, tabular, sequence, image, array, and binary input data.

WARNING: This project is in alpha phase. Expect backwards incompatible changes and API changes.

Table of Contents

  1. Install
  2. Usage
  3. Use Cases
  4. Features
  5. Supported Inputs and Outputs
  6. Related Projects
  7. Citation
  8. Acknowledgements

Install

Installing EIR via pip

pip install eir-dl

Important: The latest version of EIR supports Python 3.11. Installing with an older version of Python will pull in an outdated version of EIR, which will likely be incompatible with the current documentation and might contain bugs. Please ensure that you are installing EIR in a Python 3.11 environment.
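
As a quick sanity check before running the pip command, the interpreter version can be compared against the one stated above. This is only a minimal sketch and not part of EIR: the helper name `matches_documented_python` is hypothetical, and it takes the "Python 3.11" statement literally (exactly 3.11), which may be stricter than what the package metadata actually enforces.

```python
import sys

# Assumption: the Python version named in this README, taken literally.
DOCUMENTED_PYTHON = (3, 11)


def matches_documented_python(version_info=sys.version_info) -> bool:
    """Return True if the major/minor version equals the documented one.

    Hypothetical helper for illustration; EIR does not ship this function.
    """
    return tuple(version_info[:2]) == DOCUMENTED_PYTHON


if __name__ == "__main__":
    if matches_documented_python():
        print("Python 3.11 detected; 'pip install eir-dl' should pick up the latest EIR.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Found Python {found}; pip may install an outdated EIR release.")
```

Running the script with the interpreter you intend to install into (for example the one inside your virtual environment) shows whether that environment matches the documented version.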

Installing EIR via Container Engine

Here's an example with Docker:

docker build -t eir:latest https://raw.githubusercontent.com/arnor-sigurdsson/EIR/master/Dockerfile
docker run -d --name eir_container eir:latest
docker exec -it eir_container bash

Usage

Please refer to the Documentation for examples and information.

Use Cases

EIR allows for training and evaluating various deep-learning models directly from the command line. This can be useful for:

  • Quick prototyping and iteration when doing supervised modelling or sequence generation on new datasets.
  • Establishing baselines to compare against other methods.
  • Fitting on data sources such as large-scale genomics, where DL implementations are not commonly available.

If you are an ML/DL researcher developing new models, EIR might not fit your use case. However, it might provide a quick baseline for comparison to the cool stuff you are developing, and some degree of customization is possible.

Features

Supported Inputs and Outputs

Modality   Input   Output
Genotype   x       †
Tabular    x       x
Sequence   x       x
Image      x       †
Array      x       x
Binary     x

† While not directly supported as outputs, genotype and image modalities can be treated as arrays. For an example, see the MNIST Digit Generation tutorial.

Related Projects

  • EIR-auto-GP: Automated genomic prediction (GP) using deep learning models with EIR.

Citation

If you use EIR in a scientific publication, we would appreciate it if you could use one of the following citations:

@article{10.1093/nar/gkad373,
    author    = {Sigurdsson, Arn{\'o}r I and Louloudis, Ioannis and Banasik, Karina and Westergaard, David and Winther, Ole and Lund, Ole and Ostrowski, Sisse Rye and Erikstrup, Christian and Pedersen, Ole Birger Vesterager and Nyegaard, Mette and DBDS Genomic Consortium and Brunak, S{\o}ren and Vilhj{\'a}lmsson, Bjarni J and Rasmussen, Simon},
    title     = {{Deep integrative models for large-scale human genomics}},
    journal   = {Nucleic Acids Research},
    month     = {05},
    year      = {2023}
}

@article{sigurdsson2022improved,
    author    = {Sigurdsson, Arnor Ingi and Ravn, Kirstine and Winther, Ole and Lund, Ole and Brunak, S{\o}ren and Vilhjalmsson, Bjarni J and Rasmussen, Simon},
    title     = {Improved prediction of blood biomarkers using deep learning},
    journal   = {medRxiv},
    pages     = {2022--10},
    year      = {2022},
    publisher = {Cold Spring Harbor Laboratory Press}
}

Acknowledgements

Massive thanks to everyone publishing and developing the packages this project directly and indirectly depends on.
