This repository has been archived by the owner on Sep 19, 2024. It is now read-only.

Commit

Merge remote-tracking branch 'origin/dev'
jkobject committed Jul 24, 2024
2 parents 3fdc1d4 + b3a9c04 commit a713d1a
Showing 235 changed files with 487,454 additions and 5,507 deletions.
10 changes: 0 additions & 10 deletions .github/backup/requirements-test.txt

This file was deleted.

5 changes: 0 additions & 5 deletions .github/backup/requirements.txt

This file was deleted.

46 changes: 0 additions & 46 deletions .github/backup/setup.py

This file was deleted.

68 changes: 0 additions & 68 deletions .github/init.sh

This file was deleted.

42 changes: 0 additions & 42 deletions .github/workflows/rename_project.yml

This file was deleted.

39 changes: 39 additions & 0 deletions .gitignore
@@ -131,3 +131,42 @@ dmypy.json
# templates
.github/templates/*
.DS_Store
data/temp/*
.sync-conflict*
notebooks/tests/mytest2
notebooks/tests/mytest
notebooks/tests/wandb
data/GroundTruth/*
data/adamson
data/fasta/*
data/scGPT_human
wandb/*
*/lightning_logs/*
*.lock
data/tensorboard/*
notebooks/tests/*/
data/logs/*
.lr_find_*
step_*_.*
data/main/embeddings.parquet
slurm-*.out
notebooks/tests/step_0_.h5ad
data/main/gene_embeddings.parquet
*.h5ad
data/main/main_scenic+.parquet
scdataloader.out
data/main/9606.protein.links.v12.0.txt.gz
data/main/motifs-v10-nr.hgnc-m0.00001-o0.0.tbl
data/main/main_scenic+_database.feather
*.npy
*.joblib
notebooks/tests/collator_output.txt
notebooks/tests/curr_genes_mouse.csv
notebooks/tests/curr_genes.csv
notebooks/tests/precision_recall_plot.png
notebooks/tests/lightning_logs/*
notebooks/additional/lightning_logs/*
lightning_logs/
collator_output.txt
data/bias_sparse.npz
todel/
6 changes: 6 additions & 0 deletions .gitmodules
@@ -0,0 +1,6 @@
[submodule "scDataLoader"]
path = scDataLoader
url = https://github.com/jkobject/scDataLoader
[submodule "RNABERT"]
path = RNABERT
url = https://github.com/jkobject/RNABERT
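
The `.gitmodules` file added here uses Git's INI-like config syntax, so the submodules it declares can be listed without calling `git`. The snippet below is a minimal sketch, assuming the file content is exactly the six lines shown in this hunk; `list_submodules` is a hypothetical helper name, not something provided by this repository.

```python
import configparser

# Content of the .gitmodules hunk above, copied verbatim.
GITMODULES_TEXT = """\
[submodule "scDataLoader"]
path = scDataLoader
url = https://github.com/jkobject/scDataLoader
[submodule "RNABERT"]
path = RNABERT
url = https://github.com/jkobject/RNABERT
"""


def list_submodules(text):
    """Return {name: {"path": ..., "url": ...}} for each submodule section."""
    # Interpolation is disabled so a literal "%" in a URL would not raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    submodules = {}
    for section in parser.sections():
        # Section headers look like: submodule "name"
        kind, _, quoted_name = section.partition(" ")
        if kind != "submodule":
            continue
        name = quoted_name.strip('"')
        submodules[name] = {
            "path": parser[section]["path"],
            "url": parser[section]["url"],
        }
    return submodules


for name, info in list_submodules(GITMODULES_TEXT).items():
    print(f"{name}: {info['path']} <- {info['url']}")
```

Note that the submodule path is `scDataLoader` with a capital `L`, which matters on case-sensitive filesystems when installing from that directory.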
95 changes: 73 additions & 22 deletions README.md
@@ -1,47 +1,98 @@

# scprint
# scprint: Large Cell Model for scRNAseq data

[![codecov](https://codecov.io/gh/jkobject/scPRINT/branch/main/graph/badge.svg?token=scPRINT_token_here)](https://codecov.io/gh/jkobject/scPRINT)
[![CI](https://github.com/jkobject/scPRINT/actions/workflows/main.yml/badge.svg)](https://github.com/jkobject/scPRINT/actions/workflows/main.yml)
[![PyPI version](https://badge.fury.io/py/scprint.svg)](https://badge.fury.io/py/scprint)
[![Documentation Status](https://readthedocs.org/projects/scprint/badge/?version=latest)](https://scprint.readthedocs.io/en/latest/?badge=latest)
[![Downloads](https://pepy.tech/badge/scprint)](https://pepy.tech/project/scprint)
[![Downloads](https://pepy.tech/badge/scprint/month)](https://pepy.tech/project/scprint)
[![Downloads](https://pepy.tech/badge/scprint/week)](https://pepy.tech/project/scprint)
[![GitHub issues](https://img.shields.io/github/issues/jkobject/scPRINT)](https://img.shields.io/github/issues/jkobject/scPRINT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![DOI](https://zenodo.org/badge/391909874.svg)](https://zenodo.org/badge/latestdoi/391909874)

Awesome scprint created by Jeremie Kalfon
![logo](logo.png)

using:
[pytorch template](https://github.com/victoresque/pytorch-template)
[python template](https://github.com/rochacbruno/python-project-template)
scPRINT is a novel transformer model for inferring gene regulatory networks from scRNAseq data. It uses novel encoding and decoding schemes as well as new pre-training methodologies to learn a model of the cell. scPRINT can do much more than that, though: [Read the paper!]()

![figure1](figure1.png)

## Install it from PyPI

If you want to use flashattention2, note that it only supports torch==2.0.0 for now.

🚨 **Important Notice:** Only the **development install** currently works (see [dev mode](#in-dev-mode))! 🚨

```bash
pip install 'lamindb[jupyter,bionty]'
```

Then install scPRINT:

```bash
pip install scprint
```
> If you have a GPU that you want to use, you will benefit from flashattention, which requires some more specific installs:
1. Find the build of torch 2.0.0 / torchvision 0.15.1 / torchaudio 2.0.1 that matches your NVIDIA drivers on the PyTorch website.
2. Run the corresponding install command.
3. Run `pip install pytorch-fast-transformers torchtext==0.15.1`.
4. Run `pip install triton==2.0.0.dev20221202 --no-deps`.
4. do `pip install triton==2.0.0.dev20221202 --no-deps`

You should then be good to go. You need these specific versions for everything to work.
Not my fault, scream at NVIDIA, PyTorch, Tri Dao and OpenAI :wink:
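
Since the setup depends on exact versions, it can help to compare what is installed against the pins before training. The following is a minimal sketch using only the standard library; the pins are the ones named in the steps above, while `PINS` and `find_mismatches` are hypothetical names introduced for illustration and are not part of scPRINT.

```python
from importlib import metadata

# Versions named in the install steps above.
PINS = {
    "torch": "2.0.0",
    "torchtext": "0.15.1",
    "triton": "2.0.0.dev20221202",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # A local build tag such as "+cu117" does not change the release number,
        # so it is ignored for the comparison.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches
```

Calling `find_mismatches(PINS)` in the target environment returns an empty dict when every pin is satisfied; anything else lists the expected and installed versions side by side.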


### in dev mode

```bash
conda create ...
# clone scPRINT and its sibling dependencies side by side
git clone https://github.com/jkobject/scPRINT
git clone https://github.com/jkobject/GRnnData
git clone https://github.com/jkobject/benGRN
cd scPRINT
git checkout dev
# fetch the scDataLoader and RNABERT submodules
git submodule init
git submodule update
# editable installs of the submodule and the sibling repositories
pip install -e scDataLoader
pip install -e ../GRnnData/
pip install -e ../benGRN/
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1
# install pytorch as mentioned above if you have a GPU
pip install -e '.[dev]'
pip install 'lamindb[jupyter,bionty]'
pip install triton==2.0.0.dev20221202 --no-deps
```

## Usage

```py
from scprint import BaseClass
from scprint import base_function
from lightning.pytorch import Trainer
from scprint import scPrint
from scdataloader import DataModule

BaseClass().base_method()
base_function()
datamodule = DataModule(...)
model = scPrint(...)
trainer = Trainer(...)
trainer.fit(model, datamodule=datamodule)
...
```

or

```bash
$ python -m scprint
#or
$ scprint
$ scprint fit/train/predict/test --config config/[medium|large|vlarge] ...
```

For more information on usage, please see the documentation at [https://www.jkobject.com/scPrint/](https://www.jkobject.com/scPrint/)

## Development

Read the [CONTRIBUTING.md](CONTRIBUTING.md) file.

### What is included?
acknowledgement:
[python template](https://github.com/rochacbruno/python-project-template)
[laminDB](https://lamin.ai/)
[Lightning](https://lightning.ai/)

Awesome Large Cell Model created by Jeremie Kalfon.

- 📃 Documentation structure using [mkdocs](http://www.mkdocs.org)
- 🧪 Testing structure using [pytest](https://docs.pytest.org/en/latest/)
If you want [codecov](https://about.codecov.io/sign-up/) Reports and Automatic Release to [PyPI](https://pypi.org)
On the new repository `settings->secrets` add your `PYPI_API_TOKEN` and `CODECOV_TOKEN` (get the tokens on respective websites)
- ✅ Code linting using [flake8](https://flake8.pycqa.org/en/latest/)
- 📊 Code coverage reports using [codecov](https://about.codecov.io/sign-up/)
- 🛳️ Automatic release to [PyPI](https://pypi.org) using [twine](https://twine.readthedocs.io/en/latest/) and github actions.
1 change: 1 addition & 0 deletions RNABERT
Submodule RNABERT added at 32904a