bulk2sc

bulk2sc is the first framework for generating single-cell data from bulk RNA-seq datasets by learning cell type distributions from single-cell reference data. bulk2sc consists of three components: scGMVAE, Bulk Encoder, and genVAE, which are shown in the following figure:

Below, we show four UMAPs of the cell type clusters at different stages of bulk2sc: raw input data, reparameterized latent representation from GMM parameters $\mu_k$ and $\sigma_k^2$, reconstructed input data, and generated data.
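To make the "reparameterized latent representation" stage concrete, here is a minimal sketch of drawing latent vectors from per-cell-type Gaussian parameters $\mu_k$ and $\sigma_k^2$. It is not taken from the bulk2sc code: the function names, the toy parameter values, and the `proportions` argument (standing in for cell type proportions) are all assumptions made for illustration, and it uses only the Python standard library.

```python
import math
import random


def reparameterize(mu, var, rng):
    """Draw z = mu + sqrt(var) * eps with eps ~ N(0, 1), one draw per dimension."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]


def sample_latents(mus, variances, proportions, n_cells, seed=0):
    """Pick a mixture component k per cell, then sample z from N(mu_k, sigma_k^2).

    `proportions` is a hypothetical stand-in for cell type proportions.
    """
    rng = random.Random(seed)
    components = list(range(len(mus)))
    samples = []
    for _ in range(n_cells):
        k = rng.choices(components, weights=proportions, k=1)[0]
        samples.append((k, reparameterize(mus[k], variances[k], rng)))
    return samples


# Toy values (hypothetical): 2 cell types, 2 latent dimensions.
mus = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [0.25, 0.25]]
proportions = [0.7, 0.3]
latents = sample_latents(mus, variances, proportions, n_cells=1000)
```

Each entry of `latents` pairs a component index with a latent vector. In a full pipeline, a decoder would presumably map such vectors back to expression space; that step is outside this sketch.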

quick start

For a quick start, download the PBMC 3K data from the 10X Genomics website and the pre-trained Bulk Encoder and scDecoder weights from Google Drive here. To run the pre-trained model, place the unzipped files inside the bulk2sc directory and run:

cd bulk2sc
python main.py

custom data

To train with custom data, follow these steps:

0. If cell types are needed, run scType.R to obtain them. You will need to modify the script for your specific data and filenames.
1. Modify the parameters in utils.py.
2. Modify main.py to adjust the file path.
3. Run python main.py.
