README

This repository contains the scripts (as Jupyter notebooks) used to generate the figures in the manuscript "BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets".

USAGE

1. Clone this repository

git clone https://github.com/matinnuhamunada/saccharopolyspora_manuscript.git

2. Set up BGCFlow

# create and activate new conda environment
conda create -n bgcflow pip -y
conda activate bgcflow

# install BGCFlow wrapper
pip install git+https://github.com/NBChub/bgcflow_wrapper.git

# clone BGCFlow to "bgcflow" folder
bgcflow clone bgcflow
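
The commands in this README call git, conda, mamba, wget, and unzip. As a minimal sketch, a pre-flight check could confirm that they are available before continuing. The helper name missing_tools is hypothetical and is not part of BGCFlow or this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every tool from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools invoked by the commands in this README.
missing=$(missing_tools git conda mamba wget unzip)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All required tools found."
fi
```

Any tool reported as missing needs to be installed before the steps that use it.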

3. Download the dataset

Download the dataset containing the BGCFlow runs from Zenodo:

# move to bgcflow dir
cd bgcflow

# download and extract dataset
wget https://zenodo.org/record/8018055/files/saccharopolyspora_dataset.zip
unzip saccharopolyspora_dataset.zip

4. Set configurations

# go back to the manuscript dir
cd ../saccharopolyspora_manuscript/

# set the location of the bgcflow dir to the correct path
nano config.yaml

5. Set up conda environments

Install these conda environments (replace <bgcflow_dir> with the path to the BGCFlow directory):

mamba env create -f python_notebook.yaml
mamba env create -f r_notebook.yaml
mamba env create -f <bgcflow_dir>/workflow/envs/cblaster.yaml

6. Run the notebooks

There are two kinds of notebooks: R (.R.ipynb) and Python (.python.ipynb).
Run each notebook with its corresponding conda environment: r_notebook or python_notebook.
Start a Jupyter session:

# for python
conda activate python_notebook
jupyter lab

# for R
conda activate r_notebook
jupyter lab

Run the notebooks in order, following the numeric prefix of each filename (00_, 01_, ...).
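
To see which environment each notebook needs, the filename suffix can be mapped to an environment name. The following is only a sketch: env_for_notebook is a hypothetical helper, not part of this repository, and the loop assumes it is run from the repository root where the notebooks carry a two-digit numeric prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a notebook filename to the conda
# environment named in this README, based on its suffix.
env_for_notebook() {
    case "$1" in
        *.R.ipynb)      echo "r_notebook" ;;
        *.python.ipynb) echo "python_notebook" ;;
        *)              echo "unknown" ;;
    esac
}

# List the notebooks in glob (sorted) order, one per line,
# as "<environment><TAB><notebook>".
for nb in [0-9][0-9]_*.ipynb; do
    [ -e "$nb" ] || continue   # skip when the glob matches nothing
    printf '%s\t%s\n' "$(env_for_notebook "$nb")" "$nb"
done
```

Notebooks reported as unknown do not carry one of the two suffixes, so the environment has to be chosen by inspecting the notebook itself.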

Citation

Matin Nuhamunada, Omkar S. Mohite, Patrick V. Phaneuf, Bernhard O. Palsson, and Tilmann Weber (2023). BGCFlow: Systematic pangenome workflow for the analysis of biosynthetic gene clusters across large genomic datasets. bioRxiv 2023.06.14.545018; doi: https://doi.org/10.1101/2023.06.14.545018

Nuhamunada, Matin, & Mohite, Omkar Satyavan. (2023). BGCFlow Analysis of Saccharopolyspora Genomes (0.1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8018055

