A Nextflow pipeline that creates (or takes) orthologous groups from gene-called genomes and runs sequence- and structure-based QC to remove outliers.
Unless you have access to the VUB-HPC, you have to define your own dependency profile in nextflow.config. Using Docker or Apptainer makes this considerably easier.
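A dependency profile could look roughly like the sketch below, written as a `profiles` block in nextflow.config. The profile names and the image `your-registry/your-image:tag` are hypothetical placeholders, not values taken from this repository. `docker.enabled`, `apptainer.enabled` and `process.container` are standard Nextflow config options, and the `apptainer` scope needs a reasonably recent Nextflow release.

```groovy
// Sketch only: profile names and the container image are hypothetical placeholders.
profiles {
    docker {
        docker.enabled    = true
        process.container = 'your-registry/your-image:tag'            // placeholder image
    }
    apptainer {
        apptainer.enabled = true
        process.container = 'docker://your-registry/your-image:tag'   // placeholder image
    }
}
```

A profile defined this way is selected at launch with Nextflow's `-profile` option, for example `-profile docker`.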
- Singularity/Apptainer or Docker
- Copy of this repository
- CD-HIT (e.g. from GitHub)
- Biopython
- BLAST (v2.14)
- Pandas
- ESMFold (GitHub)
- Copy of the SIMSApiper repository
- IQ-TREE (e.g. from GitHub)
--data "$launchDir/data"
--outFolder "$launchDir/"
--structures false [path/to/structure/dir]
--predictRemaining false
--orthoGroupSeqs [path/to/ogseqs/dir]
--minCoverage 0.8
--outGroup false [str_with_og_name]
--foldseekClusters false
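
The defaults above can also be collected in a `params` block, as in the sketch below. It assumes each `--name` option maps directly to a `params.name` entry, which is standard Nextflow behaviour. The bracketed placeholders from the list are kept as comments, and the file name `custom.config` mentioned afterwards is hypothetical.

```groovy
// Sketch only: mirrors the defaults listed above; placeholders are not real paths.
params {
    data             = "$launchDir/data"
    outFolder        = "$launchDir/"
    structures       = false   // or path/to/structure/dir
    predictRemaining = false
    // orthoGroupSeqs = 'path/to/ogseqs/dir'   // placeholder, no default is listed
    minCoverage      = 0.8
    outGroup         = false   // or a string (str_with_og_name)
    foldseekClusters = false
}
```

A file like this can be passed at launch with Nextflow's `-c` option, for example `-c custom.config`; values given on the command line still take precedence.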