TheiaCoV and Formatting (#111)
Fixes #71 #85 #61


* Create style guide

* Fix formatting

* Fixed formatting

* fix formatting

* removing due to lack of use and difficulty maintaining

* code now included in taxon_id task file

* removing because code is now included in pub repo submission

* removing because code is now included in pub repo prep

* fixed formatting

* fixed formatting

* fixed formatting

* Fix formatting

* fixed formatting

* fixed formatting

* fixed formatting

* fixed formatting

* fixed formatting

* fixed formatting

* fix python indent

* fix formatting

* Fix formatting

* fix formatting

* fixing formatting, adding comments

* fixing formatting, adding comments

* fixing formatting, adding comments

* fixed a bracket

* fixing formatting, adding comments

* fixing formatting, adding comments

* fixing formatting

* fixing formatting

* fixed formatting

* fixed formatting

* fix formatting

* fixing formatting

* remove unused workflow

* fix formatting

* fix formatting

* fix formatting

* Fix formatting

* fix formatting

* fix formatting

* fix formatting

* moved ivar lines up to read alignment

* fix formatting

* fix formatting

* fix formatting

* fix formatting

* replace name

* replace name

* replace name

* replace name

* replace name

* replace name

* fix caps

* replace name

* replace name

* replace name

* replace name

* replace names

* replace names

* update workflow

* set ivar default depth min to 100

* set clearlabs default image to artic 1.3.0, medaka 1.4.3

* Update fastq-scan output to num_reads

* update checksums

* fix typo

* set medaka docker image as input variable

* adjust to account for artic 1.3 paths

* Update checksums

* Allow for user-defined reference

* Update freyja output variable names

* allow user-defined metadata and barcodes

* Fix float

* Fix float

* Small tweak

* small tweak

* fix typo

* update freyja ref files from non-root dir

* small tweak

* another tweak

* smaller tweak

* add permissions

* print pwd

* change back to root

* use pushd

* print permissions

* execute with python3

* small tweaks

* fix command

* find lineagePaths.txt

* freyja update from /data

* freyja update for metadata only

* use quay-hosted docker image

* Replaced Titan with TheiaCoV in titan workflows file, replaced name of titan_workflows dir in tables dir

* replace titan with theiacov in tables

* renamed image files

* replace titan with theiacov

* 2.0.0

* 2.0.0

* note about name change on video

* Set update_db to false as default

* Update nextclade and pango images and dataset tag

* fix typo

* Really fix typo

* Update read filter task runtime attributes

* set cpu by variable

* Reduce runtime disk sizes

* Update default docker image

* Update docker image

Co-authored-by: frankambrosio3 <[email protected]>
kevinlibuit and frankambrosio3 authored Feb 16, 2022
1 parent 65f4df0 commit a6df039
Showing 101 changed files with 3,937 additions and 4,796 deletions.
32 changes: 16 additions & 16 deletions .dockstore.yml
@@ -1,23 +1,23 @@
 version: 1.2
 workflows:
-  - name: Titan_ClearLabs
+  - name: TheiaCoV_ClearLabs
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_clearlabs.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_clearlabs.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_ONT
+  - name: TheiaCoV_ONT
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_ont.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_ont.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_Illumina_PE
+  - name: TheiaCoV_Illumina_PE
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_illumina_pe.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_illumina_pe.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_Illumina_SE
+  - name: TheiaCoV_Illumina_SE
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_illumina_se.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_illumina_se.wdl
     testParameterFiles:
       - empty.json
   - name: Mercury_PE_Prep
@@ -35,14 +35,14 @@ workflows:
     primaryDescriptorPath: /workflows/wf_mercury_batch.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_Augur_Prep
+  - name: TheiaCoV_Augur_Prep
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_augur_prep.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_augur_prep.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_Augur_Run
+  - name: TheiaCoV_Augur_Run
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_augur_run.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_augur_run.wdl
     testParameterFiles:
       - empty.json
   - name: Pangolin_Update
@@ -65,14 +65,14 @@ workflows:
     primaryDescriptorPath: /workflows/wf_ncbi_scrub_pe.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_FASTA
+  - name: TheiaCoV_FASTA
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_fasta.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_fasta.wdl
     testParameterFiles:
       - empty.json
-  - name: Titan_WWVC
+  - name: TheiaCoV_WWVC
     subclass: WDL
-    primaryDescriptorPath: /workflows/wf_titan_wwvc.wdl
+    primaryDescriptorPath: /workflows/wf_theiacov_wwvc.wdl
     testParameterFiles:
       - empty.json
   - name: Freyja_FASTQ
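Because every renamed entry points at a renamed WDL file, a rename like this can leave a `primaryDescriptorPath` dangling. The following is a minimal sketch, not part of the commit, of how the paths could be cross-checked against the working tree. The helper names `list_descriptor_paths` and `missing_descriptors` are hypothetical, and the demo builds a throwaway directory rather than reading the real repository.

```shell
#!/usr/bin/env bash
set -eu

# list_descriptor_paths FILE: print every primaryDescriptorPath value found in FILE.
list_descriptor_paths() {
  sed -n 's/^[[:space:]]*primaryDescriptorPath:[[:space:]]*//p' "$1"
}

# missing_descriptors FILE ROOT: print descriptor paths that do not exist under ROOT.
missing_descriptors() {
  local path
  list_descriptor_paths "$1" | while IFS= read -r path; do
    [ -f "$2${path}" ] || echo "${path}"
  done
}

# Demo on a throwaway tree (assumed layout): one descriptor exists, one does not.
demo=$(mktemp -d)
mkdir -p "${demo}/workflows"
touch "${demo}/workflows/wf_theiacov_fasta.wdl"
printf 'workflows:\n  - name: TheiaCoV_FASTA\n    primaryDescriptorPath: /workflows/wf_theiacov_fasta.wdl\n  - name: TheiaCoV_WWVC\n    primaryDescriptorPath: /workflows/wf_theiacov_wwvc.wdl\n' > "${demo}/.dockstore.yml"
missing_descriptors "${demo}/.dockstore.yml" "${demo}"
rm -rf "${demo}"
```

Run from a repository root, `missing_descriptors .dockstore.yml .` would list any descriptor that the YAML references but the tree does not contain.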
58 changes: 58 additions & 0 deletions .github/workflows/theiacov-gc.yml
@@ -0,0 +1,58 @@
+name: theiacov-gc
+on: [push, pull_request]
+
+jobs:
+  theiacov-gc:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        wf: ['clearlabs', 'illumina_pe', 'illumina_se', 'ont']
+    defaults:
+      run:
+        shell: bash -l {0}
+    steps:
+      - name: Checkout PHVG
+        uses: actions/checkout@v2
+
+      - name: Free up Disk Space
+        run: bash ${GITHUB_WORKSPACE}/.github/helpers/free-disk-space.sh
+
+      - name: Setup miniconda
+        uses: conda-incubator/setup-miniconda@v2
+        with:
+          activate-environment: theiacov-gc
+          auto-activate-base: false
+
+      - name: Setup TheiaCoV CI Environment
+        run: |
+          conda install -y -c conda-forge -c bioconda cromwell 'python>=3.7' pytest pytest-workflow wget
+          chmod 755 bin/*
+          cp bin/* ${CONDA_PREFIX}/bin
+          THEIACOV_GC_VERSION=$(grep "PHVG_Version=" tasks/task_versioning.wdl | sed -E 's/.*="PHVG v(.*)"/\1/')
+          echo "THEIACOV_GC_VERSION=${THEIACOV_GC_VERSION}" >> $GITHUB_ENV
+          THEIACOV_SHARE="${CONDA_PREFIX}/share/theiacov-gc-${THEIACOV_GC_VERSION}"
+          mkdir -p ${THEIACOV_SHARE}
+          mv conf/ tasks/ workflows/ ${THEIACOV_SHARE}
+
+      - name: Environment Information
+        run: uname -a && env && theiacov-gc -h
+
+      - name: Test TheiaCoV-GC Workflows
+        run: |
+          mkdir -p theiacov/${{ matrix.wf }}
+          theiacov-gc-prepare.py tests/data/fastqs/${{ matrix.wf }} ${{ matrix.wf }} tests/data/primers/artic-v3.primers.bed > theiacov/${{ matrix.wf }}.json
+          TMPDIR=~ pytest --symlink --kwdof --tag theiacov_${{ matrix.wf }}
+          rm -rf theiacov/${{ matrix.wf }}
+
+      - name: Upload logs on failure
+        if: failure()
+        uses: actions/upload-artifact@v2
+        with:
+          name: logs-${{ matrix.wf }}
+          path: |
+            /home/runner/pytest_workflow_*/*/theiacov/
+            /home/runner/pytest_workflow_*/*/log.out
+            /home/runner/pytest_workflow_*/*/log.err
+            !/home/runner/pytest_workflow_*/*/theiacov/*/alignments/*.bam*
+            !/home/runner/pytest_workflow_*/*/theiacov/*/dehosted_reads/*.fastq.gz
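The setup step derives `THEIACOV_GC_VERSION` by grepping `tasks/task_versioning.wdl` and stripping everything except the number with `sed`. The sketch below reproduces that pipeline on a stand-in file so the substitution can be seen in isolation. The file contents and the `2.0.0` value are assumptions made for illustration, since the real `task_versioning.wdl` is not shown in this diff, and `extract_version` is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
set -eu

# Stand-in for tasks/task_versioning.wdl (assumed contents, not the real file).
workdir=$(mktemp -d)
printf '%s\n' \
  'task version_capture {' \
  '  command <<<' \
  '    PHVG_Version="PHVG v2.0.0"' \
  '  >>>' \
  '}' > "${workdir}/task_versioning.wdl"

# Same grep | sed pipeline as the CI step: keep the matching line, then replace
# the whole line with the text captured between "PHVG v" and the closing quote.
extract_version() {
  grep "PHVG_Version=" "$1" | sed -E 's/.*="PHVG v(.*)"/\1/'
}

THEIACOV_GC_VERSION=$(extract_version "${workdir}/task_versioning.wdl")
echo "THEIACOV_GC_VERSION=${THEIACOV_GC_VERSION}"
rm -rf "${workdir}"
```

In the workflow itself the `echo` output is appended to `$GITHUB_ENV`, which is how later steps (and the `theiacov-gc` launcher) see the variable.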
58 changes: 0 additions & 58 deletions .github/workflows/titan-gc.yml

This file was deleted.

71 changes: 69 additions & 2 deletions README.md
@@ -4,5 +4,72 @@ Bioinformatics workflows for genomic characterization, submission preparation, a
 ### Contributors & Influence
 * Based on collaborative work with Andrew Lang, PhD & his [Genomic Analysis WDL workflows](https://github.com/AndrewLangvt/genomic_analyses)
 * Workflows and task development influenced by The Broad's [Viral Pipes](https://github.com/broadinstitute/viral-pipelines)
-* Titan workflows for genomic characterization influenced by UPHL's [Cecret](https://github.com/UPHL-BioNGS/Cecret) & StaPH-B's [Monroe](https://staph-b.github.io/staphb_toolkit/workflow_docs/monroe/)
-* The Titan workflow for waste water variant calling (Titan_WWVC) incorporates a modified version of the [CDPHE's WasteWaterVariantCalling WDL Workflow](https://github.com/CDPHE/WasteWaterVariantCalling).
+* TheiaCoV workflows for genomic characterization influenced by UPHL's [Cecret](https://github.com/UPHL-BioNGS/Cecret) & StaPH-B's [Monroe](https://staph-b.github.io/staphb_toolkit/workflow_docs/monroe/)
+* The TheiaCoV workflow for waste water variant calling (TheiaCoV_WWVC) incorporates a modified version of the [CDPHE's WasteWaterVariantCalling WDL Workflow](https://github.com/CDPHE/WasteWaterVariantCalling).
+
+### Repository Style Guide
+2-space indents (no tabs), braces on same line, single space when defining input/output variables & runtime attributes, single-line breaks between non-indented constructs, and task commands enclosed with triple angle brackets (`<<< ... >>>`).
+
+<em>E.g.</em>:
+```
+workflow w {
+  input {
+    String input
+  }
+  call task_01 {
+    input:
+      input = input
+  }
+  call task_02 {
+    input:
+      input = input
+  }
+  output {
+    File task_01_out = task_01.output
+    File task_02_out = task_02.output
+  }
+}
+
+task task_01 {
+  input {
+    String input
+    String docker = "theiagen/utility:1.1"
+  }
+  command <<<
+    echo '~{input}' > output.txt
+  >>>
+  output {
+    File output = "output.txt"
+  }
+  runtime {
+    docker: docker
+    memory: "8 GB"
+    cpu: 2
+    disks: "local-disk 100 SSD"
+    preemptible: 0
+    maxRetries: 0
+  }
+}
+
+task task_02 {
+  input {
+    String input
+    String docker = "theiagen/utility:1.1"
+  }
+  command <<<
+    echo '~{input}' > output.txt
+  >>>
+  output {
+    File output = "output.txt"
+  }
+  runtime {
+    docker: docker
+    memory: "8 GB"
+    cpu: 2
+    disks: "local-disk 100 SSD"
+    preemptible: 0
+    maxRetries: 0
+  }
+}
+```
+Style guide inspired by [scottfrazer](https://gist.github.com/scottfrazer)'s [WDL Best Practices Style Guide](https://gist.github.com/scottfrazer/aa4ab1945a6a4c331211)
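Two of the style guide's rules (no tabs, 2-space indents) are mechanical enough to spot-check with `grep`. The following is a rough sketch of such a check, assuming nothing about the repository's own tooling; `check_wdl_style` is a hypothetical helper, and it will also flag odd indentation inside `command <<< ... >>>` blocks, which may or may not be desired.

```shell
#!/usr/bin/env bash
set -eu

# check_wdl_style FILE: print offending lines (with line numbers) and return 1
# if FILE contains a tab or a line whose leading indentation is an odd number
# of spaces; return 0 otherwise.
check_wdl_style() {
  local file="$1" status=0
  # Any tab character violates the "no tabs" rule.
  if grep -n "$(printf '\t')" "${file}"; then
    status=1
  fi
  # An even run of spaces followed by exactly one more space is an odd indent.
  if grep -nE '^(  )* [^ ]' "${file}"; then
    status=1
  fi
  return "${status}"
}

# Demo on a small, correctly indented snippet.
sample=$(mktemp)
printf 'task t {\n  input {\n    String s\n  }\n}\n' > "${sample}"
if check_wdl_style "${sample}"; then
  echo "style ok"
fi
rm -f "${sample}"
```

The remaining rules (brace placement, spacing around definitions, blank lines between top-level constructs) need a real parser or a linter to check reliably, so they are left out of this sketch.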
44 changes: 22 additions & 22 deletions bin/titan-gc → bin/theiacov-gc
@@ -1,7 +1,7 @@
 #! /bin/bash
-# usage: titan-gc [-h] [-i STR] [--inputs STR] [-o STR] [--outdir STR] [--options STR] [--verbose]
+# usage: theiacov-gc [-h] [-i STR] [--inputs STR] [-o STR] [--outdir STR] [--options STR] [--verbose]
 #
-# titan-gc - Run Titan GC on a set of samples.
+# theiacov-gc - Run theiacov GC on a set of samples.
 #
 # required arguments:
 # -i STR, --inputs STR The JSON file to be used with Cromwell for inputs.
@@ -11,29 +11,29 @@
 # -h, --help show this help message and exit
 #
 # --options STR JSON file containing Cromwell options
-# --verbose Print out all STDOUT from Cromwell and titan-organize
+# --verbose Print out all STDOUT from Cromwell and theiacov-organize
 set -e
 set -u
 OPTIONS="0"
 QUIET="0"
 PROFILE="docker"
 CONFIG="0"
-TITAN_PATH=$(which titan-gc | sed 's=bin/titan-gc==')
-TITAN_SHARE=${TITAN_PATH}/share/titan-gc-${TITAN_GC_VERSION}
+THEIACOV_PATH=$(which theiacov-gc | sed 's=bin/theiacov-gc==')
+THEIACOV_SHARE=${THEIACOV_PATH}/share/theiacov-gc-${THEIACOV_GC_VERSION}
 CROMWELL_JAR=$(which cromwell | sed 's=bin/cromwell=share/cromwell/cromwell.jar=')
 LOG_LEVEL=ERROR
-SINGULARITY_CACHE="${HOME}/.singularity/titan-cache"
-CROMWELL_OPTS="${TITAN_SHARE}/conf/options.json"
+SINGULARITY_CACHE="${HOME}/.singularity/theiacov-cache"
+CROMWELL_OPTS="${THEIACOV_SHARE}/conf/options.json"

 version() {
-    echo "titan-gc ${TITAN_GC_VERSION}"
+    echo "theiacov-gc ${THEIACOV_GC_VERSION}"
     exit 0
 }

 usage() {
-    echo "usage: titan-gc [-h] [-i STR] [--inputs STR] [-o STR] [--outdir STR] [--options STR] [--quiet]"
+    echo "usage: theiacov-gc [-h] [-i STR] [--inputs STR] [-o STR] [--outdir STR] [--options STR] [--quiet]"
     echo ""
-    echo "titan-gc - Run Titan GC on a set of samples."
+    echo "theiacov-gc - Run TheiaCoV GC on a set of samples."
     echo ""
     echo "required arguments:"
     echo " -i STR, --inputs STR The JSON file to be used with Cromwell for inputs."
@@ -46,7 +46,7 @@ usage() {
     echo " --profile STR The backend profile to use [options: docker, singularity]"
     echo " --config STR Custom backend profile to use"
     echo " --cromwell_jar STR Path to cromwell.jar (Default use conda install)"
-    echo " --quiet Silence all STDOUT from Cromwell and titan-gc-organize"
+    echo " --quiet Silence all STDOUT from Cromwell and theiacov-gc-organize"

     if [ -n "$1" ]; then
         exit "$1"
@@ -83,9 +83,9 @@ if [[ "${CONFIG}" == "0" ]]; then
     # Use built in config
     if [[ "${PROFILE}" == "docker" ]]; then
         # Default
-        CONFIG_PATH="${TITAN_SHARE}/conf/docker.config"
+        CONFIG_PATH="${THEIACOV_SHARE}/conf/docker.config"
     elif [[ "${PROFILE}" == "singularity" ]]; then
-        CONFIG_PATH="${TITAN_SHARE}/conf/singularity.config"
+        CONFIG_PATH="${THEIACOV_SHARE}/conf/singularity.config"
     else
         echo "Unknown profile: ${PROFILE}, exiting..."
         usage 1
@@ -97,23 +97,23 @@ if [[ "${OPTIONS}" != "0" ]]; then
 fi

 mkdir -p ${OUTDIR}
-echo "Running Titan GC (use --quiet to quiet things down a bit)" 1>&2
+echo "Running TheiaCoV GC (use --quiet to quiet things down a bit)" 1>&2
 if [[ ${QUIET} == "0" ]]; then
     java -Dconfig.file=${CONFIG_PATH} -jar ${CROMWELL_JAR} run \
         -i ${INPUTS} \
-        -m ${OUTDIR}/titan-metadata.json \
+        -m ${OUTDIR}/theiacov-metadata.json \
         -o ${CROMWELL_OPTS} \
-        ${TITAN_SHARE}/workflows/wf_titan_gc.wdl 2> ${OUTDIR}/cromwell-stderr.txt | tee ${OUTDIR}/cromwell-stdout.txt
+        ${THEIACOV_SHARE}/workflows/wf_theiacov_gc.wdl 2> ${OUTDIR}/cromwell-stderr.txt | tee ${OUTDIR}/cromwell-stdout.txt
 else
     java -Dconfig.file=${CONFIG_PATH} -jar ${CROMWELL_JAR} run \
         -i ${INPUTS} \
-        -m ${OUTDIR}/titan-metadata.json \
-        -o ${CROMWELL_OPTS} ${TITAN_SHARE}/workflows/wf_titan_gc.wdl > ${OUTDIR}/cromwell-stdout.txt 2> ${OUTDIR}/cromwell-stderr.txt
+        -m ${OUTDIR}/theiacov-metadata.json \
+        -o ${CROMWELL_OPTS} ${THEIACOV_SHARE}/workflows/wf_theiacov_gc.wdl > ${OUTDIR}/cromwell-stdout.txt 2> ${OUTDIR}/cromwell-stderr.txt
 fi

-if [[ -f "${OUTDIR}/titan-metadata.json" ]]; then
-    echo "Titan GC complete, organizing outputs" 1>&2
-    titan-gc-organize.py ${OUTDIR}/titan-metadata.json --outdir ${OUTDIR}
+if [[ -f "${OUTDIR}/theiacov-metadata.json" ]]; then
+    echo "TheiaCoV GC complete, organizing outputs" 1>&2
+    theiacov-gc-organize.py ${OUTDIR}/theiacov-metadata.json --outdir ${OUTDIR}
 else
-    echo "Titan GC did not complete successfully, please review the logs (${OUTDIR}/cromwell-std{err|out}.txt)" 1>&2
+    echo "TheiaCoV GC did not complete successfully, please review the logs (${OUTDIR}/cromwell-std{err|out}.txt)" 1>&2
 fi
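The launcher locates its share directory by taking the path of the `theiacov-gc` executable and deleting the trailing `bin/theiacov-gc` with `sed`, using `=` as the substitution delimiter so the slashes need no escaping. A small sketch of that derivation follows; the install path and version value are assumptions chosen for illustration, standing in for `which theiacov-gc` and the exported `THEIACOV_GC_VERSION`.

```shell
#!/usr/bin/env bash
set -eu

# Assumed install location; the script obtains this from `which theiacov-gc`.
launcher="/opt/conda/envs/theiacov-gc/bin/theiacov-gc"

# sed accepts any delimiter after `s`; `=` avoids escaping the path slashes.
# Deleting "bin/theiacov-gc" leaves the environment prefix, trailing slash included.
prefix=$(echo "${launcher}" | sed 's=bin/theiacov-gc==')

# Assumed value; in CI this variable is exported through $GITHUB_ENV.
THEIACOV_GC_VERSION="2.0.0"

# Mirrors THEIACOV_SHARE in the script. The doubled slash is harmless on POSIX paths.
share="${prefix}/share/theiacov-gc-${THEIACOV_GC_VERSION}"

echo "${prefix}"
echo "${share}"
```

Note that the script runs under `set -u`, so `THEIACOV_GC_VERSION` has to be present in the environment before `theiacov-gc` is invoked; otherwise the expansion aborts the script.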