NOTE: THESE TRAINING RESOURCES ARE BEING PROVIDED TO THEIR FIRST COHORTS AND ARE UNDER ACTIVE DEVELOPMENT
Theiagen Genomics repository for workshop resources, code templates, and exercise materials
This intermediate-level bioinformatics training workshop will provide conceptual and applied training for understanding and utilizing workflow management solutions (e.g. WDL and Nextflow) for interoperable, reproducible, and accessible genomic analysis.
This workshop is designed as a virtual, 4-week series of live lectures and hands-on exercises. For registered trainees, instructors will be available for office hours and continued support throughout the course. All materials (slides, exercises, and recorded lectures) will be made publicly accessible.
At the conclusion of this course, participants will be able to:
- Understand fundamental concepts, advantages, and disadvantages behind containerization and workflow management systems
- Analyze and assess WDL and Nextflow code bases
- Utilize WDL and Nextflow workflow management systems to integrate multiple analytical modules into a single bioinformatics pipeline
- Publish custom workflows to the Dockstore pipeline repository for integration on the Terra.Bio web application
- Launch a Nextflow pipeline on the Nextflow Tower platform
This course is meant for public health bioinformatics scientists who have experience using open-source bioinformatics software through a command-line interface (CLI), experience with version control systems such as Git, and familiarity with the concepts of containerized software systems such as Docker or Singularity. A registered GitHub account is encouraged for completing the planned exercises.
Below is a list of helpful resources that we recommend all trainees review, at least in part, before the start of this training workshop (listed in order of priority, highest first):
- StaPH-B Linux Command Sheet
- StaPH-B Docker User Guide
- GitHub Documentation
- WDL Technical Documentation
- Nextflow Technical Documentation
- Learning miniWDL for WDL (@lynnlangit Educational YouTube Series)
Week 1: Introduction to Workflow Management Using WDL
- Lecture Slides & Recorded Session
- Exercise 00: Setting up your environment
- Exercise 01: Creating a WDL Workflow
Week 2: Closer Look at WDL Tasks and Workflows
Week 3: Connecting WDL Workflows with Terra.Bio
Week 4: Getting Started with Conda and Nextflow
- Google Cloud Platform Virtual Machines (GCP VMs) with all prerequisite software installed will be provisioned for all registered trainees. For those interested in recreating this training in their own compute environment, here is a list of resources required for the completion of each exercise:
Note: All exercises were developed to run on e2-standard-4 GCP VMs (4 CPUs; 16 GB RAM) running Ubuntu 20.04.4 LTS (Focal Fossa).
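
For trainees recreating the training on their own hardware, the following is a minimal, unofficial sketch of how a machine could be compared against the reference VM described in the note above. It is not part of the workshop materials. The function names and the 10% memory tolerance are assumptions made for illustration (operating systems usually report slightly less memory than the nominal size), and the memory lookup relies on `os.sysconf`, which is typically available on Linux only.

```python
import os

# Reference spec taken from the note above: e2-standard-4 (4 CPUs; 16 GB RAM).
REFERENCE_CPUS = 4
REFERENCE_RAM_GB = 16.0


def meets_reference(cpus, ram_gb, tolerance=0.10):
    """Return True if the machine roughly matches or exceeds the reference VM.

    `tolerance` is an assumption of this sketch: RAM within 10% below the
    nominal 16 GB is accepted, because the OS rarely reports the full amount.
    """
    min_ram_gb = REFERENCE_RAM_GB * (1 - tolerance)
    return cpus >= REFERENCE_CPUS and ram_gb >= min_ram_gb


def local_resources():
    """Best-effort (CPU count, physical RAM in GB); RAM is None if unknown."""
    cpus = os.cpu_count() or 0
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        ram_gb = ram_bytes / 1e9
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these configuration names are not available here.
        ram_gb = None
    return cpus, ram_gb


if __name__ == "__main__":
    cpus, ram_gb = local_resources()
    if ram_gb is None:
        print(f"CPUs: {cpus}; RAM could not be determined on this platform")
    else:
        verdict = "meets" if meets_reference(cpus, ram_gb) else "is below"
        print(f"CPUs: {cpus}; RAM: {ram_gb:.1f} GB; "
              f"this machine {verdict} the reference spec")
```

Running the script prints the detected CPU count and memory along with a verdict. A machine below the reference spec may still complete the exercises, only more slowly; the check does not verify the operating system version or installed software.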