NOTE: THESE TRAINING RESOURCES ARE BEING PROVIDED TO THEIR FIRST COHORTS AND ARE UNDER ACTIVE DEVELOPMENT


Workflow Management Solutions for Public Health Bioinformatics

Theiagen Genomics repository for workshop resources, code templates, and exercise materials

Course Overview

This intermediate-level bioinformatics training workshop will provide conceptual and applied training for understanding and utilizing workflow management solutions (e.g. WDL and Nextflow) for interoperable, reproducible, and accessible genomic analysis.

Length of Program

This workshop is designed as a virtual, 4-week series of live lectures and hands-on exercises. For registered trainees, instructors will be available for office hours and continued support throughout the course. All materials (slides, exercises, and recorded lectures) will be made publicly accessible.

Objectives

At the conclusion of this course, participants will be able to:

  • Understand the fundamental concepts, advantages, and disadvantages of containerization and workflow management systems
  • Analyze and assess WDL and Nextflow code bases
  • Utilize WDL and Nextflow workflow management systems to integrate multiple analytical modules into a single bioinformatics pipeline
  • Publish custom workflows to the Dockstore pipeline repository for integration on the Terra.Bio web application
  • Launch a Nextflow pipeline on the Nextflow Tower platform

Target Audience

This course is meant for public health bioinformatics scientists who have experience using open-source bioinformatics software through a command-line interface (CLI), experience with version control systems such as Git, and familiarity with the concepts of containerized software systems such as Docker or Singularity. A registered GitHub account is encouraged for completing all planned exercises.

Below is a list of helpful resources that we recommend all trainees review, at least in part, prior to the start of this training workshop (listed in order of highest priority):

Course Content

Slides & Exercises

Week 1: Introduction to Workflow Management Using WDL

Week 2: Closer Look at WDL Tasks and Workflows

Week 3: Connecting WDL Workflows with Terra.Bio

Week 4: Getting Started with Conda and Nextflow

Exercise Resource Requirements

  • Google Cloud Platform Virtual Machines (GCP VMs) with all prerequisite software installed will be provisioned for all registered trainees. For those interested in recreating this training in their own compute environment, here is a list of resources required for the completion of each exercise:

Note: All exercises were developed to run on e2-standard-4 GCP VMs (4 vCPUs; 16 GB RAM) running Ubuntu 20.04.4 LTS (Focal Fossa).
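
For trainees recreating the exercises on their own machine, the following is a minimal Python sketch (not part of the official course materials) that checks whether a Linux host roughly matches the e2-standard-4 sizing noted above. The function names, the 10% memory tolerance, and the reading of 16 GB as 16 GiB are illustrative assumptions; the tolerance is there because an operating system typically reports slightly less memory than the nominal VM size.

```python
import os

# Minimums taken from the note above (e2-standard-4: 4 CPUs, 16 GB RAM).
MIN_CPUS = 4
MIN_MEM_GIB = 16
# Assumption: allow a 10% shortfall, since the OS usually reports
# slightly less than the nominal amount of RAM.
MEM_TOLERANCE = 0.10


def meets_requirements(cpus, mem_bytes, min_cpus=MIN_CPUS,
                       min_mem_gib=MIN_MEM_GIB, tolerance=MEM_TOLERANCE):
    """Return True if the CPU count and memory satisfy the minimums."""
    required_bytes = min_mem_gib * (1 - tolerance) * 1024**3
    return cpus >= min_cpus and mem_bytes >= required_bytes


def detect_resources():
    """Return (cpu_count, total_memory_bytes) for the current Linux host."""
    cpus = os.cpu_count() or 0
    # Total physical memory = page size * number of physical pages.
    mem_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cpus, mem_bytes


if __name__ == "__main__":
    cpus, mem_bytes = detect_resources()
    mem_gib = mem_bytes / 1024**3
    status = "OK" if meets_requirements(cpus, mem_bytes) else "below recommended size"
    print(f"CPUs: {cpus}, memory: {mem_gib:.1f} GiB -> {status}")
```

Running the script on the target machine prints the detected CPU count and memory along with a pass or fail status. A smaller machine may still run some of the exercises, so treat a failing result as a warning, not a hard limit.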