Python in Life Sciences: How Python Drives the Analysis of Billions of DNA Sequences
Science
30 minutes
A genomics research center produces large amounts of data every day; a single one of the new Illumina sequencing machines can produce around 2 TB of data, spread across millions of files, in under 3 days.
The first part will focus on how Python manages the preprocessing and analysis of billions of DNA sequences in a completely automated way. We will also cover how sequencing results are visualized using Flask and MongoEngine to solve medical mysteries in the clinic today.
- Any Python programmer with interest in how Python is applied to the growing life sciences field of genomics.
- Any scientist with interest in how other labs are managing the complex data flow and analysis of a genomics facility.
Intermediate
Attendees will learn about a state-of-the-art genomics pipeline and how we use Python to manage, store, and analyze large amounts of biologically significant data.
They will also get a peek behind the website that allows clinicians to productively review results from DNA sequencing and make life-changing diagnoses.
The presentation is structured into 3 parts.
Each part has a slides.md file that accompanies the slides for that part, which are located in the same directory under the name presentation.key.
The complete slide set is located here.