Scientific discovery in the cloud depends on an orchestrated, reproducible workflow that combines data from many experimental sources. Reproducibility not only makes you more productive but also lets others validate your data and discover the story inside it. The Science in the Cloud Institute teaches the tools you need to explore your data and find that story, and to create a reproducible environment in which others can explore and validate your findings. In this Institute, we introduce virtualization tools such as virtual machines, container technologies such as Docker and Singularity, and orchestration tools such as Kubernetes. TACC's expert domain scientists and developers lead hands-on sessions that show you how to use these tools alongside scientific analysis software to develop a production-ready workflow that can easily be applied to your science. They also introduce you to the latest hardware that powers TACC's Cloud platform. By the end of this Institute, you will be ready to take your science to the cloud.
Participants should bring their laptops and plan to participate actively. Laptops will require a terminal application for accessing compute resources.
Click here for more information about the course instructors.
Time | Topic |
---|---|
9:00 - 10:15 | Cloud Concepts |
10:15 - 10:30 | Break |
10:30 - 11:45 | Deploying VMs and Volumes in Jetstream |
11:45 - 13:00 | Lunch |
13:00 - 14:15 | Introduction to Docker Containers |
14:15 - 14:30 | Break |
14:30 - 16:00 | Docker Compose |
Time | Topic |
---|---|
9:00 - 10:15 | Reproducibility in Research: Version Control |
10:15 - 10:30 | Break |
10:30 - 11:45 | Docker in the Cloud |
11:45 - 13:00 | Lunch |
13:00 - 14:15 | Python 101, Power Python, Matplotlib, Numpy 101 |
14:15 - 14:30 | Break |
14:30 - 16:00 | Pandas, Data Science Introduction, Disease Propagation |
Pandas101
Matplotlib
Numpy Jacobi
Pandas Example
Austin Traffic*
*Traffic.csv is needed for the AustinTraffic notebook.
Time | Topic |
---|---|
9:00 - 10:15 | Exercise: Human SNP Analysis Part 1 |
10:15 - 10:30 | Break |
10:30 - 11:45 | Exercise: Human SNP Analysis Part 2 |
11:45 - 13:00 | Lunch |
13:00 - 14:15 | Data Management and Movement |
14:15 - 14:30 | Break |
14:30 - 16:00 | Exercise: Testing and Reproducibility |
Time | Topic |
---|---|
9:00 - 10:15 | Ansible & Kubernetes |
10:15 - 10:30 | Break |
10:30 - 11:45 | TACC as Cloud |
11:45 - 13:00 | Lunch |
13:00 - 14:15 | TACC Tour |
14:15 - 14:30 | Break |
14:30 - 16:00 | One-on-one by request |