Module 3: Containers and Reproducible Builds

Module Overview

"Works on my machine" is a common state for code written without a software engineering background. For code (and science) to be trusted, it must be reproducible: anyone should be able to run it elsewhere and get the same result.

We've already learned about pipenv as a Python packaging tool, which goes a long way towards giving reproducible builds - but for even greater reproducibility (and deployability), containers are the tool of choice. A container is a lightweight, isolated environment that packages an application together with the system libraries and other software it needs to run, while sharing the host's kernel rather than booting a full virtual machine. Because containers pack everything together, they run the same way regardless of host.

Docker is a common standard and tool for containers, and we will use it to build and run Linux containers with Python code.
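
As a rough illustration of what building a Linux container with Python code can look like, here is a minimal Dockerfile sketch. It is not taken from the guided project: the base image tag, the /app directory, and the app.py script name are all assumptions chosen for the example, and it presumes a Pipfile and Pipfile.lock already exist next to the Dockerfile.

```dockerfile
# Assumed base image: the official slim Python image from Docker Hub.
# The tag should match the python_version declared in your Pipfile.
FROM python:3.11-slim

# Work inside /app in the container (hypothetical path).
WORKDIR /app

# Install pipenv, then copy only the dependency files so this layer
# is cached until the Pipfile or Pipfile.lock changes.
RUN pip install --no-cache-dir pipenv
COPY Pipfile Pipfile.lock ./

# --system installs into the image's Python instead of a virtualenv;
# --deploy aborts the build if Pipfile.lock is out of date.
RUN pipenv install --system --deploy

# Copy the application code (app.py is a hypothetical script name).
COPY app.py .

# Run the script when the container starts.
CMD ["python", "app.py"]
```

Under these assumptions, running docker build -t my-python-app . in the same directory would build the image, and docker run --rm my-python-app would start a container from it (my-python-app is a placeholder name). Installing from the lock file is what ties the container back to the pipenv workflow: the same pinned dependency versions are installed on every build.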

Learning Objectives

Guided Project

In this guided project, we'll learn how to create Docker containers for reproducible Python environments. Open guided-project.md in the GitHub repository below to follow along.

Module Assignment

For this assignment, you'll create Docker containers and Dockerfiles to demonstrate your understanding of containerization and reproducible builds.

Solution Video

Additional Resources

Docker Fundamentals

Advanced Tools

Machine Learning with Docker