Build a Docker container image

Linux containers are a way to build a self-contained environment that includes software, libraries, and other tools. They’re immensely useful and this guide describes my method for creating them.

The steps for setting up a Docker container are as follows:

  1. Install Docker and other software tools
  2. Create a Dockerfile
  3. Build, name and tag the image
  4. Run your container and test
  5. Push to DockerHub
  6. Save the Docker image
  7. Using a Docker image on HPC systems

My workflow is primarily written for users of Windows Subsystem for Linux, but should also be useful for Linux, Windows and Mac users.

Let’s get started…

Some Terminology

An aspect of Docker that often confuses people is the difference between images and containers.

A Docker image (also called a container image) is a read-only (immutable) file that contains the source code, libraries, dependencies, tools, and other files needed for an application to run inside a Docker container. An image can be created from scratch or built on top of an existing image, based on the instructions contained in a Dockerfile.

A Docker container is a running instance of an image: an isolated runtime environment with all the necessary components (code, dependencies, and libraries) needed to run the application without relying on the host machine’s dependencies. Containers are run by the Docker engine on a server, desktop machine, or cloud instance.

A Dockerfile is used to build a Docker image, which is then run to create a Docker container. Docker images can be stored in a Docker registry (such as Docker Hub) and can also be backed up locally as a tar file.

Install Docker and other software tools

To build Docker images (and run containers from them), you’ll need to install Docker Desktop. Versions are available for Windows, Linux and Mac.

I also recommend Visual Studio Code for Windows and Windows Subsystem for Linux users.

Notes for Windows Subsystem for Linux (WSL2) users

If you are using Windows Subsystem for Linux, do not install Docker Desktop inside WSL. Instead, install the Windows version, which will be able to access your WSL2 distributions.

Once installed, start Docker Desktop from the Windows Start menu, then find the Docker icon in the hidden icons menu of your taskbar. Right-click the icon to display the Docker commands menu and select “Change Settings”.

Ensure that “Use the WSL 2 based engine” is checked in Settings > General.

Figure (screen capture of Docker Desktop): Within Docker Desktop, go to Settings > General and then tick “Use the WSL 2 based engine” to make sure Docker works with Windows Subsystem for Linux.

Choose which of your installed WSL 2 distributions should have Docker integration enabled by going to Settings > Resources > WSL Integration.

Figure (screen capture of Docker Desktop): Within Docker Desktop, go to Resources > WSL Integration and then tick “Enable integration with my default WSL distro”, and tick any others that apply if you have more than one.

To check that the installation has worked correctly, try the following command in WSL2:

docker --version

Optional: Test that your installation works correctly by running a simple built-in Docker image using: docker run hello-world
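If either command fails, it helps to know whether the docker command is missing altogether or whether the engine is simply not running. The following is a minimal sketch of such a check. The function name docker_status and its three status words are my own illustration and are not part of Docker.

```shell
# Report whether the Docker CLI is installed and whether it can reach the engine.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "missing"        # the docker command is not on PATH
    elif ! docker info >/dev/null 2>&1; then
        echo "not-running"    # CLI found, but the engine did not respond
    else
        echo "ok"
    fi
}

state="$(docker_status)"
echo "Docker status: ${state}"
```

A result of "missing" inside WSL2 usually points back to the WSL Integration setting described above, while "not-running" suggests that Docker Desktop needs to be started.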

Create a Dockerfile

A Dockerfile is a plain text file with keywords that add elements to a Docker image. There are many keywords that can be used in a Dockerfile (documented on the Docker website), but I will keep it simple with the following outline:

Make the Dockerfile

In Visual Studio Code (or a coding editor of your choice), create a new file and save it as Dockerfile (no extension).

Choose a base image with FROM

You don’t need to create everything from scratch. Instead, you may want to choose a “base” image to add things to. For instance, if you’re using Python software, a good starting point might be an “official” Python image. You can search the Docker Hub to locate images.

Once you’ve decided on a base image and version, add it as the first line of your Dockerfile, like this:

FROM repository/image:tag

Some images are curated by Docker itself (these are the “official” images mentioned above) and do not need a repository prefix. As an example, if I wanted to create a container with Python 3.13, I would add this as the first line of my Dockerfile:

FROM python:3.13.0

When possible, you should use a specific tag (not the automatic latest tag) in FROM statements.
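To make that habit easy to check, here is a rough sketch of a shell function that reads the FROM lines of a Dockerfile and reports whether each base image is pinned. The name check_from_tags is hypothetical, and the check is deliberately simple: it assumes plain FROM image:tag lines and does not handle options such as --platform or registry addresses that contain a port.

```shell
# Report, for each FROM line, whether the base image uses a specific tag.
check_from_tags() {
    local dockerfile="$1" image
    grep -iE '^[[:space:]]*FROM[[:space:]]' "$dockerfile" |
    while read -r _ image _; do
        case "$image" in
            *:latest) echo "unpinned: $image" ;;   # explicit latest tag
            *:*|*@*)  echo "pinned: $image" ;;     # specific tag or digest
            *)        echo "unpinned: $image" ;;   # no tag means latest
        esac
    done
}

# Try it on a throwaway Dockerfile
demo_file="$(mktemp)"
printf 'FROM python:latest\n' > "$demo_file"
check_from_tags "$demo_file"
```

For the throwaway file above, the function prints "unpinned: python:latest". Changing the line to FROM python:3.13.0 would be reported as pinned.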

Install packaged software with RUN

This step can be a bit tricky. We need to add commands to the Dockerfile to install the desired software. There are a few standard ways to do this:

  1. Use a Linux package manager. This is usually apt-get for Debian-based containers (e.g., Ubuntu) or yum for RedHat Linux containers (e.g., CentOS).
  2. Use a software-specific package manager (like pip or conda for Python).
  3. Use installation instructions (usually a progression of configure, make, make install).

Each of these options will be prefixed by the RUN keyword. You can join together linked commands with the && symbol; to break lines, put a backslash \ at the end of the line. RUN can execute any command inside the image during construction, but keep in mind that the only thing kept in the final image is changes to the filesystem (new and modified files, directories, etc.).
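The point about only filesystem changes being kept is easiest to see in a small experiment. The sketch below writes a throwaway Dockerfile to a temporary folder. The file name note.txt and the folder are purely illustrative, and the suggested build command in the final comment assumes a Docker version that supports the --progress option.

```shell
demo_dir="$(mktemp -d)"

cat > "${demo_dir}/Dockerfile" <<'EOF'
FROM python:3.12.7

# Commands joined with && run in one shell, so the cd applies to the echo.
# The file /tmp/note.txt is a filesystem change, so it is kept in the image.
RUN cd /tmp && \
    echo "saved" > note.txt

# Each RUN starts a fresh shell: the earlier cd is forgotten, so pwd prints
# the image's default working directory, but the file written above is still there.
RUN pwd && cat /tmp/note.txt
EOF

echo "Wrote ${demo_dir}/Dockerfile"
# To try it (with Docker running): docker build --progress=plain "${demo_dir}"
```

The same reasoning applies to shell variables set with export inside a RUN line: they disappear when that line finishes, which is why persistent settings belong in ENV.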

For example, suppose that your job’s executable ends up running Python and needs access to the Python package plantcv, as well as the Unix tool wget. Below is an example of a Dockerfile that uses RUN to install these using the system package manager (apt-get) and Python’s package manager (pip).

# Build the image based on the official Python 3.12.7 image
FROM python:3.12.7

# Our base image happens to be Debian-based, so it uses apt-get as its system package manager.
# pip already ships with the official Python image, so only wget is installed here.
RUN apt-get update && \
    apt-get install -y --no-install-recommends wget && \
    rm -rf /var/lib/apt/lists/*

# Use RUN to install the Python package PlantCV via pip, Python's package manager
RUN pip install plantcv==4.5.1

One of the benefits of Docker containers is their reproducibility. Therefore, I consider it good practice to specify the version of each piece of software that goes into it when using pip. So, I’d choose pip install plantcv==4.5.1 rather than just pip install plantcv.

If you need to copy specific files (like source code) from your computer into the image, place the files in the same folder as the Dockerfile and use the COPY keyword. You could also download files within the image by using the RUN keyword and commands like wget or git clone.
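As a small illustration of COPY, the sketch below creates a throwaway build folder containing a script and a Dockerfile that copies it into the image. The file name analysis.py and the destination /opt/app/ are hypothetical choices for this example only.

```shell
context_dir="$(mktemp -d)"

# A hypothetical source file that should end up inside the image
printf 'print("hello from inside the image")\n' > "${context_dir}/analysis.py"

cat > "${context_dir}/Dockerfile" <<'EOF'
FROM python:3.12.7

# Copy analysis.py from the folder that holds the Dockerfile
# into /opt/app/ inside the image (the trailing slash marks a directory)
COPY analysis.py /opt/app/
EOF

ls "${context_dir}"
```

COPY can only see files inside the folder you pass to docker build, which is why the source file sits next to the Dockerfile.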

Set up your environment with ENV

Your software might rely on certain environment variables being set correctly.

One common situation is that if you’re installing a program to a custom location (like a home directory), you may need to add that directory to the image’s system PATH. For example, if you installed some scripts to /home/software/bin, you could use

ENV PATH="/home/software/bin:${PATH}"

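Some useful environment variables that I set when creating Python-based containers are shown below:

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_BREAK_SYSTEM_PACKAGES=1

PYTHONUNBUFFERED sends Python output straight to the terminal or log, PYTHONDONTWRITEBYTECODE stops Python from writing .pyc files, PIP_NO_CACHE_DIR disables pip’s cache (which keeps the image smaller), and PIP_BREAK_SYSTEM_PACKAGES lets pip install into an environment that is marked as externally managed.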

So the final Dockerfile will appear as follows:

# Build the image based on the official Python version 3.12.7 image
FROM python:3.12.7

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_BREAK_SYSTEM_PACKAGES=1

# Install Linux packages via apt-get
RUN apt-get update && \
    apt-get install -y --no-install-recommends wget

# Install Python packages via pip
RUN pip install plantcv==4.5.1

Build, name and tag the image

So far, the image has not been built. I have just compiled instructions on how to build it.

First, decide on a name for your image, as well as a tag. Tags are important for tracking which version of the image you’ve created (and are using). A simple tag scheme would be to use numbers (e.g. v0, v1, etc.), but you can use any system that makes sense to you. I am going to call my image plantcv and give it the tag 4.5.1a (after the version of PlantCV that it will run).

To build and tag your image, open a terminal and navigate to the folder that contains your Dockerfile:

$ cd path/to/directory

Make sure Docker is running (there should be an icon on your status bar, and docker info should not report any errors), then run:

$ docker build -t username/imagename:tag .

In my case, my Docker username is adamdimech, so the correct command for me will be:

$ docker build -t adamdimech/plantcv:4.5.1a .

If you get errors, try to determine what you may need to add or change to your Dockerfile and then run the build command again. Debugging a Docker build is largely the same as debugging any software installation process.
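Once the build succeeds, it is worth confirming that the image exists under the name and tag you intended. A minimal sketch follows. The function name build_image is my own, and it assumes that you run it from the folder containing your Dockerfile.

```shell
# Build an image as username/imagename:tag, then list it to confirm the result.
build_image() {
    local user="$1" name="$2" tag="$3"
    local ref="${user}/${name}:${tag}"
    # Build from the Dockerfile in the current folder, then show matching images
    docker build -t "$ref" . && docker image ls "${user}/${name}"
}

# Usage: build_image adamdimech plantcv 4.5.1a
```

Listing by repository name shows every tag you have built for that image, which is handy for spotting an old tag that you meant to replace.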

Run your container and test

You should test your Docker container locally to ensure everything is working as it should. To interact with a Docker container, use the following command:

docker run -it username/image:tag /bin/bash

Replace the placeholders with the particulars of your image. In my case:

docker run -it adamdimech/plantcv:4.5.1a /bin/bash

This will start a container from the image and open a command-line shell inside it. You should see your command line prompt change to something like:

root@6ed0ab0aafb4:/#

When you’re ready to leave the container, type exit.
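For a quick check that does not need an interactive shell, you can also run single commands in a temporary container. The sketch below assumes the image built earlier in this guide (with plantcv and wget installed). The function name smoke_test is hypothetical.

```shell
# Run two one-off checks in temporary containers created from the given image.
smoke_test() {
    local image="$1"
    # --rm deletes each container as soon as its command finishes
    docker run --rm "$image" python -c "import plantcv" &&
        docker run --rm "$image" wget --version
}

# Usage: smoke_test adamdimech/plantcv:4.5.1a
```

If the Python import fails, the function stops there and returns a non-zero status, so it can be chained with && in front of a push command.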

Push to DockerHub

Once your image has been successfully built and tested, you can push it to DockerHub. Pushing it to DockerHub means that it will be available for others to use. It also means that you can pull your image onto another machine, such as a high-performance computing cluster.

To push to DockerHub, use the following command:

$ docker push username/imagename:tag

If you have not previously logged in to DockerHub, you may need to run this command beforehand:

$ docker login
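A push only works when the image name starts with your DockerHub username. If you built the image under a different name, you can give it an additional name with docker tag rather than rebuilding. The sketch below shows one way to combine the two steps. The function name publish_image and the local name plantcv:4.5.1a are illustrative assumptions.

```shell
# Give an existing local image a DockerHub name, then push it.
publish_image() {
    local local_ref="$1" hub_ref="$2"
    # docker tag adds a second name to the same image; nothing is copied or rebuilt
    docker tag "$local_ref" "$hub_ref" && docker push "$hub_ref"
}

# Usage: publish_image plantcv:4.5.1a adamdimech/plantcv:4.5.1a
```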

Save the Docker image

Unfortunately, if you have a free account on DockerHub, any container image that you have pushed there may be scheduled for removal if it is not used (pulled) at least once every 6 months (refer to the Docker Terms of Service for the current policy). For this reason, it’s a good idea to save your image to a file and store it somewhere safe. The following command will create a tarball of your image:

$ docker save --output archive-name.tar username/imagename:tag
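A backup is only useful if you can restore it. The counterpart of docker save is docker load, which reads the tarball back into your local image list. The sketch below wraps both commands. The function names backup_image and restore_image are my own.

```shell
# Write an image to a tar archive: backup_image IMAGE ARCHIVE
backup_image() {
    docker save --output "$2" "$1"
}

# Read an archive back into the local image list: restore_image ARCHIVE
restore_image() {
    docker load --input "$1"
}

# Usage:
#   backup_image adamdimech/plantcv:4.5.1a archive-name.tar
#   restore_image archive-name.tar
```

After restoring, the image appears under its original name and tag, so the docker run commands shown earlier work unchanged.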

Using a Docker image on HPC systems

If you intend to run your container on a high-performance computing (HPC) system, you may find that your system administrator does not support Docker.

Other container runtimes such as Shifter are better suited to HPC systems and work seamlessly with Docker images. For instance, to pull a Docker image, use the following command (using the previous example):

shifter pull adamdimech/plantcv:4.5.1a

Then to access the image, use the --image flag as shown below. You can also pass a command to run inside the container if required, for instance:

shifter --image=adamdimech/plantcv:4.5.1a plantcv-run-workflow --config "/path/to/config.json"

There is a wide variety of alternatives to Shifter for HPC systems, so check with your system admin for the particulars.
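As one illustration of such an alternative, some clusters provide Apptainer (formerly Singularity), which can convert an image from DockerHub into a single .sif file and run commands inside it. This is only a sketch under the assumption that Apptainer is installed on your system. The function name run_with_apptainer and the file name plantcv.sif are hypothetical.

```shell
# Pull a DockerHub image as a .sif file (if not already present), then run a command in it.
run_with_apptainer() {
    local image="$1" sif="$2"
    shift 2
    # Convert the DockerHub image into a local .sif file, skipping this if the file exists
    [ -f "$sif" ] || apptainer pull "$sif" "docker://${image}" || return 1
    # Run the remaining arguments as a command inside the container
    apptainer exec "$sif" "$@"
}

# Usage:
#   run_with_apptainer adamdimech/plantcv:4.5.1a plantcv.sif \
#       plantcv-run-workflow --config "/path/to/config.json"
```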

   
