3 ways to build docker images with ansible

It's been over a year since I ditched Chef for Ansible as my config management tool of choice. Now I am really interested in Docker. Docker provides a simple abstraction for working with containers, but working with docker involves running a lot of commands manually. Maybe ansible can help us automate working with docker.

In this post, I will show you three different ways to build docker images with ansible. We will see some of the tradeoffs involved with each approach and discover whether or not ansible is a good tool for building docker images.

How to follow along

If you want to follow along, you should clone the example repo and use the provided vagrant box. You also need to have the latest version of ansible installed. I am using ansible 1.9.1 and docker 1.6.2 in this post.

git clone git@github.com:jbgo/ansible-build-docker-images.git
cd ansible-build-docker-images
vagrant up

I do not recommend using docker machine because it uses boot2docker, which does not come with python installed. Ansible requires a working python installation on the managed hosts. It is typically safe to assume that a linux distro comes with python, but it is now popular to run minimalistic distros for the docker host.

A simple nginx site

We are going to use three different techniques to build a docker container for a static site. The build process involves installing nginx, replacing the nginx configuration file with our own, and copying the index.html file to the appropriate directory. When running, the site looks like this.

[Image: a simple nginx site]

I will refer to the files required to build the site as the site files. All of the site files are in the site directory.

Building with the docker_image module (deprecated)

My first instinct is to use the docker_image module since ansible includes it. The module docs contain a deprecation warning, but I want to try it anyway.

Let's start by uploading the site directory to the docker host. This is the first task in example1.yml.

- name: upload the site directory to the docker host
  synchronize: src=site dest=/tmp

This directory contains my nginx configuration, an index.html file, and a Dockerfile describing how to build the image.

FROM ubuntu:latest

RUN apt-get update && apt-get install -y nginx
ADD nginx.conf /etc/nginx/nginx.conf

RUN mkdir -p /var/www/site
ADD index.html /var/www/site/index.html

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

With the site directory on the docker host, we can use the docker_image module to build it.

- name: build the image
  docker_image: >
    name=built-by-ansible
    tag=ex1
    path=/tmp/site
    state=present

We can use the docker module, which is fully supported by ansible, to run a container from the new image, built-by-ansible:ex1.

- name: run the site in a docker container
  docker:
    name: site1
    image: "built-by-ansible:ex1"
    state: reloaded
    publish_all_ports: yes

We name the running container site1 and use the publish_all_ports option to make any exposed container ports (port 80 for our site container) available via a random host port.

It's time to run example1.yml.

ansible-playbook -i hosts example1.yml

Using the docker ps command, we can see which host port docker chose.

$ vagrant ssh -c 'docker ps'
CONTAINER ID        IMAGE                  COMMAND                PORTS                   NAMES
dc2f2b4b0874        built-by-ansible:ex1   "nginx -g 'daemon of   0.0.0.0:32768->80/tcp   site1

You can visit http://192.168.33.100:32768/ in your browser to view the site. If you are following along, you will likely have a different port.
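If you would rather not read the port off the docker ps table by eye, here is a minimal sketch of a helper that pulls it out of a port mapping. It assumes you feed it one line in the form printed by docker port site1 80 (for example 0.0.0.0:32768); the function name host_port_from_mapping is my own and not part of the example repo.

```shell
# Print the host port from one line of `docker port` output.
# Input looks like "0.0.0.0:32768"; the port is everything after the last colon.
host_port_from_mapping() {
  mapping=$1
  printf '%s\n' "${mapping##*:}"
}

# Usage on the docker host (requires a running container named site1):
#   host_port_from_mapping "$(docker port site1 80)"
```

The helper only does string handling, so it can be tried without docker by passing a mapping string directly.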

One disadvantage to using the docker_image module is that it does not rebuild the image when the site files change. For example, if we were to change the Dockerfile or the nginx config file, we would need to rebuild the image. Instead, ansible will only rebuild the image when the module's name or tag options change.

We could try to work around this by rebuilding the image whenever the synchronize module indicates a change. All we need to do is register the result of the sync task by adding register: sync_result, then conditionally run the build task by adding when: sync_result.changed.

- name: upload the site directory to the docker host
  synchronize: src=site dest=/tmp
  register: sync_result

- name: build the image
  docker_image: >
    name=built-by-ansible
    tag=ex1
    path=/tmp/site
    state=present
  when: sync_result.changed

Now our image gets rebuilt when we change site files, but the build task gets skipped entirely if we change nothing. If we change the image name or tag but not the site files, our image will not be rebuilt. That's not what we want!
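To make the gap concrete, here is a rough sketch of the decision we actually want, written as a plain shell function rather than ansible. It rebuilds when the files changed or when no image exists yet for the wanted name and tag. The function name needs_rebuild and its two arguments are hypothetical, and the usage comment assumes that docker images -q NAME:TAG prints an image ID only when that image exists.

```shell
# Decide whether the image needs to be built.
#   $1: "true" if the site files changed (like sync_result.changed)
#   $2: ID of the existing image for the wanted name:tag, empty if there is none
needs_rebuild() {
  files_changed=$1
  existing_image_id=$2
  [ "$files_changed" = "true" ] || [ -z "$existing_image_id" ]
}

# Usage on the docker host (assumes `docker images -q NAME:TAG` prints the
# image ID, or nothing when the image does not exist):
#   if needs_rebuild "$changed" "$(docker images -q built-by-ansible:ex1)"; then
#     docker build -t built-by-ansible:ex1 /tmp/site
#   fi
```

This is only an illustration of the logic. It is not something the docker_image module does for you.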

One final reason not to use the docker_image module, besides being deprecated, is that it does not support communicating with docker over https.

Building with the command module

At this point, I've hopefully convinced you not to use the docker_image module. It may be better to run the docker build command directly. Typically you want to use an ansible module whenever one is available because modules are idempotent. Modules will only run if the actual and desired states are different. However, the docker build command has its own built-in caching mechanism to help speed up subsequent builds by only rebuilding image layers that require a change, so using an idempotent module is redundant.

Let's replace the docker_image module with the command module.

- name: build the image
  command: docker build -t built-by-ansible:ex2 /tmp/site

That's it. We lose idempotency, but we ensure that our image is always built correctly. Time to run example2.yml.

ansible-playbook -i hosts example2.yml

Again, you can use vagrant ssh -c 'docker ps' to find the host port for the container and view it in your browser. Notice that we've tagged this image :ex2 and named the container site2.

Another advantage of the command module is that you can run your tasks locally and communicate with a remote docker host over https. For example, if you install docker machine, you can skip the synchronize task and run the tasks on localhost.

---
- name: Build an image with the command module for a remote docker host
  hosts: localhost
  connection: local

  tasks:
    - name: build the image
      command: docker build -t built-by-ansible:ex2b ./site

    - name: run the site in a docker container
      docker:
        name: site2b
        image: "built-by-ansible:ex2b"
        state: reloaded
        publish_all_ports: yes
        use_tls: encrypt

To get the docker module to work with docker machine, I had to use use_tls: encrypt instead of use_tls: verify (verify is more secure). I plan to revisit the docker module soon to understand why I had to do that and if there is a way to use verify. If you know why, please let me know in the comments.

To run this playbook, you will need to make sure you can communicate with your docker machine. I'm using the dev machine created during the default installation.

eval $(docker-machine env dev)

This sets up some environment variables to tell your local docker client how to communicate with the remote docker host. Once that's done, you can run example2b.yml.
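Before running the playbook, it can help to confirm that the eval actually took effect in your current shell. The sketch below lists any of the variables that are still unset. I am assuming the three variables docker-machine env normally exports (DOCKER_HOST, DOCKER_CERT_PATH, and DOCKER_TLS_VERIFY); the function name docker_env_missing is made up for this example.

```shell
# Print the names of any docker client variables that are unset or empty.
# Prints an empty line when everything needed is present.
docker_env_missing() {
  missing=""
  for var in DOCKER_HOST DOCKER_CERT_PATH DOCKER_TLS_VERIFY; do
    # Indirect lookup of the variable named in $var.
    eval "value=\${$var:-}"
    [ -n "$value" ] || missing="$missing $var"
  done
  printf '%s\n' "${missing# }"
}

# Usage:
#   eval $(docker-machine env dev)
#   [ -z "$(docker_env_missing)" ] || echo "missing: $(docker_env_missing)"
```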

ansible-playbook -i hosts example2b.yml

That's it for docker machine today. The rest of the playbooks in this post will be run on our vagrant box.

Running a playbook inside a container

So far, our approach involves invoking docker's build system from ansible. The Dockerfile approach is simple and works well, but it lacks many of the features ansible provides. And if you already have ansible roles and playbooks for building your app, you might prefer to skip the process of translating those into dockerfiles.

Ansible provides a docker image with ansible already installed, so let's try it. The example provided by ansible still involves the use of a dockerfile. This may be a good idea, but for now I don't want to use a dockerfile at all. To accomplish this requires a 2-step build process.

1. Use docker run to run ansible in a container.
2. Use docker commit to create an image from the container in step 1.

Additionally, I would like to remove the container I used to build my site because I don't need it anymore. But if the build fails, I would like to keep the container so I can debug the failure. Because docker doesn't allow duplicate container names, I will use a timestamp to name the container I run ansible in.

- name: create a unique temp container name
  set_fact:
    temp_container_name: ex3_build_{{lookup('pipe', 'date "+%Y%m%d%H%M%S"')}}

I'm using the set_fact module with a lookup to create the temp_container_name fact. I can now use this fact in later tasks. I originally tried to create this as a variable, not as a fact, but surprisingly ansible would re-evaluate the lookup at the time of each task, so the timestamp changed from task to task.
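The same compute-once idea looks like this in plain shell. This is just a sketch to show why the value has to be captured a single time: the date command runs when the variable is assigned, and every later use reads the stored string. The function name make_temp_container_name is hypothetical.

```shell
# Build a container name from a prefix and the current timestamp.
make_temp_container_name() {
  prefix=$1
  printf '%s_%s\n' "$prefix" "$(date '+%Y%m%d%H%M%S')"
}

# Compute the name once, then reuse the variable in every later command.
# Calling the function again later would produce a different timestamp.
temp_container_name=$(make_temp_container_name ex3_build)
```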

Next I run ansible in a docker container.

- name: build site by running ansible in a docker container
  command: "docker run
    -v /tmp/site:/site
    -w /site
    --name={{temp_container_name}}
    ansible/ubuntu14.04-ansible:latest
    ansible-playbook playbook.yml -c local"

Let's break down those options. (see also: docker help run).

docker run
run a command in a new container
-v /tmp/site:/site
mount /tmp/site from the docker host as /site in the container. This is how we can access our site files from within the container.
-w /site
set /site as the working directory in our container. The playbook we run inside the container is in this directory.
--name={{temp_container_name}}
name our container so we can reference it in later tasks
ansible/ubuntu14.04-ansible:latest
the official ansible base image
ansible-playbook playbook.yml -c local
the command to run

Here are the contents of playbook.yml. We run this playbook inside the docker container to build our site. There is nothing special about this playbook, but it's important that we run it for localhost with a local connection.

---
- name: configure the container with ansible
  hosts: localhost
  tasks:
    - name: install nginx
      apt: pkg=nginx state=installed

    - name: create the site directory
      file: dest=/var/www/site state=directory recurse=yes

    - name: copy the site directory in place
      synchronize: src=/site dest=/var/www

    - name: add the nginx configuration
      template: src=/site/nginx.conf dest=/etc/nginx/nginx.conf

If the task fails, we can use docker ps -a to find our container for troubleshooting. Assuming the task succeeds, we can create an image from our container.

- name: create a docker image from the container
  command: "docker commit
    -c 'EXPOSE 80'
    -c 'CMD [\"nginx\", \"-g\", \"daemon off;\"]'
    {{temp_container_name}}
    built-by-ansible:ex3"

Again, let's break this command down line-by-line.

docker commit
create a new image including the changes in the container's read-write layer
-c 'EXPOSE 80'
containers will listen on port 80
-c 'CMD [\"nginx\", \"-g\", \"daemon off;\"]'
run nginx as the default command
{{temp_container_name}}
the name of the container we ran ansible in
built-by-ansible:ex3
the name and tag of our image

At this point we've successfully built our image, but to keep things tidy, let's delete the container we used for running ansible. We don't need this container anymore because we will start a new one based on our image instead.

- name: delete the container once the image has been successfully built
  command: docker rm -f -v {{temp_container_name}}

Now we are ready to run example3.yml.

ansible-playbook -i hosts example3.yml
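For reference, the run, commit, and remove steps from the tasks above can be sketched as a single shell function. It uses the same docker commands and options as the playbook and keeps the container around when the ansible run fails. The function name and the DOCKER override variable are my own additions for illustration (the override makes it possible to swap in a different client binary); they are not part of example3.yml.

```shell
# Build an image by running ansible in a container, then committing it.
#   $1: name for the temporary build container
#   $2: name:tag for the resulting image
# DOCKER may be set to use a different docker client (hypothetical knob).
build_with_ansible_container() {
  container=$1
  image=$2
  docker_cmd=${DOCKER:-docker}

  # Step 1: run the playbook inside a container.
  # On failure, return early so the container is kept for debugging.
  if ! "$docker_cmd" run -v /tmp/site:/site -w /site --name="$container" \
      ansible/ubuntu14.04-ansible:latest \
      ansible-playbook playbook.yml -c local; then
    echo "build failed; keeping container $container for debugging" >&2
    return 1
  fi

  # Step 2: create an image from the container's filesystem.
  "$docker_cmd" commit -c 'EXPOSE 80' -c 'CMD ["nginx", "-g", "daemon off;"]' \
    "$container" "$image" || return 1

  # Cleanup: the build container is no longer needed.
  "$docker_cmd" rm -f -v "$container"
}

# Usage on the docker host:
#   build_with_ansible_container "ex3_build_$(date '+%Y%m%d%H%M%S')" built-by-ansible:ex3
```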

This example is more complex than building with a dockerfile. However, it may be worth it if your application has a unique and complicated setup process and you already have the ansible roles and playbooks written. I suspect that as it becomes more common to run apps in containers, we will see fewer special snowflake applications and more apps standardize around common stack-specific base images.

There are a couple of drawbacks to this approach (besides the additional complexity).

Despite building the exact same site, the dependencies required to run ansible in a container more than double the image size. You can see below that :ex3 is about 483 MB while :ex1 and :ex2 are only about 206 MB.

vagrant@docker:~$ docker images | grep ex
built-by-ansible              ex3                 c23ebd13b101        12 minutes ago      482.9 MB
built-by-ansible              ex2                 6394d2059085        32 minutes ago      206.4 MB
built-by-ansible              ex1                 810b757f08f5        2 hours ago         206.4 MB

Also, we build the entire image from scratch on every build, which is not very efficient. We can improve this by checking for an existing image and running ansible in a container started from that image instead. Here are the changes we need to make:

- name: check for existence of our base image
  shell: "docker images | grep built-by-ansible | grep ex3"
  ignore_errors: yes
  register: image_check

- set_fact:
    base_image: "{{'built-by-ansible:ex3' if image_check.rc == 0 else 'ansible/ubuntu14.04-ansible:latest'}}"

- name: build site by running ansible in a docker container
  command: "docker run -v /tmp/site:/site -w /site --name={{temp_container_name}}
    {{base_image}} ansible-playbook playbook.yml -c local"

We first run a shell command and register its result as image_check. Then we check its return code to select our base image. Finally, we start our container from either ansible/ubuntu14.04-ansible:latest or built-by-ansible:ex3, depending on whether the latter image exists. This way the first build will still be slow, but later builds will be much faster because most tasks won't need to change anything.
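The chained grep matches the two strings anywhere on a line, so it could in principle match the wrong image. As a hedged alternative, here is a sketch that compares the repository and tag columns of the docker images listing exactly and then picks the base image. The function names has_image and pick_base_image are hypothetical, and the sketch assumes the repository and tag are the first two whitespace-separated columns, as in the listing shown earlier.

```shell
# Succeed if stdin (output of `docker images`) lists the given repository and tag.
#   $1: repository, $2: tag
has_image() {
  awk -v repo="$1" -v tag="$2" \
    '$1 == repo && $2 == tag { found = 1 } END { exit !found }'
}

# Read a `docker images` listing on stdin and print the base image to build from.
pick_base_image() {
  if has_image built-by-ansible ex3; then
    echo "built-by-ansible:ex3"
  else
    echo "ansible/ubuntu14.04-ansible:latest"
  fi
}

# Usage on the docker host:
#   base_image=$(docker images | pick_base_image)
```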

Running example3b.yml with the -v (verbose) option, ansible-playbook -v -i hosts example3b.yml, shows that no modules detected changes on the second and later runs.

PLAY RECAP ********************************************************************
localhost                  : ok=5    changed=0    unreachable=0    failed=0   ", "warnings": []

While this is a great speedup, it creates an extra layer each time you rebuild the image. This is problematic because an image can only have a maximum of 127 layers. You can use docker history built-by-ansible:ex3 to view the layers that compose our image.
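To keep an eye on how close an image is getting to that limit, you could count the lines of docker history output. The sketch below assumes docker history -q prints one line per layer; the helper names and the MAX_LAYERS variable are my own, with 127 taken from the limit mentioned above.

```shell
MAX_LAYERS=127  # the layer limit quoted in the text above

# Count layers from `docker history -q IMAGE` output on stdin (one line per layer).
count_layers() {
  grep -c .
}

# Print how many more layers fit under the limit, given the current count.
layers_remaining() {
  echo $(( MAX_LAYERS - $1 ))
}

# Usage on the docker host:
#   layers=$(docker history -q built-by-ansible:ex3 | count_layers)
#   echo "$layers layers used, $(layers_remaining "$layers") remaining"
```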

Should you use ansible to build your containers?

Now that we've seen three different ways to build docker containers with ansible, which one should we use? If you ask Michael DeHaan, the creator of Ansible, probably none of them. This may be a job better left to your CI (continuous integration) server.

Why is that?

1. Containers typically have a single responsibility - a single app to run. This means that there is much less configuration required to build a container for an app than to build a virtual machine image for an app. Tasks such as log management, process scheduling, and firewall rules are the responsibility of the host machine or the container runtime platform, not the container itself. This simplifies things a lot for the application developer.

2. In light of point #1, ansible adds an additional layer of complexity over cd some_dir && docker build ... that may have more costs than gains. Adding a docker build to your CI pipeline may be the path of least resistance.

Can you still use ansible and docker together?

Ansible is still great for gluing all of the disparate pieces of a system together. You can still use ansible to provision your docker hosts and set up dokku or mesosphere (depending on your scale and style). But if you're just using ansible to build docker images, it may be overkill.