It's been over a year since I ditched Chef for Ansible as my config management tool of choice. Now I am really interested in Docker. Docker provides a simple abstraction for working with containers, but working with docker involves running a lot of commands manually. Maybe ansible can help us automate working with docker.
In this post, I will show you three different ways to build docker containers with ansible. We will see some of the tradeoffs involved with each approach and discover whether or not ansible is a good tool for building docker images.
How to follow along
If you want to follow along, you should clone the example repo and use the provided vagrant box. You also need to have the latest version of ansible installed. I am using ansible 1.9.1 and docker 1.6.2 in this post.
git clone [email protected]:jbgo/ansible-build-docker-images.git
cd ansible-build-docker-images
vagrant up
I do not recommend using docker machine because it uses boot2docker, which does not come with python installed. Ansible requires a working python installation on the managed hosts. It is typically safe to assume that a linux distro comes with python, but it is now popular to run minimalistic distros for the docker host.
A simple nginx site
We are going to use three different techniques to build a docker container for a static site. The build process involves installing nginx, replacing the nginx configuration file with our own, and copying the index.html file to the appropriate directory. When running, the site looks like this.
I will refer to the files required to build the site as the site files. All of the site files are in the site directory.
Building with the docker_image module (deprecated)
My first instinct is to use the docker_image module since ansible includes it. The module docs contain a deprecation warning, but I want to try it anyway.
Let's start by uploading the site directory to the docker host. This is the first task in example1.yml.
- name: upload the site directory to the docker host
  synchronize: src=site dest=/tmp
This directory contains my nginx configuration, an index.html file, and a Dockerfile describing how to build the image.
FROM ubuntu:latest
RUN apt-get update && apt-get install -y nginx
ADD nginx.conf /etc/nginx/nginx.conf
RUN mkdir -p /var/www/site
ADD index.html /var/www/site/index.html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
With the site directory on the docker host, we can use the docker_image module to build it.
- name: build the image
  docker_image: >
    name=built-by-ansible
    tag=ex1
    path=/tmp/site
    state=present
We can use the docker module, which is fully supported by ansible, to run a container from the new image, built-by-ansible:ex1.
- name: run the site in a docker container
  docker:
    name: site1
    image: "built-by-ansible:ex1"
    state: reloaded
    publish_all_ports: yes
We name the running container site1 and use the publish_all_ports option to make any exposed container ports (port 80 for our site container) available via a random host port.
It's time to run example1.yml.
ansible-playbook -i hosts example1.yml
Using the docker ps command, we can see which host port docker chose.
$ vagrant ssh -c 'docker ps'
CONTAINER ID IMAGE COMMAND PORTS NAMES
dc2f2b4b0874 built-by-ansible:ex1 "nginx -g 'daemon of 0.0.0.0:32768->80/tcp site1
You can visit http://192.168.33.100:32768/ in your browser to view the site. If you are following along, you will likely have a different port.
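If you would rather have the playbook report the port than read it from the docker ps output, something like the following might work. This is only a sketch and not part of the example repo: the task names and the site_port variable are made up for illustration, and it assumes the docker CLI is available on the managed host.

```yaml
# Sketch: ask docker which host port was mapped to container port 80.
# "site_port" is a hypothetical variable name, not taken from the example repo.
- name: look up the host port mapped to port 80 of site1
  command: docker port site1 80
  register: site_port
  # a read-only query should never be reported as a change
  changed_when: false

- name: show the address docker chose
  debug:
    var: site_port.stdout
```

The docker port command prints the host address and port bound to the given container port, so the debug task should show the same mapping that appears in the PORTS column above.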
One disadvantage to using the docker_image module is that it does not rebuild the image when the site files change. For example, if we were to change the Dockerfile or the nginx config file, we would need to rebuild the image. Instead, ansible will only rebuild the image when the module name or tag options change.
We could try to work around this by rebuilding the image whenever the synchronize module indicates a change. All we need to do is register the result of the sync task by adding register: sync_result, then conditionally run the build task by adding when: sync_result.changed.
- name: upload the site directory to the docker host
  synchronize: src=site dest=/tmp
  register: sync_result
- name: build the image
  docker_image: >
    name=built-by-ansible
    tag=ex1
    path=/tmp/site
    state=present
  when: sync_result.changed
Now our image gets rebuilt when we change site files, but the build task gets skipped entirely if we change nothing. If we change the image name or tag but not the site files, our image will not be rebuilt. That's not what we want!
One final reason not to use the docker_image module, besides being deprecated, is that it does not support communicating with docker over https.
Building with the command module
At this point, I've hopefully convinced you not to use the docker_image module. It may be better to run the docker build command directly. Typically you want to use an ansible module whenever one is available because modules are idempotent: they only make changes if the actual and desired states differ. However, the docker build command has its own built-in caching mechanism that speeds up subsequent builds by only rebuilding image layers that require a change, so using an idempotent module is redundant.
Let's replace the docker_image module with the command module.
- name: build the image
  command: docker build -t built-by-ansible:ex2 /tmp/site
That's it. We lose idempotency, but we ensure that our image is always built correctly. Time to run example2.yml.
ansible-playbook -i hosts example2.yml
Again, you can use vagrant ssh -c 'docker ps' to find the host port for the container and view it in your browser. Notice that we've tagged this image :ex2 and named the container site2.
Another advantage of the command module is that you can run your tasks locally and communicate with a remote docker host over https. For example, if you install docker machine, you can skip the synchronize task and run the tasks on localhost.
---
- name: Build an image with the command module for a remote docker host
  hosts: localhost
  connection: local
  tasks:
    - name: build the image
      command: docker build -t built-by-ansible:ex2b ./site
    - name: run the site in a docker container
      docker:
        name: site2b
        image: "built-by-ansible:ex2b"
        state: reloaded
        publish_all_ports: yes
        use_tls: encrypt
To get the docker module to work with docker machine, I had to use use_tls: encrypt instead of use_tls: verify (verify is more secure). I plan to revisit the docker module soon to understand why I had to do that and if there is a way to use verify. If you know why, please let me know in the comments.
To run this playbook, you will need to make sure you can communicate with your docker machine. I'm using the dev machine created during the default installation.
eval $(docker-machine env dev)
This sets up some environment variables to tell your local docker client how to communicate with the remote docker host. Once that's done, you can run example2b.yml.
ansible-playbook -i hosts example2b.yml
That's it for docker machine today. The rest of the playbooks in this post will be run on our vagrant box.
Running a playbook inside a container
So far, our approach involves invoking docker's build system from ansible. The Dockerfile approach is simple and works well, but it lacks many of the features ansible provides. And if you already have ansible roles and playbooks for building your app, you might prefer to skip the process of translating those into dockerfiles.
Ansible provides a docker image with ansible already installed, so let's try it. The example provided by ansible still involves the use of a dockerfile. This may be a good idea, but for now I don't want to use a dockerfile at all. To accomplish this requires a 2-step build process.
- Use docker run to run ansible in a container.
- Use docker commit to create an image from the container in step 1.
Additionally, I would like to remove the container I used to build my site because I don't need it anymore. But if the build fails, I would like to keep the container so I can debug the failure. Because docker doesn't allow duplicate container names, I will use a timestamp to name the container I run ansible in.
- name: create a unique temp container name
  set_fact:
    temp_container_name: ex3_build_{{lookup('pipe', 'date "+%Y%m%d%H%M%S"')}}
I'm using the set_fact module with a lookup to create the temp_container_name fact. I can now use this fact in later tasks. I originally tried to create this as a variable, not as a fact, but surprisingly ansible would re-evaluate the lookup at the time of each task, so the timestamp changed from task to task.
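The difference between the two can be seen in a small standalone playbook. This is a sketch under stated assumptions, not something from the example repo: the play, the variable names lazy_stamp and frozen_stamp, and the use of GNU date with %N (nanoseconds) are all chosen purely for illustration.

```yaml
---
# Sketch: a variable that holds a lookup is templated again on every use,
# while set_fact stores the rendered value once.
# lazy_stamp and frozen_stamp are hypothetical names; %N assumes GNU date.
- name: compare a lazy variable with a fact
  hosts: localhost
  connection: local
  gather_facts: no
  vars:
    # the lookup runs each time this variable is referenced
    lazy_stamp: "{{ lookup('pipe', 'date +%s%N') }}"
  tasks:
    - name: render the lookup once and keep the result
      set_fact:
        frozen_stamp: "{{ lookup('pipe', 'date +%s%N') }}"

    - name: first read
      debug:
        msg: "lazy={{ lazy_stamp }} frozen={{ frozen_stamp }}"

    - name: second read
      debug:
        msg: "lazy={{ lazy_stamp }} frozen={{ frozen_stamp }}"
```

Based on the behavior described above, the two debug tasks should print different lazy values but the same frozen value, which is why the container name is created with set_fact.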
Next I run ansible in a docker container.
- name: build site by running ansible in a docker container
  command: "docker run
    -v /tmp/site:/site
    -w /site
    --name={{temp_container_name}}
    ansible/ubuntu14.04-ansible:latest
    ansible-playbook playbook.yml -c local"
Let's break down those options (see also: docker help run).
- `docker run`: run a command in a new container
- `-v /tmp/site:/site`: mount /tmp/site from the docker host as /site in the container. This is how we can access our site files from within the container.
- `-w /site`: set /site as the working directory in our container. The playbook we run inside the container is in this directory.
- `--name={{temp_container_name}}`: name our container so we can reference it in later tasks
- `ansible/ubuntu14.04-ansible:latest`: the official ansible base image
- `ansible-playbook playbook.yml -c local`: the command to run
Here are the contents of playbook.yml. We run this playbook inside the docker container to build our site. There is nothing special about this playbook, but it's important that we run it for localhost with a local connection.
---
- name: configure the container with ansible
  hosts: localhost
  tasks:
    - name: install nginx
      apt: pkg=nginx state=installed
    - name: create the site directory
      file: dest=/var/www/site state=directory recurse=yes
    - name: copy the site directory in place
      synchronize: src=/site dest=/var/www
    - name: add the nginx configuration
      template: src=/site/nginx.conf dest=/etc/nginx/nginx.conf
If the task fails, we can use docker ps -a to find our container for troubleshooting. Assuming the task succeeds, we can create an image from our container.
- name: create a docker image from the container
  command: "docker commit
    -c 'EXPOSE 80'
    -c 'CMD [\"nginx\", \"-g\", \"daemon off;\"]'
    {{temp_container_name}}
    built-by-ansible:ex3"
Again, let's break this command down line-by-line.
- `docker commit`: create a new image including the changes in the container's read-write layer
- `-c 'EXPOSE 80'`: containers will listen on port 80
- `-c 'CMD [\"nginx\", \"-g\", \"daemon off;\"]'`: run nginx as the default command
- `{{temp_container_name}}`: the name of the container we ran ansible in
- `built-by-ansible:ex3`: the name and tag of our image
At this point we've successfully built our image, but to keep things tidy, let's delete the container we used for running ansible. We don't need this container anymore because we will start a new one based on our image instead.
- name: delete the container once the image has been successfully built
  command: docker rm -f -v {{temp_container_name}}
Now we are ready to run example3.yml.
ansible-playbook -i hosts example3.yml
This example is more complex than building with a dockerfile. However, it may be worth it if your application has a unique and complicated setup process and you already have the ansible roles and playbooks written. I suspect that as it becomes more common to run apps in containers, we will see fewer special snowflake applications and more apps standardize around common stack-specific base images.
There are a couple of drawbacks to this approach (besides the additional complexity).
Despite building the exact same site, the dependencies required to run ansible in a container more than double the image size. You can see below that :ex3 is 482 MB while :ex1 and :ex2 are only 206 MB.
vagrant@docker:~$ docker images | grep ex
built-by-ansible ex3 c23ebd13b101 12 minutes ago 482.9 MB
built-by-ansible ex2 6394d2059085 32 minutes ago 206.4 MB
built-by-ansible ex1 810b757f08f5 2 hours ago 206.4 MB
Also, we build the entire image from scratch on every build, which is not very efficient. We can improve on this by checking for an existing image and, if there is one, running ansible in a container started from that image instead. Here are the changes we need to make:
- name: check for existence of our base image
  shell: "docker images | grep built-by-ansible | grep ex3"
  ignore_errors: yes
  register: image_check
- set_fact:
    base_image: "{{'built-by-ansible:ex3' if image_check.rc == 0 else 'ansible/ubuntu14.04-ansible:latest'}}"
- name: build site by running ansible in a docker container
  command: "docker run -v /tmp/site:/site -w /site --name={{temp_container_name}}
    {{base_image}} ansible-playbook playbook.yml -c local"
We first run a shell command and register its result as image_check. Then we check the return code to select our base image. Finally, we start our container from either ansible/ubuntu14.04-ansible:latest or built-by-ansible:ex3, depending on whether the latter image exists. This way the first build will still be slow, but later builds will be much faster because most tasks won't need to change anything.
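One caveat with the grep pipeline in the check above is that it matches substrings, so a tag such as ex3b would also count as a hit. A possible alternative, sketched here as an assumption rather than as what the example repo does, is to ask docker about the exact name and tag with docker inspect, which exits non-zero when no such image exists.

```yaml
# Sketch: exact-match image check. Not taken from the example repo.
- name: check for existence of our base image (exact name and tag)
  command: docker inspect built-by-ansible:ex3
  register: image_check
  # a missing image is an expected outcome here, not a failure
  failed_when: false
  # inspecting never modifies the docker host
  changed_when: false

- set_fact:
    base_image: "{{ 'built-by-ansible:ex3' if image_check.rc == 0 else 'ansible/ubuntu14.04-ansible:latest' }}"
```

Using failed_when: false instead of ignore_errors: yes keeps the expected "image not found" case from showing up as a failed task in the play output, while image_check.rc is still available for the set_fact task.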
Running example3b.yml with the -v (verbose) option, ansible-playbook -v -i hosts example3b.yml, shows that no modules detected changes on the second and later runs.
PLAY RECAP ********************************************************************
localhost : ok=5 changed=0 unreachable=0 failed=0 ", "warnings": []
While this is a great speedup, it creates an extra layer each time you rebuild the image. This is problematic because an image can only have a maximum of 127 layers. You can use docker history built-by-ansible:ex3 to view the layers that compose our image.
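To keep an eye on the layer count from the playbook itself, a pair of tasks along these lines could be added after the commit step. This is a rough sketch: the layer_count variable and the threshold of 100 are arbitrary choices for illustration, not values from the example repo.

```yaml
# Sketch: count image layers and warn before reaching the limit.
# "layer_count" and the threshold of 100 are hypothetical choices.
- name: count the layers in the image
  shell: "docker history -q built-by-ansible:ex3 | wc -l"
  register: layer_count
  # counting layers does not change anything on the host
  changed_when: false

- name: warn when the image is getting close to the layer limit
  debug:
    msg: "built-by-ansible:ex3 has {{ layer_count.stdout }} layers"
  when: layer_count.stdout|int > 100
```

When the warning appears, one option is to delete the built-by-ansible:ex3 image so that the next run starts again from the ansible base image.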
Should you use ansible to build your containers?
Now that we've seen three different ways to build docker containers with ansible, which one should we use? If you ask Michael DeHaan, the creator of Ansible, the answer is probably none of them: this may be a job better left to your CI (continuous integration) server.
Why is that?
1. Containers typically have a single responsibility: a single app to run. This means that there is much less configuration required to build a container for an app than to build a virtual machine image for an app. Tasks such as log management, process scheduling, and firewall rules are the responsibility of the host machine or the container runtime platform, not the container itself. This simplifies things a lot for the application developer.
2. In light of point #1, ansible adds an additional layer of complexity over cd some_dir && docker build ... that may have more costs than gains. Adding a docker build to your CI pipeline may be the path of least resistance.
Can you still use ansible and docker together?
Ansible is still great for gluing all of the disparate pieces of a system together. You can still use ansible to provision your docker hosts and set up dokku or mesosphere (depending on your scale and style). But if you're just using ansible to build docker images, it may be overkill.