Note: This post is now superseded by Update On Logging With Docker.
Learning By Listening, And Doing
Over the last couple of months, we have spent a lot of time learning about Docker, the distributed application delivery platform that is taking the world by storm. We have started looking into how we can best leverage Docker for our own service. And of course, we have spent a lot of time talking to our customers. We have so far learned a lot by listening to them describe how they deal with logging in a containerized environment.
We have already re-blogged how Caleb, one of our customers, is Adding Sumo Logic To A Dockerized App. Our very own Dwayne Hoover has written about Four Ways to Collect Docker Logs in Sumo Logic.
Along the way, it has become obvious that it makes sense for us to provide an “official” image for the Sumo Collector. Sumo Logic exposes an easy-to-use HTTP API, but the vast majority of our customers are leveraging our Collector software as a trusted, production-grade data collection conduit. We are and will continue to be excited about folks building their own images for their own custom purposes. Yet, the questions we get make it clear that we should release an official Sumo Logic Collector image for use in a containerized world.
Instant Gratification, With Batteries Included
A common way to integrate logging with containers is to use Syslog. This has been discussed before in various places all over the internet. If you can direct all your logs to Syslog, we now have a Sumo Logic Syslog Collector image that will get you up and running immediately:
docker run -d -p 514:514 -p 514:514/udp --name="sumo-logic-collector" sumologic/collector:latest-syslog [Access ID] [Access key]
Started this way, the default Syslog port 514 is mapped to the same port on the host. To test whether everything is working well, use telnet on the host:
telnet localhost 514
Then type some text, hit return, press CTRL-] to close the connection, and enter quit to exit telnet. After a few moments, what you typed should show up in the Sumo Logic service. Use a search to find the message(s).
To test the UDP listener, use Netcat on the host, along the lines of:
echo "Hello, Sumo Logic" | nc -v -u -w 0 localhost 514
And again, the message should show up on the Sumo Logic end when searched for.
If you want to start a container that is configured to log to syslog and make it automatically latch on to the Collector container’s exposed port, use linking:
docker run -it --link sumo-logic-collector:sumo ubuntu /bin/bash
From within the container, you can then talk to the Collector listening on port 514 by using the environment variables populated by the linking:
echo "I'm in ur linx" | nc -v -u -w 0 $SUMO_PORT_514_TCP_ADDR $SUMO_PORT_514_TCP_PORT
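A slightly more defensive variant of the command above can be sketched as a small shell function. The function name send_to_collector is hypothetical; the environment variable names are the ones used in the command above, which assumes the link alias sumo.

```shell
# Hypothetical helper: send one line to the linked Collector over UDP.
# Assumes the container was started with --link sumo-logic-collector:sumo,
# so that the SUMO_PORT_514_* variables are populated.
send_to_collector() {
    if [ -z "${SUMO_PORT_514_TCP_ADDR:-}" ] || [ -z "${SUMO_PORT_514_TCP_PORT:-}" ]; then
        echo "Collector link variables are not set; was --link used?" >&2
        return 1
    fi
    # Same nc flags as in the command above, minus the verbose output.
    printf '%s\n' "$1" | nc -u -w 0 "$SUMO_PORT_514_TCP_ADDR" "$SUMO_PORT_514_TCP_PORT"
}

# Usage, from inside the linked container:
#   send_to_collector "I'm in ur linx"
```

Failing early with a clear message makes it easier to spot a container that was started without the link.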
That’s all there is to it. The image is available from Docker Hub. Setting up an Access ID/Access Key combination is described in our online help.
Composing Collector Images From Our Base Image
Following the instructions above will get you going quickly, but of course it can’t possibly cover all the various logging scenarios that we need to support. To that end, we actually started by first creating a base image. The Syslog image extends this base image. Your future images can easily extend this base image as well. Let’s take a look at what is actually going on! Here’s the GitHub repo: https://github.com/SumoLogic/sumologic-collector-docker.
One of the main things we set out to solve was how to create an image that does not require customer credentials to be baked in. Having credentials in the image itself is obviously a bad idea! Putting them into the Dockerfile is even worse. The trick is to leverage a not-so-well documented command line switch on the Collector executable to pass the Sumo Logic Access ID and Access Key combination to the Collector. This hand-off is the meat of the run.sh startup script referenced in the Dockerfile.
The rest is really just grabbing the latest Collector Debian package and installing it on top of a base Ubuntu 14.04 system, invoking the start script, checking arguments, and so on.
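The argument checking mentioned above can be sketched roughly as follows. This is not the actual run.sh from the repository: the function name and the usage text are hypothetical, and the real Collector invocation, including its command line switch for the credentials, is deliberately left out.

```shell
# Hypothetical sketch of the argument check in a container start script.
# Credentials arrive as arguments to `docker run`, so nothing is stored in
# the image or in the Dockerfile.
parse_credentials() {
    if [ "$#" -lt 2 ]; then
        echo "Usage: docker run [image] [Access ID] [Access key]" >&2
        return 1
    fi
    ACCESS_ID=$1
    ACCESS_KEY=$2
}

# In a start script this would be called with the script's own arguments:
#   parse_credentials "$@" || exit 1
#   ...then hand ACCESS_ID and ACCESS_KEY to the Collector executable.
```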
As part of our continuous delivery pipeline, we are getting ready to update the Docker Hub-hosted image every time a new Collector is released. This will ensure that when you pull the image, the latest and greatest code is available.
How To Add The Batteries Yourself
The base image is intentionally kept very sparse and essentially ships with “batteries not included”. In itself, it will not lead to a working container. This is because the Sumo Logic Collector has a variety of ways to set up the actual log collection. It supports tailing files locally and remotely, as well as pulling Windows event logs locally and remotely.
Of course, it can also act as a Syslog sink. And it can do any of this in any combination at the same time. Therefore, the Collector is configured either manually via the Sumo Logic UI or (and this is almost always the better way) via a configuration file. The configuration file, however, is something that will change from use case to use case and from customer to customer. Baking it into a generic image simply makes no sense.
What we did instead is to provide a set of examples. These can be found in the same GitHub repository under “example”: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/example. There are a couple of sumo-source.json example files illustrating, respectively, how to set up file collection, and how to set up Syslog UDP and Syslog TCP collection. The idea is to allow you to take one of the example files either verbatim or as a starting point for your own sumo-sources.json. Then, you can build a custom image using our image as a base image. To make this more concrete, create a new folder and put a Dockerfile that extends the base image in there. Then, put a sumo-sources.json into the same folder, groomed to fit your use case. Then build the image and enjoy.
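As a rough sketch, the folder could be prepared like this. The base image name sumologic/collector and the target path /etc/sumo-sources.json are assumptions, as is the function name; check the Dockerfile and run.sh in the repository for the values the base image actually expects.

```shell
# Hypothetical helper: create a build folder containing a Dockerfile that
# extends the Collector base image and adds a source configuration.
# ASSUMPTIONS: the base image name and the /etc/sumo-sources.json target path.
make_collector_build_dir() {
    dir=$1
    sources=$2
    mkdir -p "$dir" || return 1
    cp "$sources" "$dir/sumo-sources.json" || return 1
    cat > "$dir/Dockerfile" <<'EOF'
FROM sumologic/collector
ADD sumo-sources.json /etc/sumo-sources.json
EOF
}

# Usage:
#   make_collector_build_dir my-collector path/to/your/sumo-sources.json
#   docker build -t [image name] my-collector
```

Keeping the Dockerfile this small means the only thing that differs between custom images is the source configuration.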
A Full Example
Using this approach, if you want to collect files from various containers, mount a directory on the host to the Sumo Logic Collector container. Then mount the same host directory to all the containers that use file logging. In each container, set up logging to write into a subdirectory of the mounted log directory. Finally, configure the Collector to just pull it all in.
The Sumo Logic Collector has for years been used across our customer base in production for pulling logs from files. More often than not, the Collector is pulling from a deep hierarchy of files on some NAS mount or equivalent. The Collector is quite adept and battle-tested at dealing with file-based collection.
Let’s say the logs directory on the host is called /tmp/clogs. Before setting up the source configuration accordingly, make a new directory for the files describing the image. Call it, for example, sumo-file. Into this directory, put a Dockerfile that extends the base image, as discussed. Next to the Dockerfile, in the same directory, there needs to be a file called sumo-sources.json which contains the source configuration.
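A minimal sketch of such a configuration for the /tmp/clogs example, written out from the shell. The source name and the function name are hypothetical, and the field names should be checked against the JSON source documentation and the example files in the repository.

```shell
# Hypothetical helper: write a sumo-sources.json that tails everything
# below the shared log directory. The source name is made up for this example.
write_file_sources() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/sumo-sources.json" <<'EOF'
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "container-logs",
      "pathExpression": "/tmp/clogs/**"
    }
  ]
}
EOF
}

# Usage:
#   write_file_sources sumo-file
```

The path expression is meant to cover every subdirectory, so each container can log into its own folder below /tmp/clogs.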
With this in place, build the image, and run it:
docker build -t [image name] .
docker run -d -v /tmp/clogs:/tmp/clogs --name="sumo-logic-collector" [image name] [your Access ID] [your Access key]
Finally, add -v /tmp/clogs:/tmp/clogs when running other containers that are configured to log to /tmp/clogs in order for the Collector to pick up the files.
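Starting those other containers can be wrapped in a small helper, sketched here under a few assumptions: the function name and the LOG_ROOT variable are hypothetical, and the application inside each container still has to be configured to write into its own subdirectory.

```shell
# Hypothetical helper: start a container that shares the host log directory
# with the Collector and gets its own subdirectory to log into.
# LOG_ROOT is an assumed override; it defaults to /tmp/clogs as in the text.
run_with_shared_logs() {
    log_root=${LOG_ROOT:-/tmp/clogs}
    name=$1
    shift
    mkdir -p "$log_root/$name" || return 1
    docker run -d -v "$log_root:$log_root" --name="$name" "$@"
}

# Usage (image name is a placeholder):
#   run_with_shared_logs web [application image]
```

One subdirectory per container keeps the files apart while the Collector still sees everything under the shared directory.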
Just like the ready-to-go syslog image we described in the beginning, a canonical image for file collection is available. See the source: https://github.com/SumoLogic/sumologic-collector-docker/tree/master/file.
docker run -v /tmp/clogs:/tmp/clogs -d --name="sumo-logic-collector" sumologic/collector:latest-file [Access ID] [Access key]
If you want to learn more about using JSON to configure sources to collect logs with the Sumo Logic Collector, there is a help page with all the options spelled out.
That’s all for today. We have more coming. Watch this space. And yes, comments are very welcome.