
May 2, 2015 · Christian Beedgen

Collecting In-Container Log Files

Docker and the use of containers are spreading like wildfire. In a Dockerized environment, certain legacy practices and approaches are being challenged, and centralized logging is one of them. The most popular way of capturing logs coming from a container is to set up the containerized process such that it logs to stdout. Docker then spools this to disk, from where it can be collected. This is great for many use cases. We have of course blogged about this multiple times already. If the topic fascinates you, also check out a presentation I did in December at the Docker NYC meetup.

At the same time, at Sumo Logic our customers are telling us that the stdout approach doesn’t always work. Not all containers are set up to follow the one-process-per-container model. This is sometimes referred to as “fat” containers. There are tons of opinions about whether this is the right thing to do or not. Pragmatically speaking, it is a reality for some users.

Even some programs that are otherwise easily containerized as single processes pose some challenges to the stdout model. For example, popular web servers write at least two log files: access and error logs. There are of course workarounds to map this back to a single stdout stream. But ultimately there’s only so much multiplexing that can be done before the demuxing operation becomes too painful.
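To make the multiplexing pain concrete, here is a toy sketch of the tag-and-demux approach in plain shell: two log files are merged into one stream by prefixing each line with its source, and a consumer has to filter on that prefix to get one log back. The file names and tags are purely illustrative.

```shell
# Simulate two in-container log files being multiplexed into one stream.
workdir=$(mktemp -d)
printf 'GET / 200\n' > "$workdir/access.log"
printf 'worker exited\n' > "$workdir/error.log"

# Multiplex: prefix every line with the name of its source file.
mux() {
  for f in "$workdir"/*.log; do
    sed "s|^|$(basename "$f"): |" "$f"
  done
}
mux > "$workdir/combined"

# Demultiplex: recover a single stream by filtering on the prefix again.
grep '^access.log: ' "$workdir/combined" | sed 's|^access.log: ||' > "$workdir/access.out"
```

Even in this toy form, every consumer of the combined stream has to know the tagging convention, which is exactly the demuxing pain described above.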

A Powerstrip for Logfiles

Powerstrip-Logfiles presents a proof of concept towards easily centralizing log files from within a container. Simply setting LOGS=/var/log/nginx in the container environment, for example, will use a bind mount to make the Nginx access and error logs available on the host under /var/log/container-logfiles/containers/[ID of the Nginx container]/var/log/nginx. A file-based log collector can now simply be configured to recursively collect from /var/log/container-logfiles/containers and will pick up logs from any container configured with the LOGS environment variable.
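The mapping from in-container paths to host paths is mechanical: the LOGS value is split on commas, and each path is re-rooted under the --root directory plus the container ID. A quick shell sketch of that computation (the variable names and values here are made up for illustration; powerstrip-logfiles does this internally):

```shell
# Hypothetical inputs: the --root directory, a container ID, and a LOGS value.
ROOT=/var/log/container-logfiles
CID=0123456789ab
LOGS=/var/log/nginx,/var/log/app

# Re-root each comma-separated in-container path under $ROOT/containers/$CID.
hostpaths=""
IFS=','
for p in $LOGS; do
  hostpaths="$hostpaths $ROOT/containers/$CID$p"
done
unset IFS
echo "$hostpaths"
```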

Powerstrip-Logfiles is based on the Powerstrip project by ClusterHQ, which is meant to provide a way to prototype extensions to Docker. Powerstrip is essentially a proxy for the Docker API. Prototypical extensions can hook Docker API calls and do whatever work they need to perform. The idea is to allow for extensions to Docker to be composable – for example, to add support for overlay networks such as Weave and for storage managers such as Flocker.
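Concretely, an adapter like powerstrip-logfiles receives each intercepted Docker API call from Powerstrip as a JSON payload and returns a (possibly modified) version of it. Sketched from the Powerstrip adapter protocol as ClusterHQ documented it at the time; the request body values below are illustrative, not taken from a real capture:

```json
{
  "PowerstripProtocolVersion": 1,
  "Type": "pre-hook",
  "ClientRequest": {
    "Method": "POST",
    "Request": "/v1.16/containers/create",
    "Body": "{\"Image\": \"ubuntu\", \"Env\": [\"LOGS=/x,/y\"]}"
  }
}
```

A pre-hook adapter answers with a modified client request in the same shape; this is the hook powerstrip-logfiles uses to inject its bind mounts before the container is actually created.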

Steps to run Powerstrip-Logfiles

Given that the Powerstrip infrastructure is meant to support prototyping of what one day will hopefully become Docker extensions, there are still a couple of steps required to get this to work.

First of all, you need to start a container that contains the powerstrip-logfiles logic:

$ docker run --privileged -it --rm \
 --name powerstrip-logfiles \
 --expose 80 -v /var/log/container-logfiles:/var/log/container-logfiles \
 -v /var/run/docker.sock:/var/run/docker.sock \
 raychaser/powerstrip-logfiles:latest \
 -v --root /var/log/container-logfiles

Next you need to create a Powerstrip configuration file…

$ mkdir -p ~/powerstrip-demo
$ cat > ~/powerstrip-demo/adapters.yml <<EOF
endpoints:
  "POST /*/containers/create":
    pre: [logfiles]
    post: [logfiles]
adapters:
  logfiles: http://logfiles/v1/extension
EOF

…and then you can start the powerstrip container that acts as the Docker API proxy:

$ docker run -d --name powerstrip \
 -v /var/run/docker.sock:/var/run/docker.sock \
 -v ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml \
 --link powerstrip-logfiles:logfiles \
 -p 2375:2375 \
 clusterhq/powerstrip

Now you can use the normal docker client to run containers.

First you must export the DOCKER_HOST variable to point at the powerstrip server:

$ export DOCKER_HOST=tcp://127.0.0.1:2375

Now you can specify as part of the container’s environment which paths are supposed to be considered logfile paths. Those paths will be bind-mounted to appear under the location of the --root specified when running the powerstrip-logfiles container.

$ docker run --cidfile=cid.txt --rm -e "LOGS=/x,/y" ubuntu \
 bash -c 'touch /x/foo; ls -la /x; touch /y/bar; ls -la /y'

You should now be able to see the files “foo” and “bar” under the path specified as the --root:

$ CID=$(cat cid.txt)
$ ls /var/log/container-logfiles/containers/$CID/x
$ ls /var/log/container-logfiles/containers/$CID/y

See the example in the next section on how to most easily hook up a Sumo Logic Collector.

Sending Access And Error Logs From An Nginx Container To Sumo Logic

For this example, you can just run Nginx from a toy image off of Docker Hub:

$ CID=$(DOCKER_HOST=localhost:2375 docker run -d --name nginx-example-powerstrip -p 80:80 -e LOGS=/var/log/nginx raychaser/powerstrip-logfiles:latest-nginx-example) && echo $CID

You should now be able to see the Nginx container’s /var under the host’s /var/log/container-logfiles/containers/$CID/:

$ ls -la /var/log/container-logfiles/containers/$CID/

And if you tail the access log from that location while hitting http://localhost you should see the hits being logged:

$ tail -F /var/log/container-logfiles/containers/$CID/var/log/nginx/access.log

Now all that’s left is to hook up a Sumo Logic collector to the /var/log/container-logfiles/containers/ directory, and all the logs will come to your Sumo Logic account:

$ docker run -v /var/log/container-logfiles:/var/log/container-logfiles -d \
 --name="sumo-logic-collector" sumologic/collector:latest-powerstrip [Access ID] [Access Key]

This collector is pre-configured to collect all files under /var/log/container-logfiles. By way of the -v volume mapping in the invocation above, that path inside the collector container maps to the same directory on the host, which is where powerstrip-logfiles by default writes the logs for the in-container files.

As a Sumo Logic user, it is very easy to generate the required access key by going to the Preferences page.

Once the collector is running, you can search for

_sourceCategory=collector-container

in the Sumo Logic UI and you should see the toy Nginx logs.
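For reference, a file-based Sumo Logic source for this directory layout would look roughly like the following sources.json fragment. This is a sketch of what the latest-powerstrip collector image presumably ships with; the exact name, path expression, and category baked into that image are assumptions based on the search above:

```json
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "container-logfiles",
      "pathExpression": "/var/log/container-logfiles/containers/**",
      "category": "collector-container"
    }
  ]
}
```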

Simplify using Docker Compose

And just because we can, here’s how this could all work with Docker Compose. Docker Compose will allow us to write a single spec file that contains all the details on how the Powerstrip container, powerstrip-logfiles, and the Sumo Logic collector container are to be run. The spec is a simple YAML file:

powerstriplogfiles:
  image: raychaser/powerstrip-logfiles:latest
  ports:
    - 80
  volumes:
    - /var/log/container-logfiles:/var/log/container-logfiles
    - /var/run/docker.sock:/var/run/docker.sock
  environment:
    ROOT: /var/log/container-logfiles
    VERBOSE: "true"
  entrypoint:
    - node
    - index.js
powerstrip:
  image: clusterhq/powerstrip:latest
  ports:
    - "2375:2375"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    - ~/powerstrip-demo/adapters.yml:/etc/powerstrip/adapters.yml
  links:
    - "powerstriplogfiles:logfiles"
sumologiccollector:
  image: sumologic/collector:latest-powerstrip
  volumes:
    - "/var/log/container-logfiles:/var/log/container-logfiles"
  env_file: .env

You can copy and paste this into a file called docker-compose.yml, or take it from the powerstrip-logfiles GitHub repo.

Since the Sumo Logic Collector will require valid credentials to log into the service, we need to put those somewhere so Docker Compose can wire them into the container. This can be accomplished by putting them into the file .env in the same directory, something like so:

SUMO_ACCESS_ID=[Access ID]
SUMO_ACCESS_KEY=[Access Key]

This is not a great way to deal with credentials. Powerstrip in general is not production-ready, so please try this only outside of a production setup, and make sure to delete the access ID and access key in the Sumo Logic UI when you are done.

Then simply run, in the same directory as docker-compose.yml, the following:

$ docker-compose up

This will start all three required containers and start streaming logs to Sumo Logic. Have fun!

Christian Beedgen

As co-founder and CTO of Sumo Logic, Christian Beedgen brings 18 years experience creating industry-leading enterprise software products. Since 2010 he has been focused on building Sumo Logic’s multi-tenant, cloud-native machine data analytics platform which is widely used today by more than 1,600 customers and 50,000 users. Prior to Sumo Logic, Christian was an early engineer, engineering director and chief architect at ArcSight, contributing to ArcSight’s SIEM and log management solutions.
