At Sumo Logic, most backend code is written in Scala. Scala is a relatively new JVM (Java Virtual Machine) language that Martin Odersky, who also co-founded our Greylock sister company, Typesafe, began designing in 2001. Over the past two years at Sumo Logic, we’ve found Scala to be a great way to use the AWS SDK for Java. In this post, I’ll explain some use cases.
1. Tags as fields on AWS model objects
Accessing AWS resource tags can be tedious in Java. For example, to get the value of the “Cluster” tag on a given instance, something like this is usually needed:
String deployment = null;
for (Tag tag : instance.getTags()) {
    if (tag.getKey().equals("Cluster")) {
        deployment = tag.getValue();
    }
}
While this isn’t horrible, it certainly doesn’t make code easy to read. Of course, one could turn this into a utility method to improve readability. The set of tags used by an application is usually known and small in number. For this reason, we found it useful to expose tags with an implicit wrapper around the EC2 SDK’s Instance, Volume, etc. classes. With a little Scala magic, the above code can now be written as:
val deployment = instance.cluster
Here is what it takes to make this magic work:
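import scala.collection.JavaConversions._
import com.amazonaws.services.ec2.model.Instance

object RichAmazonEC2 {
  implicit def wrapInstance(i: Instance): RichEC2Instance = new RichEC2Instance(i)
}

class RichEC2Instance(instance: Instance) {
  // Returns the value of the given tag, or null if the instance does not have it.
  private def getTagValue(tag: String): String =
    instance.getTags.find(_.getKey == tag).map(_.getValue).orNull
  def cluster = getTagValue("Cluster")
}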
Whenever this functionality is desired, one just has to import RichAmazonEC2._
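As a variation on the wrapper above, here is a minimal sketch that returns an Option instead of null when a tag is missing. It is not the code we run in production: Tag and Instance are hypothetical stand-ins for the SDK classes so that the example runs without the AWS SDK on the classpath, OptionTags and tag are names made up for illustration, and the sketch assumes Scala 2.13 or later for scala.jdk.CollectionConverters.

```scala
import scala.jdk.CollectionConverters._

// Hypothetical stand-ins for the SDK's Tag and Instance classes, so the
// sketch runs without the AWS SDK on the classpath.
final case class Tag(getKey: String, getValue: String)
final case class Instance(getInstanceId: String, getTags: java.util.List[Tag])

object OptionTags {
  // An implicit class (Scala 2.10+) combines the implicit conversion and
  // the wrapper class into a single definition.
  implicit class RichInstance(private val instance: Instance) {
    // None when the tag is missing, instead of null.
    def tag(key: String): Option[String] =
      instance.getTags.asScala.find(_.getKey == key).map(_.getValue)

    def cluster: Option[String] = tag("Cluster")
  }
}
```

After import OptionTags._, a caller can write instance.cluster.getOrElse("unknown") and decide explicitly what a missing tag should mean.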
2. Work with lists of resources
Scala 2.8.0 introduced a redesigned and very powerful collections library, which is very useful when manipulating lists of AWS resources. Since the AWS SDK uses Java collections, one needs to import scala.collection.JavaConversions._, which transparently “converts” (implicitly wraps) the Java collections. Here are a few examples to showcase why this is powerful:
Printing a sorted list of instances, by name:
ec2.describeInstances().                                    // Get list of instances.
  getReservations.
  map(_.getInstances).
  flatten.                                                  // Translate reservations to instances.
  sortBy(_.sortName).                                       // Sort the list.
  map(i => "%-25s (%s)".format(i.name, i.getInstanceId)).   // Create String.
  foreach(println(_))                                       // Print the string.
Grouping a list of instances in a deployment by cluster (returns a Map from cluster name to list of instances in the cluster):
ec2.describeInstances().             // Get list of instances.
  getReservations.
  flatMap(_.getInstances).           // Translate reservations to instances.
  filter(_.deployment == "prod").    // Filter the list to prod deployment.
  groupBy(_.cluster)                 // Group by the cluster.
You get the idea – this makes it trivial to build very rich interactions with EC2 resources.
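To make the shape of the grouped result concrete, here is a small self-contained sketch of the same filter-and-group pipeline. Node and Reservation are hypothetical stand-ins (Node plays the role of a wrapped instance with name, deployment and cluster fields), GroupingSketch.byCluster is a name invented for this example, and the explicit asScala conversions assume Scala 2.13 or later.

```scala
import scala.jdk.CollectionConverters._

// Hypothetical stand-ins: Node plays the role of a wrapped EC2 instance,
// Reservation mimics an SDK type that holds a Java list of instances.
final case class Node(name: String, deployment: String, cluster: String)
final case class Reservation(getInstances: java.util.List[Node])

object GroupingSketch {
  // Maps each cluster name to the sorted names of its instances,
  // restricted to a single deployment.
  def byCluster(reservations: java.util.List[Reservation],
                deployment: String): Map[String, Seq[String]] =
    reservations.asScala.toSeq
      .flatMap(_.getInstances.asScala)      // reservations -> instances
      .filter(_.deployment == deployment)   // keep one deployment
      .groupBy(_.cluster)                   // Map[cluster, Seq[Node]]
      .map { case (cluster, nodes) => cluster -> nodes.map(_.name).sorted }
}
```

The explicit asScala calls do the same job as the implicit conversions, but make it visible where a Java collection is being wrapped.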
3. Add pagination logic to the AWS SDK
When we first started using AWS, we had a utility class to provide some commonly repeated functionality, such as pagination for S3 buckets and retry logic for calls. Instead of embedding functionality in a separate utility class, implicits allow you to pretend that the functionality you want exists in the AWS SDK. Here is an example that extends the AmazonS3 client interface to allow listing all objects in a bucket:
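import scala.collection.JavaConversions._
import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.{ListObjectsRequest, ObjectListing, S3ObjectSummary}

object RichAmazonS3 {
  implicit def wrapAmazonS3(s3: AmazonS3): RichAmazonS3 = new RichAmazonS3(s3)
}

class RichAmazonS3(s3: AmazonS3) {
  // Fetches every page of the listing, `cadence` keys at a time, and returns all summaries.
  def listAllObjects(bucket: String, cadence: Int = 100): Seq[S3ObjectSummary] = {
    var result = List[S3ObjectSummary]()
    def addObjects(objects: ObjectListing) = result ++= objects.getObjectSummaries
    var objects = s3.listObjects(new ListObjectsRequest().withMaxKeys(cadence).withBucketName(bucket))
    addObjects(objects)
    while (objects.isTruncated) {
      objects = s3.listNextBatchOfObjects(objects)
      addObjects(objects)
    }
    result
  }
}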
To use this:
val objects = s3.listAllObjects("mybucket")
There is, of course, a risk of running out of memory given a large enough number of object summaries, but in many use cases this is not a big concern.
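Where memory is a concern, one possible alternative is to expose the listing as a lazy Iterator that requests the next page only when the current one has been consumed. The following is a sketch of that idea rather than SDK code: Page, LazyPaging and the fetch function are hypothetical, with fetch standing in for a call that takes an optional marker and returns one page plus the marker of the next page, if any.

```scala
// Hypothetical stand-in for one page of a listing; nextMarker is None on
// the last page (comparable to a listing that is no longer truncated).
final case class Page[A](items: Seq[A], nextMarker: Option[String])

object LazyPaging {
  // Iterates over pages, calling fetch only when the next page is requested.
  def pages[A](fetch: Option[String] => Page[A]): Iterator[Page[A]] =
    new Iterator[Page[A]] {
      private var started = false
      private var marker: Option[String] = None

      // There is always a first page; afterwards, continue while a marker exists.
      def hasNext: Boolean = !started || marker.isDefined

      def next(): Page[A] = {
        if (!hasNext) throw new NoSuchElementException("no more pages")
        val page = fetch(marker)
        started = true
        marker = page.nextMarker
        page
      }
    }

  // Flattens the pages into their items, still fetching on demand.
  def items[A](fetch: Option[String] => Page[A]): Iterator[A] =
    pages(fetch).flatMap(_.items)
}
```

With this shape, the caller holds roughly one page at a time and can stop early, for example with LazyPaging.items(fetch).take(10), at the cost of keeping the client call open-ended while the iterator is in use.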
Summary
Scala enables programmers to implement expressive, rich interactions with AWS and greatly improves readability and developer productivity when using the AWS SDK. It’s been an essential tool to help us succeed with AWS.