July 11, 2024 David Andrzejewski and Bashyam Anant

ROI for GenAI: Splunk to Sumo Logic Transformer

Tool consolidation goals have driven some customers to drop Splunk and consolidate their log analytics use cases on Sumo Logic. Long-term Splunk customers with many dashboards, saved searches and monitors understandably want to retain a consistent experience for end users. As a result, a replacement strategy requires migration. In our experience, Splunk customers with over 100 dashboards, saved searches or monitors are effectively locked into Splunk, because the annualized cost of migration plus the SaaS subscription for a replacement log analytics platform is prohibitive.

Our goal with Splunk2Sumo Transformer was to create a professional services tool that lowers the cost of migration enough to make moving off Splunk attractive for customers. Generative AI (GenAI) is a key enabler of the Splunk2Sumo Transformer. In this article, we assess the time (and therefore cost) savings associated with GenAI in this context, using time savings data from a recent customer migration.

Early Splunk to Sumo Logic translation attempts

Splunk2Sumo Transformer originated from a Sumo Logic hackathon in December 2022. Because transformers were originally benchmarked on natural language translation tasks, we hypothesized that they might be well suited to log query language translation, such as from Splunk to Sumo Logic. However, translation accuracy during the hackathon, using the pre-ChatGPT state-of-the-art “open weights” language models, was poor.

Nonetheless, we tapped into a key insight: these query languages have structure, and exploiting that structure is key to improving accuracy. Concretely, queries in schema-on-read log platforms like Splunk and Sumo Logic use the following four-segment structure:

  • Source of the logs: the first line in a query identifies the source of the logs along with top-level keyword filters

  • Field extraction: using multiple parse statements

  • Analysis and aggregation: using search operators such as where, count, sum and so on

  • Visualization: through charts, tables and so on. This is part of the Splunk log query language but is not used in Sumo Logic log queries.

An example Sumo Logic log query for Apache web server logs, annotated with these segments, is shown below:

//SEGMENT 1: source of logs with keyword filters
webengine.system=apache webengine.cluster.name=* HTTP (40* OR 41* OR 42* OR 43* OR 44* OR 45* OR 49*)

//SEGMENT 2: Field extraction

| json "log" nodrop | if  (_raw matches "{*", log, _raw)  as mesg
| parse regex field=mesg "^(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" nodrop
| parse regex field=mesg "(?<method>[A-Z]+)\s(?<url>\S+)\sHTTP\/[\d\.]+[\\n]*\"\s(?<status_code>\d+)\s(?<size>[\d-]+)" nodrop
| parse regex field=mesg "(?<method>[A-Z]+)\s(?<url>\S+)\sHTTP\/[\d\.]+[\\n]*\"\s(?<status_code>\d+)\s(?<size>[\d-]+)\s\"(?<referrer>.*?)\"\s\"(?<user_agent>.+?)\".*" nodrop

//SEGMENT 3: Analysis and aggregation

| where status_code matches "4*"
| count as count by src_ip
| sort count, src_ip asc
| limit  5

We found in our early experiments that breaking up a Splunk logs query into its four segments, translating the segments individually into Sumo Logic syntax and reassembling the translated segments into a Sumo Logic query could boost accuracy tremendously versus attempting to translate the entire query in a single operation.

Path to productization

Two breakthroughs drove Splunk2Sumo Transformer to reality in the summer of 2023:

  • Dramatic improvements in accuracy of Large Language Models

  • Ability to chain multiple LLM tasks, exemplified by LangChain.

Together, these let us instruct an AI (based on a popular Large Language Model) to exploit the structure of the translation task. Loosely, the first AI task breaks up a Splunk query into four segments, the next AI task translates the segments to Sumo Logic, and a final task reassembles the translated segments into a fully formed Sumo Logic query.
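The three chained tasks can be sketched roughly as follows. This is an illustration only, not the Splunk2Sumo Transformer implementation: the `LLM` callable, the prompt wording and every function name here are hypothetical, and the reassembly step is simplified to a plain string join rather than a third model call.

```python
from typing import Callable, Dict, List

# Segment names follow the four-part structure described in the article.
SEGMENTS: List[str] = ["source", "field_extraction", "analysis", "visualization"]

# Hypothetical interface: any function that takes a prompt and returns model text.
LLM = Callable[[str], str]


def split_query(llm: LLM, splunk_query: str) -> Dict[str, str]:
    """Task 1: ask the model for each segment of the Splunk query."""
    return {
        name: llm(f"Extract the {name} segment of this Splunk query:\n{splunk_query}").strip()
        for name in SEGMENTS
    }


def translate_segments(llm: LLM, segments: Dict[str, str]) -> Dict[str, str]:
    """Task 2: translate each non-empty segment on its own."""
    translated: Dict[str, str] = {}
    for name, text in segments.items():
        # Visualization is not part of Sumo Logic log queries, so it is dropped.
        if name == "visualization" or not text:
            continue
        prompt = f"Translate this Splunk {name} segment to Sumo Logic:\n{text}"
        translated[name] = llm(prompt).strip()
    return translated


def reassemble(translated: Dict[str, str]) -> str:
    """Task 3 (simplified): join translated segments in their original order."""
    return "\n".join(translated[name] for name in SEGMENTS if name in translated)


def translate_query(llm: LLM, splunk_query: str) -> str:
    """Run the three tasks in sequence."""
    return reassemble(translate_segments(llm, split_query(llm, splunk_query)))
```

Keeping the model behind a plain callable makes it easy to swap in a stub for testing, or to attach segment-specific few-shot examples to each prompt.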

With very limited datasets, we were able to demonstrate that translation accuracy would likely drive a 70% reduction in the professional services cost of migrating from Splunk to Sumo Logic in the best-case scenario. Very early, we also recognized that an overall professional services engagement involves activities that cannot be automated. These include finalizing requirements, defining milestones and success criteria, project management, query optimization, quality assurance and user acceptance cycles for a fully functional replacement for Splunk.

Case study

Given the promise of Splunk2Sumo Transformer, our account teams signed up our first migration customer, a conglomerate that wanted to consolidate business units that used Splunk onto Sumo Logic.

Almost immediately, our professional services team noticed issues with Splunk2Sumo Transformer. The core of our approach was few-shot in-context learning, and we found that the limited examples created during tool development were inadequate for translating real-world Splunk queries. To diagnose GenAI accuracy issues, Sumo Logic’s professional services lead, Bhargavi Ketha, established a query classification rubric for identifying simple, moderate and complex translations.

Simple:

  • One line queries with only keyword/scope filters

  • No pipe delimiters

  • No aggregations

  • <24h query time range

Moderate:

  • <5 pipe delimiters indicating more logic

  • Minimal number of keyword filters

  • <24h query time range

  • Simple aggregations

  • Supported Sumo Logic query operators

Complex:

  • >7 pipe delimiters indicating more logic

  • Large number of keyword filters and scope

  • Large number of aggregations

  • >24h time range that would require optimization by professional services

  • Splunk operators whose functionality did not map directly to a corresponding Sumo operator and would require professional services to understand requirements and come up with an alternative Sumo Logic query
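As a rough illustration, the pipe-count part of this rubric can be automated. The sketch below is not the team's tooling: the function names are hypothetical, it looks only at pipe delimiters (ignoring time range, keyword filters, aggregations and operator support), and it flags queries with 5 to 7 pipes for manual review because the rubric does not assign them to a class.

```python
def count_pipes(query: str) -> int:
    """Count pipe characters that sit outside double-quoted strings."""
    pipes = 0
    in_quotes = False
    escaped = False
    for ch in query:
        if escaped:
            escaped = False          # character after a backslash is taken literally
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif ch == "|" and not in_quotes:
            pipes += 1
    return pipes


def classify_by_pipes(query: str) -> str:
    """Apply only the pipe-delimiter thresholds from the rubric."""
    pipes = count_pipes(query)
    if pipes == 0:
        return "simple"
    if pipes < 5:
        return "moderate"
    if pipes > 7:
        return "complex"
    return "needs review"  # 5 to 7 pipes: not covered by the rubric


if __name__ == "__main__":
    print(classify_by_pipes("index=web error"))
    print(classify_by_pipes("index=web | stats count by host"))
```

A classifier like this is only a first pass; the other rubric criteria still need a human or a richer parser.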

Figure: Customer query complexity

The figure above shows the mix of query complexity for the customer we were working with. Bhargavi noted that translation accuracy was high for simple queries, lower for moderate queries and lowest for complex queries. Besides adding in-context examples to handle the log search operators and patterns we saw in the customer’s environment, we had to enhance Splunk2Sumo Transformer with several pre-processing changes to better support the AI:

  • Removing references to earliest/latest and time ranges in Sumo Logic queries

  • Understanding Splunk escape character patterns

  • Fixing syntax-related errors

  • Processing the source expression line with a literal Splunk->Sumo metadata mapping

  • Handling keyword patterns (multiple ORs that need to be wrapped in parentheses, or multiple AND keyword conditions)

  • Nesting if conditions correctly

  • Renaming fields to string values that contain spaces by prefixing the field with %"

  • Replacing Splunk’s use of != with the form !(x=10)
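A few of these rewrites are mechanical enough to sketch with regular expressions. The code below is a simplified illustration under stated assumptions, not the tool's actual logic: the function names and patterns are hypothetical, the time-modifier pattern assumes the `earliest=...` / `latest=...` form, and the `%"..."` quoting follows the convention named in the list above.

```python
import re

# Inline time modifiers such as earliest=-24h or latest=now (assumed form).
TIME_MODIFIER = re.compile(r'\b(?:earliest|latest)\s*=\s*(?:"[^"]*"|[^\s|]+)\s*')

# A field compared with != to a quoted or bare value, e.g. status!=200.
NOT_EQUAL = re.compile(r'(\b[\w.]+)\s*!=\s*("[^"]*"|[^\s()|]+)')


def strip_time_modifiers(query: str) -> str:
    """Drop earliest/latest modifiers; the time range is set outside the query."""
    return TIME_MODIFIER.sub("", query).strip()


def rewrite_not_equal(query: str) -> str:
    """Rewrite field!=value as !(field=value)."""
    return NOT_EQUAL.sub(r"!(\1=\2)", query)


def quote_field(name: str) -> str:
    """Wrap a field name containing spaces as %"name"."""
    return f'%"{name}"' if " " in name else name


if __name__ == "__main__":
    print(strip_time_modifiers("index=web earliest=-24h latest=now error"))
    print(rewrite_not_equal('status!=200 | where x != "a b"'))
    print(quote_field("status code"))
```

Regex rewrites like these are brittle on quoted or nested expressions, which is one reason to apply them as narrow pre-processing steps rather than as the translation itself.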

We synthesized our own examples to ensure customer data privacy. As the project progressed, we enhanced Splunk2Sumo Transformer and measured time savings over three iterations of the tool that coincided with customer migration milestones, as shown below.

Without AI, simple queries take about five minutes of professional services time to translate, moderate queries require 5-15 minutes, and complex queries require between one and three hours, including query authoring, validation and testing. Time savings with GenAI were 100% for simple query translations, 60-70% for moderate queries and about 15% for complex queries.

For complex queries, the tool makes a best effort translation and leaves annotations for the professional services consultant to review requirements and determine a translation approach. It is important to note that query mix is a key driver of cost savings. The first iteration of the tool added costs to professional services because of the overhead associated with poor translations, but subsequent iterations generated more savings given improvements in Splunk2Sumo Transformer, ending the final milestone with about 49% savings.
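To see why query mix matters so much, consider a back-of-the-envelope calculation. The per-class times and savings rates below are midpoints of the ranges quoted above; the query counts in the usage example are hypothetical and are not the customer's data. The calculation covers translation time only, not the non-automatable parts of an engagement, so its output is not directly comparable to the milestone figures.

```python
from typing import Dict

# Midpoints of the manual translation times quoted in the article, in minutes.
BASELINE_MINUTES: Dict[str, float] = {"simple": 5, "moderate": 10, "complex": 120}

# Per-class time savings quoted in the article (0.65 is the midpoint of 60-70%).
SAVINGS_RATE: Dict[str, float] = {"simple": 1.00, "moderate": 0.65, "complex": 0.15}


def blended_savings(query_counts: Dict[str, int]) -> float:
    """Return the fraction of total manual translation time saved."""
    manual = sum(BASELINE_MINUTES[c] * n for c, n in query_counts.items())
    saved = sum(
        BASELINE_MINUTES[c] * SAVINGS_RATE[c] * n for c, n in query_counts.items()
    )
    return saved / manual if manual else 0.0


if __name__ == "__main__":
    # Hypothetical mixes, chosen only to show the effect of complex queries.
    mostly_simple = {"simple": 600, "moderate": 300, "complex": 100}
    evenly_split = {"simple": 100, "moderate": 100, "complex": 100}
    print(round(blended_savings(mostly_simple), 3))
    print(round(blended_savings(evenly_split), 3))
```

Because a single complex query costs as much manual time as many simple ones, even a small share of complex queries pulls the blended figure toward the 15% end.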

Figure: Savings vs. tool iteration

Next steps

While the journey to GenAI for Splunk to Sumo Logic migrations was not straightforward, iterative refinement of Splunk2Sumo Transformer leads us to believe that we can achieve about 50% time savings in migration costs. If you are a Splunk customer, be sure to review how Sumo Logic compares; we would love to hear from you.

David Andrzejewski and Bashyam Anant

Director, Engineering | Sr Director, Advanced Analytics