In November 2023, Sumo Logic experienced a security incident. No one wants to be the victim of a cyberattack, and we certainly learned a lot about what we can do better in the future. Even so, our team was lauded by customers and media alike for how we handled the situation, which underscores the importance of a good incident response plan.
No [incident] is good news but look at how quickly and cleanly the response from their security team was orchestrated. It seems like customer-side data wasn't impacted but the suggestion to rotate keys is always a good one in these cases. In fact, a good step would be to invalidate/revoke all the API Keys they think could be impacted.
Jason Kent - Cequence Security Hacker in Residence
One of the core values at Sumo Logic is that we’re in it with our customers. But more broadly speaking, we’re in it with the InfoSec community. We want to ensure you can copy what we got right and learn from our mistakes.
Let’s go through a quick timeline of events so you get a feel for what happened, and then we can get clear on what worked well, what could be improved, and how to make sure you’re prepared for any security incident in your organization.
A bird's eye view of the Sumo Logic security incident
For obvious reasons, we won’t go into too much detail about the indicators of compromise (IoCs) or how we closed those windows, but here are the broad strokes.
In early November, a member of our development team was working on deprecating the use of static infrastructure credentials and replacing them with IAM roles. Looking through our CloudTrail logs in AWS, he noticed something unusual with our Trufflehog client agent and flagged the issue to the security operations center (SOC).
The SOC performed their triage investigation with custom searches through the logs, comparing the Trufflehog credentials and other known user IDs. Once it appeared that this was malicious activity, the SOC team kicked off its official incident response.
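To make that triage step concrete, here is a minimal sketch of the kind of comparison described above. It is not the query our SOC ran. It assumes CloudTrail records are already loaded as Python dictionaries and uses the standard CloudTrail fields `userIdentity.accessKeyId`, `sourceIPAddress`, `eventName`, and `eventTime`. The function name, the allowlist of expected networks, and the sample key ID, addresses, and timestamps are all hypothetical.

```python
import ipaddress
from collections import defaultdict


def flag_unexpected_key_use(records, expected_networks):
    """Flag CloudTrail events where a watched access key is used from an unfamiliar network.

    records: iterable of CloudTrail event dicts.
    expected_networks: hypothetical allowlist mapping an access key ID
        to the CIDR ranges that key is normally used from.
    """
    findings = defaultdict(list)
    for record in records:
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        if key_id not in expected_networks:
            continue  # not a credential under review

        source = record.get("sourceIPAddress", "")
        try:
            address = ipaddress.ip_address(source)
        except ValueError:
            # CloudTrail can record an AWS service name here instead of an IP.
            # This sketch skips those; a real triage might review them too.
            continue

        allowed = any(
            address in ipaddress.ip_network(cidr)
            for cidr in expected_networks[key_id]
        )
        if not allowed:
            findings[key_id].append(
                (record.get("eventTime"), record.get("eventName"), source)
            )
    return dict(findings)


# Made-up records for illustration only.
sample_records = [
    {
        "eventTime": "2024-01-01T10:00:00Z",
        "eventName": "ListBuckets",
        "sourceIPAddress": "10.0.4.17",
        "userIdentity": {"accessKeyId": "AKIAEXAMPLEKEY000001"},
    },
    {
        "eventTime": "2024-01-01T10:05:00Z",
        "eventName": "GetCallerIdentity",
        "sourceIPAddress": "203.0.113.50",
        "userIdentity": {"accessKeyId": "AKIAEXAMPLEKEY000001"},
    },
]

suspicious = flag_unexpected_key_use(
    sample_records, {"AKIAEXAMPLEKEY000001": ["10.0.0.0/16"]}
)
```

In this sketch only the second record is flagged, because its source address falls outside the range the key is expected to be used from. In practice the same comparison would usually run as a saved log search rather than a script.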
The first step was bifurcating communication. The SOC and AppSec teams were working closely in an operational Slack channel, sharing Sumo Logic queries as they zeroed in on the relevant logs while upwards of 100 people collaborated to rotate all credentials. Meanwhile, the leaders of those teams were keeping a second channel up to date for all executive communications. Here, the C-suite executives, corporate communications, finance, legal, and customer support teams could identify areas of need and make sure they could communicate clearly with all customers and the broader community.
You can follow the timeline of communications on our security response center.
Over the following days, we were able to narrow down the investigation and home in on remediation and forensics. Using a “follow the sun” setup, engineers and analysts handed their work from EMEA to North America to APJ, passing along their Sumo Logic queries, determinations, and next steps so the team in each time zone could continue where the last one left off.
Within days, we had rotated and deprecated credentials, narrowed the window of potential IoCs, and informed our internal and external partners and customers. We eventually determined that no data was exfiltrated and that the threat was as close to over as possible (although no incident is ever really closed, and it’s important for security professionals to continually monitor for IoCs). The security team used Sumo Logic Cloud SIEM to dig into forensics for the incident and tune the solution, integrating what we learned to improve our response to potential future incidents.
Lessons learned from the security incident at Sumo Logic
No one ever looks forward to a security incident, but they are inevitable in the modern age. That’s why it’s vital not only to find and fix the issue quickly, but also to learn new best practices and reinforce good habits. Our incident taught us a range of lessons.
Having a single source of truth in logs pays dividends, especially when those logs go back as many years as some compliance regulations, such as HIPAA, require.
You never know which logs you’ll need until you need them, so it’s best to ingest everything.
Incident response is a team sport. Security, engineering, marketing, finance, legal, the list goes on. Everyone is responsible during an incident in different ways, so everyone needs skin in the game and training.
Stay agile. Last year’s best practices have already evolved, as have the attack vectors.
Communication must be segmented during an incident. This allows practitioners to collaborate and work through the issue while keeping execs, communications pros, and legal experts in their own channels to avoid interruptions and deliver consistent communications to the right audiences.
Don’t assume automated tools are working as expected. You still need to monitor them. There’s no such thing as “set it and forget it,” especially with staff turnover.
Communicate early and often with customers. Don’t be an alarmist. Transparent, calm, and knowledgeable communication goes a long way.
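The lesson about automated tools can be sketched as a simple periodic check. This is an illustration under stated assumptions, not a description of our internal tooling: the job inventory, its field names (`name`, `owner`, `last_success`), the set of active owners, and the 24-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_unattended_automation(jobs, active_owners, now, max_silence=timedelta(hours=24)):
    """Return automated jobs that nobody is clearly watching.

    jobs: list of dicts with hypothetical keys "name", "owner", "last_success".
    active_owners: set of people still responsible for automation.
    now: the current time, passed in so the check is easy to test.
    max_silence: how long a job may go without a successful run.
    """
    problems = []
    for job in jobs:
        reasons = []
        if job["owner"] not in active_owners:
            reasons.append("owner no longer active")
        if now - job["last_success"] > max_silence:
            reasons.append("no recent successful run")
        if reasons:
            problems.append((job["name"], reasons))
    return problems


# Made-up inventory for illustration only.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
jobs = [
    {"name": "secret-scanner", "owner": "alice", "last_success": now - timedelta(hours=2)},
    {"name": "key-rotator", "owner": "bob", "last_success": now - timedelta(days=3)},
]
unattended = find_unattended_automation(jobs, active_owners={"alice"}, now=now)
```

Here the hypothetical `key-rotator` job is reported twice over: its owner is no longer on the active list, and it has not succeeded within the allowed window. Either signal alone would be worth a follow-up.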
Final thoughts
Organizations need to stay vigilant, especially with potential insider threats or automated activities that were initiated by people who are no longer with the company. The rate of security incidents is increasing, and our ability to quickly detect, get to the root of, and respond to an incident is more critical than ever. We all must be watchful, share information and contribute to the security community.
If you need to improve your security posture, stay on top of insider threats, or generally improve your security visibility, logs are a great place to start and having the right security tools in place can be transformational. Having been through it, we are prepared to be even stronger partners with you as we help you keep your applications and infrastructure secure.
Learn more about Sumo Logic’s security offerings.
Join our live session at RSAC to learn more about DevSecOps best practices.