In today’s cloud-native world, systems are usually accessed by users from multiple devices and in various geographic locations. Anyone who has tried to operationalize an impossible travel type alert for cloud resources will understand the myriad nuances and gotchas involved in such an endeavor.
A user may be accessing a cloud resource from a mobile device that is tied to a carrier network well away from their normal geographic location. Likewise, users may be accessing cloud resources from endpoints that are also located in the cloud and are geographically dispersed. Indeed, the cloud makes the impossible very much possible within the context of an impossible travel alert.
In addition to these dynamics, threat actors use token theft, cookie theft and other forms of credential theft, such as phishing, to gain unauthorized access to cloud resources.
The threat labs team has authored blog posts on how to protect against cloud credential theft on both Windows and Linux endpoints. However, we need to take this dynamic further - or, better put, upwards - and look at how cloud telemetry can aid in the detection of anomalous user sessions.
What are session anomalies?
According to MITRE, a session is a temporary and interactive information interchange between two or more devices communicating over a network.
If we look at this definition through a cloud lens, we can visualize a very simplified normal session flow like this:
A user utilizes a web browser, mobile device or desktop/laptop computer to access various cloud resources. This access can then be authenticated through various means, either through something like single-sign on, a cookie for an existing session, or a token/key of some kind.
Now let us consider the following scenario: our happy user receives a convincing phishing email and enters their credentials into an attacker-controlled site through an adversary-in-the-middle technique - now the threat actor has valid credentials for cloud resources and our new visualization looks something like:
We now have the dynamic of an anomalous session coming into focus as there is now one invalid and unauthorized user accessing our cloud resources, in addition to a valid and authorized user accessing these same resources.
How do we tackle the problem of hunting or alerting on this activity in our environments?
Session anomaly hunting
Now that we know what a normal session looks like versus an abnormal or perhaps malicious session, we can start to examine some dynamics that present themselves when anomalous sessions are used for accessing cloud resources.
To build on this, let’s lay out a few hypotheses for what anomalous sessions might look like in our cloud environments. We will look at specific cloud telemetry to test each of these hypotheses:
A stolen session will potentially utilize multiple IP addresses / ASNs associated with the same username
A stolen session will result in potentially multiple User Agents in use by a single username
A stolen session might occur from an IP address not previously seen in the environment
A stolen session might result in multiple geographic locations in use by a single user
A stolen session might result in logins from geographic locations not previously seen in the environment
A stolen session might result in user actions not typically performed by the user from whom the session was stolen
With these hypotheses in mind, let’s take a look at some cloud telemetry and look at some queries and detection strategies.
Entra ID
To begin testing some of our above hypotheses, we can use Azure Active Directory / Entra ID telemetry.
We can start with a query that looks at normalized and enriched Cloud SIEM data:
_index=sec_record_authentication
| where metadata_deviceEventId = "SignInLogs"
| where errorCode = "0"
| timeslice 1d
| count_distinct(device_ip_asnOrg) as distinct_ASNs,values(device_ip_asnOrg) as ASN_value by user_username,_timeslice
In this query, we are:
Looking at successful authentications from Entra ID SignInLogs
Timeslicing the data into 1-day intervals
Counting the number of distinct ASNs in use by a particular user, grouped by our timeslice
Displaying the ASN values, also grouped by the user and timeslice
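For readers who want to test the logic outside the query language, the same grouping can be sketched in Python. This is a minimal illustration only: the records and field names below are hypothetical stand-ins for normalized sign-in events, not the Cloud SIEM schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical, already-normalized sign-in records (illustrative field names).
SIGN_INS = [
    {"user": "alice", "time": "2024-05-01T08:00:00", "asn_org": "Example ISP"},
    {"user": "alice", "time": "2024-05-01T17:30:00", "asn_org": "Example ISP"},
    {"user": "bob", "time": "2024-05-01T09:15:00", "asn_org": "Example ISP"},
    {"user": "bob", "time": "2024-05-01T09:45:00", "asn_org": "Example Hosting"},
]


def asns_per_user_per_day(records):
    """Group by (user, calendar day) and collect the distinct ASN organizations."""
    buckets = defaultdict(set)
    for rec in records:
        day = datetime.fromisoformat(rec["time"]).date()
        buckets[(rec["user"], day)].add(rec["asn_org"])
    return buckets


def multi_asn_users(records, threshold=1):
    """Return the (user, day) buckets that used more than `threshold` ASNs."""
    return {
        key: sorted(asns)
        for key, asns in asns_per_user_per_day(records).items()
        if len(asns) > threshold
    }
```

Calling `multi_asn_users(SIGN_INS)` returns only the bucket for the sample user "bob", who used two ASN organizations on the same day.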
Looking at our results, we see an interesting pattern emerge:
We can see the events highlighted in green have 1 distinct ASN in use per user. However, when we look at the rows highlighted in red, we see more than 1 ASN in use by a user within a day timeslice.
This doesn’t necessarily mean that a session was stolen, but it does give us a thread to pull on. Let’s add a geographic element to this query for a more comprehensive view:
_index=sec_record_authentication
| where metadata_deviceEventId = "SignInLogs"
| where errorCode = "0"
| timeslice 1d
| count_distinct(device_ip_asnOrg) as distinct_ASNs,count_distinct(device_ip_countryName) as country_count,values(device_ip_asnOrg) as ASN_value,values(device_ip_countryName) as auth_country by user_username,_timeslice
| where distinct_ASNs > 1 AND country_count > 2
This query is adding a few elements to the query that we used above, namely:
Counting the number of distinct countries in use by a user in a particular timeslice
Adding a filter at the end to return results when multiple ASNs are in use AND more than two countries are seen
And looking at the results, we can see that something suspicious is occurring with our admin account, as it would be abnormal to have a user accessing your Entra resources from such a wide variety of geographic locations all in one day:
Another approach to the above detection dynamics is a Cloud SIEM Aggregation rule.
We can craft an aggregation rule that looks for a user utilizing more than a certain number of ASNs in a certain time period:
In addition to applying this detection approach to ASN values, we can also utilize the Operating System and User Agent information found within Azure/Entra ID telemetry:
And for User Agents, the logic will look very similar, but with a different count:
These approaches are all great, but the keen-eyed among you might have noticed that we are using static values throughout. That is, we have a set value for the number of User Agents, Operating Systems, ASN values etc that we know will trigger a Cloud SIEM Signal.
What if we did not want to set a static value for these thresholds, but wanted to baseline this type of activity and raise a signal when something occurs outside of this baseline?
That is the exact use case for Cloud SIEM Outlier Rules. When using Outlier Rules, we no longer need to define a static value for thresholds, as the system will baseline this activity for us. An outlier rule that looks for a higher than usual number of ASN values in use by a user would look something like:
This rule will baseline Azure sign-in activity and will build an hourly baseline of distinct counts of ASN values in use, per user. The rule will trigger when the number of ASNs in use by a user goes over the baseline value. It should be noted that all this detection value is provided through a query that is only two lines long, with the rest of the model parameters set in an easy-to-use graphical user interface.
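The model behind Outlier Rules is not described here, so the sketch below is only a rough illustration of the baselining idea, not the product's algorithm. It assumes a simple mean-plus-standard-deviation limit over one user's past hourly counts; the function name and the `sensitivity` and `floor` parameters are hypothetical.

```python
from statistics import mean, pstdev


def is_outlier(history, current, sensitivity=3.0, floor=1.0):
    """Flag `current` when it exceeds the baseline derived from `history`.

    `history` holds one user's distinct-ASN counts for past hourly windows.
    The limit is mean + sensitivity * stddev; `floor` is the minimum margin,
    so a perfectly flat history (stddev 0) does not flag tiny changes.
    """
    if not history:
        return False  # no baseline yet, so nothing to compare against
    margin = max(sensitivity * pstdev(history), floor)
    return current > mean(history) + margin
```

With a history of `[1, 1, 1, 2, 1, 1]`, a current count of 2 stays within the baseline while a count of 5 is flagged. A static threshold would need that cut-off chosen by hand for every user.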
Our alert for the outlier rule will look something like:
As is typical with these types of alerts, an analyst will most likely want to compare current activity with historical activity; as can be seen in the animation above, this activity can be performed by simply sliding the scale up or down within an outlier rule signal display - most impressive!
The UEBA party does not stop here, and we can also use First Seen rules to flag on successful authentications to our cloud environments from ASNs not previously associated with the user:
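The idea behind a First Seen rule can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, and the real rule also manages baseline windows and retention, which are omitted here.

```python
from collections import defaultdict


class FirstSeenTracker:
    """Remember which ASN organizations each user has authenticated from."""

    def __init__(self):
        self._seen = defaultdict(set)

    def learn(self, user, asn_org):
        """Record a value observed during the baseline period."""
        self._seen[user].add(asn_org)

    def check(self, user, asn_org):
        """Return True if this ASN is new for the user, then remember it.

        A user with no baseline at all is flagged on first use, so a real
        deployment would need a learning period before alerting.
        """
        is_new = asn_org not in self._seen[user]
        self._seen[user].add(asn_org)
        return is_new
```

After `learn("alice", "Example ISP")`, a later `check("alice", "Example Hosting")` returns `True` once and `False` on every repeat, because the value is now part of the user's history.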
AWS Elastic Kubernetes Service
We can also apply the detection and hunting approaches outlined thus far to AWS Elastic Kubernetes Service (EKS) to detect or hunt for suspicious or malicious access to our EKS workloads.
Consider the following scenario: an AWS access key and secret are stolen by a threat actor; the stolen credentials in question have access to an AWS EKS cluster and the threat actor uses these credentials to access the cluster and resources contained within it.
We can look at the following query to help detect this activity:
_collector="tr-eks-cloudwatch"
| json field=_raw "message.userAgent" as user_agent
| json "message.user.extra.arn[0]" as arn
| json "message.sourceIPs[0]" as src_ip
| json field=_raw "message.user.extra.accessKeyId[0]" as key_id
| isPublicIP(src_ip) as isPublic
| where isPublic
| values(src_ip) as src_ip by arn,key_id,user_agent
In this query, we are:
Parsing out the user agent, AWS ARN, AWS Key ID and source IP fields
Displaying results only if the source IP accessing our cluster is a public IP
Displaying the corresponding source IPs, grouped by the ARN, key ID and user agent fields
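The steps above can also be sketched in Python for offline testing. The events below are hypothetical, already-parsed audit records whose keys mirror the aliases in the query. The standard library's `is_global` check is used as an approximation of the public-IP filter and may not match the query operator in every edge case.

```python
import ipaddress
from collections import defaultdict

# Hypothetical parsed EKS audit events (placeholder ARN and key ID).
EVENTS = [
    {"arn": "arn:aws:iam::111122223333:user/dev", "key_id": "AKIAEXAMPLEKEY",
     "user_agent": "kubectl/v1.29", "src_ip": "8.8.8.8"},
    {"arn": "arn:aws:iam::111122223333:user/dev", "key_id": "AKIAEXAMPLEKEY",
     "user_agent": "kubectl/v1.29", "src_ip": "1.1.1.1"},
    {"arn": "arn:aws:iam::111122223333:user/dev", "key_id": "AKIAEXAMPLEKEY",
     "user_agent": "kubectl/v1.29", "src_ip": "10.0.4.12"},
]


def public_ips_by_identity(events):
    """Collect public source IPs per (ARN, key ID, user agent) combination."""
    grouped = defaultdict(set)
    for ev in events:
        # Skip private, loopback and other non-routable addresses.
        if ipaddress.ip_address(ev["src_ip"]).is_global:
            key = (ev["arn"], ev["key_id"], ev["user_agent"])
            grouped[key].add(ev["src_ip"])
    return {key: sorted(ips) for key, ips in grouped.items()}
```

For the sample events, the single identity maps to two public addresses, and the internal `10.0.4.12` address is filtered out.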
Looking at the results, we can see that in this case, the same ARN / key and user agent combination is being used to access our EKS cluster, but from two distinct IP addresses:
Kubernetes workloads present interesting challenges to defenders as there are many layers of telemetry involved.
In our example above, the session anomaly occurred due to stolen AWS keys. However, Kubernetes credentials can also be found on endpoints. A threat actor may gain access to an endpoint with kubectl already configured and authenticated and may then use this authenticated session to pursue their objectives.
Adding to this challenge is the fact that the kubectl binary itself can be executed from a Linux, Windows or macOS device. Anyone who has tried to wrangle process execution events from multiple operating systems and multiple telemetry sources will understand that this is not an easy task. Not only do defenders need to normalize the data to ensure broad coverage, but they also need to baseline telemetry around kubectl execution on endpoints in order to flag deviations from this established baseline.
The challenges of “clean data” as well as baselining are aided by Cloud SIEM’s normalization features and first seen rules.
The rule itself will look like:
And the resulting Cloud SIEM signal will look like:
We can see that in this case, a user ran the kubectl command line “kubectl config get-contexts”, which had not been seen during the baseline period.
In this particular Signal, the telemetry used stemmed from Jamf. However, regardless of the source, be it Linux telemetry via Laurel and Auditd, Jamf, Windows 4688, Sysmon EID 1, or process telemetry originating from an EDR product, the detection logic outlined above will cover all these various telemetry sources.
Okta
Our threat hunting and detection hypotheses outlined earlier can also be applied to Okta telemetry. Let’s take a look at a few examples.
In our first example, we’ll be looking at a user accessing applications that are behind Okta single sign on (SSO) using multiple user agents within a certain time period.
In Cloud SIEM, this rule logic will look like this:
Because every network is different, with different authentication patterns and different developer workflows, it is a good idea to baseline your data prior to crafting operational alerts.
We can look at a query like the one below to get an idea of how many User Agents are in use by a particular user within a particular time slice; we can also compare this to historical usage in order to find anomalous patterns:
_index=sec_record_authentication
| where metadata_vendor == "Okta" and description == "User single sign on to app"
| %"fields.client.userAgent.rawUserAgent" as userAgent
| where userAgent != "unknown"
| timeslice 1h
| count_distinct(userAgent) as dc_userAgents,values(userAgent) as userAgents by user_username, _timeslice
| compare with timeshift 1w
| where dc_userAgents > 2
| sort by dc_userAgents desc
And looking at our results, we can start to see some interesting patterns emerge:
We see that, within an hour time frame for our query, a user has used six distinct User Agents.
We can then compare this current usage to historical usage by utilizing the compare operator. We can see that in this case, this particular user only authenticated with one User Agent last week, but is authenticating with six distinct User Agents this week. This dynamic is worth investigating deeper, and this baselining effort allows analysts to craft more operational alerts as a result of a proactive threat hunt.
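The week-over-week comparison can be illustrated with a short Python sketch. It assumes hypothetical `(user, timestamp, user agent)` tuples and fixed one-hour windows. It mirrors the idea of the compare operator rather than its exact semantics.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical SSO events: (user, ISO timestamp, user agent).
RECORDS = [
    ("carol", "2024-05-01T10:05:00", "Browser-A"),
    ("carol", "2024-05-08T10:01:00", "Browser-A"),
    ("carol", "2024-05-08T10:20:00", "Script-B"),
    ("carol", "2024-05-08T10:50:00", "Tool-C"),
]


def hourly_agent_counts(records):
    """Count distinct user agents per (user, hour) window."""
    buckets = defaultdict(set)
    for user, ts, agent in records:
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[(user, hour)].add(agent)
    return {key: len(agents) for key, agents in buckets.items()}


def week_over_week(records, min_agents=2):
    """Return (user, hour, count now, count one week earlier) for busy windows."""
    counts = hourly_agent_counts(records)
    rows = []
    for (user, hour), now in counts.items():
        if now > min_agents:
            previous = counts.get((user, hour - timedelta(weeks=1)), 0)
            rows.append((user, hour, now, previous))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

For the sample data, the only row returned shows three distinct user agents in the current window against one in the same window a week earlier.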
In addition to authentication-level anomalies, we can also look for suspicious access patterns that occur through Okta.
A good example here is a user accessing an application behind Okta SSO not previously seen in the baseline period. In Cloud SIEM, this rule logic will look like:
When such alerts or Signals are received, analysts will need to pivot off user name, application, IP address and other values in order to gain more information to confirm whether the activity is expected or malicious.
One way that we can perform deeper investigations into suspicious Okta events is by looking at whether a particular MFA request occurred from a different geographic location than the access of Okta applications behind SSO. That is, a user may be located in one country and may accept an MFA prompt from a threat actor holding valid credentials who is located in a different country. Luckily for us defenders, Okta captures both these events with the description of “Authentication of user via MFA” - we can use this telemetry with the following query:
_index=sec_record_authentication
| where metadata_vendor = "Okta"
| where description = "Authentication of user via MFA"
| %"fields.target.2.detailentry.methodtypeused" as method_used
| timeslice 1h
| values(method_used) as mfa_method,values(device_ip_countryName) as country,values(device_ip_asnOrg) as ASNs,count_distinct(device_ip_countryName) as country_count by user_username,_timeslice,description
| where country_count > 2
| sort by country_count desc
Looking at the results, we see our user performing multi factor authentication from different countries:
The “Get a push notification” event corresponds to a user accepting a push notification and the password authentication occurs on the browser end - if we dig into and expand these events, we would see that the push acceptance occurred in Canada, but the password authentication via browser occurred in a different country.
To make things easier for analysts, we can use a parameterized query:
_index=sec_record_authentication
| where metadata_vendor == "Okta"
| where user_username = {{username}}
| where description = "Authentication of user via MFA"
| %"fields.target.2.detailentry.methodtypeused" as method_used
| timeslice 1h
| values(method_used) as mfa_method,values(device_ip_countryName) as country,values(device_ip_asnOrg) as ASNs,count_distinct(device_ip_countryName) as country_count by user_username,_timeslice,description
| where country_count > 2
| sort by country_count desc
Which will generate an input box to make editing the query much less cumbersome:
Amazon Web Services (AWS)
We’ve been making our way through some of the hunting hypotheses outlined at the start of this blog. We can wrap up by looking at how these techniques can be applied to an AWS environment as well.
Let’s imagine a scenario where an analyst is looking at the following Cloud SIEM signal:
This Signal conveys that someone on a Linux host ran the command: “cat .aws/credentials” in order to access the AWS CLI credential file on this host.
As a next step, the analyst must figure out if this action was malicious or intended. The analyst can look at host-level artifacts and work their way up and down the process chain to determine if this access of cloud credential material was done through planned work or malicious activity. In our fictitious scenario, the analyst determines that this cloud credential access was at least suspicious on the host level. Now, the analyst must turn to CloudTrail telemetry to gain a deeper understanding of how these potentially stolen keys were used.
We can look at the following query:
_index=sec_record_audit
| where metadata_product = "CloudTrail"
| %"fields.userIdentity.accessKeyId" as accesskeyId
| where !isBlank(accesskeyId) and !isBlank(user_username)
| timeslice 10m
| count_distinct(http_userAgent) as ua_count,values(http_userAgent) as user_agents,count_distinct(device_ip_asnOrg) as ASN_Count,values(device_ip_asnOrg) as ASNs,values(action) as APICalls by _timeslice,accesskeyId,user_username
| where ua_count > 1 AND ASN_Count > 1
In this query we are looking at CloudTrail telemetry and are timeslicing our data and looking for a single access key id and username using multiple user agents and multiple ASN values to connect to our AWS infrastructure.
Looking at our results, we see an interesting dynamic emerge:
In this case, we can see that the potentially stolen access key was used from two different ASNs as well as through two different versions of the AWS CLI. One of the ASN values belongs to the AWS IP space and the analyst can consider that less suspicious, as the workstation from which the AWS keys were potentially stolen was an EC2 instance running in the AWS cloud. However, the other ASN value is new and unexpected. At this point, the analyst can be more confident that credentials were indeed stolen from the host and that a deeper investigation is warranted.
Another way to approach this kind of threat hunt is to build a “session profile” for a given user. In the case of CloudTrail telemetry, some good starting points for building such a profile are a combination of ASN, geolocation, user agent and user name. Once we have a session profile in one field, we can compare the various session profiles with each other to potentially discover anomalous or suspicious sessions.
In query form, this will look something like:
_index=sec_record_audit
| where metadata_product = "CloudTrail"
| where !isBlank(user_username) and !isBlank(device_ip_asnorg)
// Make an authentication profile consisting of ASN, Country, User Agent and Username
| concat(device_ip_asnorg,",",device_ip_countryName,",",http_userAgent,",",user_username) as session_profile
// Find the least common patterns in the authentication profile
| logreduce field=session_profile by user_username criteria=leastcommon
| values(user_username) as username by _count,signature
| where _count < 3
In this query, we are using the concat operator to build a field called “session_profile”. We are then using the LogReduce operator to find the least common signatures among the various session_profile values, and are displaying, by username, the signatures that occur fewer than three times. The idea here is to account for things like minor version changes of tools such as the AWS CLI that appear in the User Agent field.
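The session-profile idea can be approximated in Python for experimentation. This sketch is not LogReduce: it swaps in exact-match counting, and a crude regular expression that wildcards version numbers in the user agent stands in for signature clustering. The event fields and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical CloudTrail-like events for one user.
EVENTS = (
    [{"asn_org": "Example Cloud", "country": "US",
      "user_agent": "aws-cli/2.15.30", "user": "dave"}] * 3
    + [{"asn_org": "Example Cloud", "country": "US",
        "user_agent": "aws-cli/2.15.31", "user": "dave"}] * 2
    + [{"asn_org": "Example Hosting", "country": "NL",
        "user_agent": "aws-cli/2.13.0", "user": "dave"}]
)


def session_profile(event):
    """Join ASN, country, user agent and username into one comparable string."""
    # Replace version numbers with "*" so minor upgrades share one profile.
    agent = re.sub(r"\d+(?:\.\d+)*", "*", event["user_agent"])
    return ",".join((event["asn_org"], event["country"], agent, event["user"]))


def rare_profiles(events, max_count=3):
    """Return the profiles seen fewer than `max_count` times, with their counts."""
    counts = Counter(session_profile(ev) for ev in events)
    return {profile: n for profile, n in counts.items() if n < max_count}
```

For the sample events, the two AWS CLI patch versions collapse into one common profile, and only the "Example Hosting" profile from the Netherlands is returned as rare.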
Looking at the results, we see that out of all the various session profile combinations, one is flagged, meaning that this particular session profile was not used as often as other session profiles to access this particular AWS tenant:
We can then pull on this thread a little bit and look at the session profiles in use by the user from the results returned above:
_index=sec_record_audit
| where metadata_product = "CloudTrail"
| where !isBlank(user_username) and !isBlank(device_ip_asnorg)
| where user_username = "{{user_username}}"
// Make an authentication profile consisting of ASN, Country, User Agent and Username
| concat(device_ip_asnorg,",",device_ip_countryName,",",http_userAgent,",",user_username) as session_profile
| values(session_profile) as session_profile by user_username
Looking at the results, we see three distinct session profiles in use:
Digging deeper into the data, we can see that the “Clouvider Limited” ASN, combined with access from the Netherlands, raises suspicions and gives security teams more threads to pull on based on cloud session anomalies.
Conclusion
In this blog, we have highlighted some proactive threat-hunting hypotheses relating to cloud session anomalies resulting from credential theft. We have suggested a number of hunting, alerting and detection approaches and have shown how various Cloud SIEM features, including advanced alerting capabilities via First Seen and Outlier rules, can be used to find this type of malicious activity in your networks.
To learn more about Sumo Logic Cloud SIEM, check out the product or click through an interactive demo.