Instead of just looking for known bad things, how can experts use KQL in Sentinel to find brand new, unknown threats by noticing unusual patterns in how people or computers act?

To find brand new, unknown threats using KQL in Sentinel, experts employ behavioral anomaly detection, which focuses on identifying deviations from established normal behavior patterns of entities like users, computers, or applications. Instead of searching for specific malicious signatures, this method builds a baseline of typical activity and then flags anything significantly unusual.

KQL, or Kusto Query Language, is the powerful query language used in Microsoft Sentinel for data exploration and analysis. Sentinel ingests vast amounts of log data, such as `SigninLogs` for user authentication, `SecurityEvent` for Windows events, and `AzureActivity` for Azure resource operations. For anomaly detection, this data must first be normalized, meaning it is parsed into a consistent schema, making it easier to query and analyze uniformly across different data sources.

Establishing a baseline is the crucial first step, defining what constitutes "normal" behavior for an entity. This involves using KQL to analyze historical data over a period long enough to capture regular cycles, such as the 30 days used in the example here. Time-series analysis is frequently used, where data is grouped into time windows using functions like `bin(TimeGenerated, 1h)` to aggregate events per hour. Statistical aggregations such as `count()`, `dcount()`, `avg()`, `sum()`, `stdev()`, and `percentile()` are applied to fields like `Activity`, `SourceIP`, `DestinationIP`, `ResourceGroup`, or `AccountUPN` to understand typical volumes, frequencies, and unique values associated with an entity. For example, to baseline a user's typical login count, a query might look like: `SigninLogs | where TimeGenerated between (ago(30d) .. ago(1d)) | summarize DailyLogins = count() by bin(TimeGenerated, 1d), UserPrincipalName`. This creates a historical daily login count for each user.
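As a minimal sketch of turning those daily counts into a per-user baseline, the query below adds a second `summarize` that reduces each user's history to a mean, standard deviation, and 95th percentile. The 30-day window, the daily granularity, and the output column names (`AvgLogins`, `StDevLogins`, `P95Logins`, `ActiveDays`) are assumptions chosen for illustration.

```kusto
// Assumptions: 30-day lookback ending one day ago, daily buckets.
SigninLogs
| where TimeGenerated between (ago(30d) .. ago(1d))
// Step 1: one row per user per day with that day's sign-in count.
| summarize DailyLogins = count() by bin(TimeGenerated, 1d), UserPrincipalName
// Step 2: collapse each user's daily counts into baseline statistics.
| summarize
    AvgLogins   = avg(DailyLogins),
    StDevLogins = stdev(DailyLogins),
    P95Logins   = percentile(DailyLogins, 95),
    ActiveDays  = count()   // number of days that had at least one sign-in
    by UserPrincipalName
```

Days with no sign-ins produce no row in step 1, so the averages here describe active days only. If idle days should count as zeros, `make-series` with a default value is one way to fill them in.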

Once a baseline is established, KQL is used to detect deviations and anomalies by comparing current activity against it, looking for statistically significant differences. This includes volume-based anomalies, such as a user account suddenly initiating 100 times more network connections than their historical average. KQL can calculate the standard deviation (`stdev()`) and the mean (`avg()`) of past behavior and identify current activity that falls outside a defined number of standard deviations, for example, `where CurrentActivity > (AvgActivity + 3 * StDevActivity)`.

Behavioral context anomalies include location/IP-based anomalies, where a user logs in from a country or IP address never seen before. In `SigninLogs` these are identified by comparing the `IPAddress` (or `Location`) of a sign-in against the historical set of distinct values for that user. Example: `SigninLogs | summarize KnownIPs = make_set(IPAddress) by UserPrincipalName` builds the known IPs; after joining that set to current sign-ins, `where not(set_has_element(KnownIPs, IPAddress))` keeps only the unseen ones. Time-of-day/day-of-week anomalies detect actions outside normal working hours, for example, an administrative account active at 3 AM on a Sunday when it is normally active on weekdays from 9 to 5, using `hourofday(TimeGenerated)` and `dayofweek(TimeGenerated)`. Resource access anomalies identify a user or service principal accessing a resource type or specific resource it has historically never interacted with.

Peer group analysis compares an entity's behavior against others in its group, like one server processing 10x more data than similar servers. KQL can `join` data to compare an entity's metrics to the average of its group. Rare events are actions that are inherently uncommon or have zero historical occurrences for a specific entity, such as a standard user accessing a highly sensitive share for the first time. They are often found by aggregating with `count()` or `dcount()` over `Activity` types for a given `User` and keeping the combinations with little or no history.
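The volume comparison can be sketched as a single query: a `let` statement holds the baseline, and the last day's activity is joined against it. The three-sigma threshold comes from the text above; the seven-day minimum history and the one-day "current" window are assumptions for illustration.

```kusto
// Baseline: per-user mean and standard deviation of daily sign-ins.
let baseline =
    SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | summarize DailyLogins = count() by bin(TimeGenerated, 1d), UserPrincipalName
    | summarize AvgLogins = avg(DailyLogins), StDevLogins = stdev(DailyLogins),
                ActiveDays = count() by UserPrincipalName
    | where ActiveDays >= 7;   // assumed minimum history for a usable baseline
// Current window: the last day, compared against the baseline.
SigninLogs
| where TimeGenerated > ago(1d)
| summarize CurrentLogins = count() by UserPrincipalName
| join kind=inner (baseline) on UserPrincipalName
| where CurrentLogins > AvgLogins + 3 * StDevLogins
| project UserPrincipalName, CurrentLogins, AvgLogins, StDevLogins
```

A similar shape works for the never-seen-before IP check. This sketch assumes the `IPAddress` and `Location` columns of `SigninLogs` and the same 30-day history:

```kusto
// Historical set of source IPs per user.
let knownIPs =
    SigninLogs
    | where TimeGenerated between (ago(30d) .. ago(1d))
    | summarize KnownIPs = make_set(IPAddress) by UserPrincipalName;
// Sign-ins in the last day from an IP that is not in the user's set.
SigninLogs
| where TimeGenerated > ago(1d)
| join kind=inner (knownIPs) on UserPrincipalName
| where not(set_has_element(KnownIPs, IPAddress))
| summarize FirstSeen = min(TimeGenerated), Attempts = count()
    by UserPrincipalName, IPAddress, Location
```

Both queries use an inner join, so users with no history in the baseline window are silently dropped. Whether a brand-new account should itself count as an anomaly is a design choice left open here.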

KQL also includes built-in functions for time series anomaly detection. The `series_decompose_anomalies()` function decomposes a series into seasonal, trend, and residual components, then flags points whose residual deviates significantly from the expected pattern; it returns three arrays holding the anomaly flags, the anomaly scores, and the expected baseline. For example: `MyTable | make-series Value = count() default = 0 on TimeGenerated from ago(7d) to now() step 1h | extend (anomalies, score, baseline) = series_decompose_anomalies(Value)`. The simpler `series_outliers()` function scores each point with Tukey's fence test and returns a single array of scores rather than three outputs. The `autocluster` plugin, invoked as `evaluate autocluster()`, finds common patterns (clusters) of attributes in the data and can indirectly highlight unique rows that do not fit into any dominant cluster, pointing to unusual events.
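To turn the series output into rows an analyst can read, the arrays are usually expanded back out with `mv-expand`. The following is a sketch under assumptions: it uses `SigninLogs` as the source table, builds one series per user, and passes 1.5 as the anomaly threshold (the function's documented default); the output column names are illustrative.

```kusto
SigninLogs
// One hourly series per user over the last 7 days; empty hours become 0.
| make-series SigninCount = count() default = 0
    on TimeGenerated from ago(7d) to now() step 1h
    by UserPrincipalName
// Flags are +1 (spike), -1 (dip), or 0 (normal) for each point.
| extend (AnomalyFlag, AnomalyScore, ExpectedCount) =
    series_decompose_anomalies(SigninCount, 1.5)
// Expand the parallel arrays back into one row per hour.
| mv-expand TimeGenerated to typeof(datetime),
            SigninCount to typeof(long),
            AnomalyFlag to typeof(int),
            AnomalyScore to typeof(double),
            ExpectedCount to typeof(double)
| where AnomalyFlag != 0
| project TimeGenerated, UserPrincipalName, SigninCount, ExpectedCount, AnomalyScore
```

Building a series per user can be expensive in a large tenant, so in practice the input is often filtered first to a subset of accounts, for example privileged ones.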

Once an anomaly detection query is refined in KQL, it can be deployed in Sentinel as a scheduled analytics rule. These rules run the KQL query at a set interval and generate alerts when the defined anomalous conditions are met; Sentinel can then group those alerts into incidents. This allows security operations center (SOC) analysts to investigate these newly discovered, potentially unknown threats. Experts also use KQL proactively in hunting queries, iteratively refining their search for subtle anomalies that might indicate emerging attack techniques. Together, these techniques let experts move beyond signature-based detection and identify novel threats by observing unusual patterns of behavior across an organization's digital estate.
