Sentinel data lake: Old and New Table Tiers

Microsoft has just introduced Sentinel data lake (SDL) in public preview, and there’s already a flurry of excitement in the cybersecurity world. Most community blog posts so far focus on how to turn it on and when you might want to use it, but very few delve into how it will change your day-to-day experience – especially when it comes to how your data is organized and accessed. 

Rather than providing step-by-step instructions and how-to’s, this post will break down what the new data lake means at table level: how data is structured, how the different components interact, and what you should consider if you want to enable the data lake for your existing Sentinel environment. If you’re looking for information on where to find specific data after enabling data lake and how the architecture is evolving, you’re in the right place. 

The progression leading up to Sentinel data lake 

To truly appreciate the value of Sentinel data lake, it’s important to understand how Sentinel table plans (tiers) have evolved – what existed before, what was already available, and what has changed with the introduction of the SDL. 

Looking back, it’s clear that Microsoft had been moving toward alternative storage options for some time. Initially, there was only the single ‘Analytics’ tier, but then the ‘Basic’ log tier was introduced. This offered lower-cost storage and good performance, though at the expense of extra query costs and limited feature support. Despite its differences, ‘Basic’ logs still functioned much like ‘Analytics’ logs. 

Not long after – and still a relatively new feature – Microsoft launched the ‘Auxiliary’ tier. This log type gained more traction than ‘Basic’ due to even lower storage costs but came with significant tradeoffs: additional limitations, reduced feature compatibility, and slower query performance. These restrictions were a telltale sign that the data was stored in a fundamentally different way – in fact, ‘Auxiliary’ logs were already backed by Data Lake technology, though this wasn’t widely publicized. 

So, the recent announcement isn’t just about offering another lower-cost log option; that already existed with Auxiliary logs. The true innovation with Sentinel data lake lies in Microsoft offering a cost-effective, easy-to-use logging option combined with direct and effective access to the data. This empowers companies to perform extensive data analysis, including use cases that demand large volumes of data – such as AI and machine learning workloads:

  • Easy to use: You can switch a table from the ‘Analytics’ tier to the new ‘Data Lake’ tier without touching your logging infrastructure. No connector or DCR changes are needed. 
  • Direct access: You can query all your data with KQL queries or Microsoft-hosted Jupyter notebooks instead of running cumbersome Search/Restore jobs. 

Because the major change isn’t in how or where the data is stored, but rather in how it’s accessed and used, most organizations won’t need a redesign of their logging architecture. If you’ve been using Analytics, Basic, or Auxiliary logs, you can generally keep your existing setup and simply extend it using the new capabilities Sentinel data lake provides. 

Sentinel SIEM vs Sentinel data lake tiers 

Enabling Data Lake introduces a new Table Management feature in Defender XDR, letting you work with the new ‘Data Lake’ tier and move tables between ‘Analytics’ and ‘Data Lake’. 

Once activated, you can manage tables both from the Log Analytics workspace’s Table page and from Defender XDR’s Table Management. Even though a banner may state that table configuration must now be done in Defender XDR, you can still make changes in Log Analytics; however, some features will be unique to each specific table management solution. 

Let’s start with the available table tiers and how they differ: 

  1. Basic table tier: Basic logs are not supported by Sentinel data lake. They appear on the Table Management page but are greyed out and cannot be configured, which is why the ‘Basic’ tier is not listed on the Defender side of the image. 
  2. Analytics tier: This is Sentinel’s core log tier, storing all critical security data and providing advanced analytics. It’s available and consistent in both Log Analytics and Defender XDR Table Management, with both platforms using the same terminology.
  3. Auxiliary/data lake tier: Despite the different names, these terms refer to fundamentally the same log type, which often causes confusion. An Auxiliary table created in Log Analytics is listed as ‘Data Lake’ in Defender XDR, and a table switched from ‘Analytics’ to ‘Data Lake’ in Defender XDR shows up as ‘Auxiliary’ back in Log Analytics. Although the two appear interchangeable, there are some differences behind the scenes. For example, Auxiliary tables created directly in the Azure Portal do not support ‘dynamic’ fields, whereas Data Lake tier tables that were switched over from the Analytics tier do. Microsoft now recommends against creating an Auxiliary table directly; instead, create an Analytics table and then switch it to the Data Lake tier to achieve the same result.

The diagram shows in which portal each tier switch can be made.

To move a table from the Basic tier to Data Lake / Auxiliary, you’ll first need to switch it from Basic to Analytics in Log Analytics, and then from Analytics to Data Lake in Defender XDR. 
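The switching rules above can be sketched as a small lookup, which makes the two-step path from Basic explicit. This is only a model of what this post describes, not an official list of every transition Microsoft supports; the `SWITCHES` table and the `plan_switch` helper are hypothetical names introduced for illustration.

```python
from collections import deque

# Tier switches described in this post, keyed by (from_tier, to_tier).
# The value is the portal where the switch is made. Assumption: only the
# transitions mentioned in the text are modelled here.
SWITCHES = {
    ("Basic", "Analytics"): "Log Analytics",
    ("Analytics", "Data Lake"): "Defender XDR",
    ("Data Lake", "Analytics"): "Defender XDR",
}


def plan_switch(source, target):
    """Return the shortest list of (from_tier, to_tier, portal) steps.

    Returns an empty list when source equals target, and None when the
    modelled switches offer no path between the two tiers.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        tier, steps = queue.popleft()
        if tier == target:
            return steps
        for (src, dst), portal in SWITCHES.items():
            if src == tier and dst not in seen:
                seen.add(dst)
                queue.append((dst, steps + [(src, dst, portal)]))
    return None


for step in plan_switch("Basic", "Data Lake"):
    print(step)
```

Running it for Basic to Data Lake lists the Log Analytics step first and the Defender XDR step second, matching the order described above.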

Data location and mirroring 

Table management is handled behind the scenes, with data storage, copying, and movement managed transparently to ensure a seamless experience. However, understanding where your data is stored remains important, as it affects default retention periods, available features, and potential limitations. 

The log types have been divided into six groups to make the diagram easier to understand. Each group will be explained after the diagram. 

Log types and their storage locations

  1. Supported Sentinel Analytics table

The Analytics tier is a key component of Sentinel, delivering the most value for security operations. That’s why it’s important to understand how its behavior changes once Data Lake is enabled: 

  • Automatic mirroring: Once Microsoft Sentinel data lake is enabled, all supported Analytics tables are automatically mirrored to the Data Lake tier at no additional cost, starting from the point of activation. 
  • No retroactive mirroring: Only data created after Data Lake is enabled will be copied; existing data is not mirrored retroactively. 
  • Default retention: By default, analytics data is mirrored to the Data Lake with the same retention period (no extra cost), but retention in the Data Lake can be extended for up to 12 years at a low cost. 
  • Switching tiers: Only an Analytics table can be converted to the ‘Data Lake’ tier (Basic logs are not supported in the new Table Management tool). After the switch, new data is no longer ingested into the Analytics tier, but data already stored there remains available until it expires under the retention settings. 

The logs are present both in Sentinel and in Sentinel data lake. 
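To make the mirroring and retention rules concrete, here is a minimal sketch that answers "where can I still find this event?" for a supported Analytics table. It assumes retention is counted in whole days from the event time and that the data lake retention defaults to the Analytics retention; the function name and parameters are hypothetical and do not correspond to any Microsoft API.

```python
from datetime import datetime, timedelta, timezone


def stores_holding(event_time, now, lake_enabled_at, analytics_days, lake_days=None):
    """Return the set of stores that still hold an event from a supported Analytics table."""
    if lake_days is None:
        lake_days = analytics_days  # default: the mirror keeps the same retention
    age = now - event_time
    stores = set()
    if age <= timedelta(days=analytics_days):
        stores.add("analytics")
    # No retroactive mirroring: only events from after enablement are in the lake.
    if event_time >= lake_enabled_at and age <= timedelta(days=lake_days):
        stores.add("data_lake")
    return stores


# Illustrative dates only, not taken from a real environment.
enabled = datetime(2025, 8, 1, tzinfo=timezone.utc)
now = datetime(2025, 12, 1, tzinfo=timezone.utc)

for day in (datetime(2025, 7, 15, tzinfo=timezone.utc),
            datetime(2025, 8, 15, tzinfo=timezone.utc),
            datetime(2025, 11, 1, tzinfo=timezone.utc)):
    print(day.date(), sorted(stores_holding(day, now, enabled, analytics_days=90, lake_days=365)))
```

With 90 days of Analytics retention and an extended data lake retention, the event from before enablement is gone from both stores once it ages out of Analytics, which is the practical consequence of mirroring not being retroactive.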

  2. Unsupported Sentinel Analytics table

Not all Analytics tables are supported by Data Lake, which can lead to some issues: 

  • No legacy tables in data lake: Legacy custom tables that aren’t based on Data Collection Rules (DCR) are not supported. These tables can’t be switched to the Auxiliary/Data Lake tier and won’t be mirrored to the Data Lake. 
  • Table management for legacy tables: While these unsupported tables can’t use the new Data Lake tier, you can still manage their retention settings in Defender XDR through the new Table Management feature. 
  • AzureDiagnostics table: Surprisingly, AzureDiagnostics is currently not supported for the Data Lake tier. This matters because the table often holds large volumes of cloud-based logs. 

These logs are only present in Sentinel. 

  3. Basic table

Basic tables never gained much traction. A few organizations tried them, but the limited feature set often failed to justify the lower cost, which may explain Microsoft’s lack of focus on this log type: 

  • No data lake support: Basic logs aren’t mirrored to or compatible with the Data Lake tier. Like unsupported Analytics tables, they cannot use Data Lake. 
  • No table management in Defender XDR: Although Basic logs appear in Defender XDR, all configuration options are disabled. Any changes must be made in the Log Analytics workspace. 

Since the introduction of SDL, Microsoft has begun recommending that users move away from the Basic tier to either the Analytics tier or the new ‘Data Lake’ tier. 

The logs are only present in Sentinel.

  4. Sentinel Auxiliary table / Data Lake tier

Auxiliary logs are inherently part of the Data Lake. In the Defender XDR portal, Sentinel’s Auxiliary logs appear as Data Lake tier logs.  

  • Data Lake support: Since these logs are designed for the Data Lake, they are stored exclusively in this tier, making them a cost-effective storage option. 
  • Moving to SDL: All the Auxiliary tables you created prior to enabling SDL will automatically be migrated to SDL once it is activated. No manual steps are required on your part. 

The logs are only present in Sentinel data lake. 

  5. Defender logs

These events are produced by the various Defender products. 

  • Stored in Defender: Defender generates these logs and retains them for 30 days within its own environment.
  • No direct data lake support: These logs cannot be moved to or configured as ‘Data Lake’ tier logs, so they are not stored or replicated in the Data Lake.
  • Table types: Defender provides two types of log tables. The ‘XDR’ table type has a fixed 30-day retention that cannot be customized in the portal. Tables marked as ‘Sentinel’ can have their retention adjusted; if you select a retention period longer than 30 days, the logs are forwarded to Sentinel as Analytics logs, which generates ingestion costs.
  • Streaming API: Adjusting retention and forwarding logs to Sentinel requires a Streaming API slot for some tables. Without a free Streaming API slot, you will see an error when trying to change the retention. 

Data forwarded to Sentinel as ‘Analytics’ data is mirrored to the Data Lake, and that includes these Defender tables, so this is one way to get Defender data into the ‘Data Lake’ tier. However, there is no native way to bypass the costly ‘Analytics’ tier. Workarounds exist that deliver the data directly to the Data Lake tier through custom DCR-based tables: either route the Advanced Hunting data through an Event Hub and then into Sentinel, or use log splitting in a DCR to redirect all logs to the custom table while preventing them from reaching Sentinel’s native Advanced Hunting tables.

The logs are only present in Defender; if exported, they appear in Sentinel and are mirrored to Data Lake. 
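The retention behaviour of Defender tables described in this section can be summarized as a small decision function. This is a sketch of the rules as stated in this post, not of any Defender API: the function, its parameters, and the returned labels are hypothetical, and it assumes a ‘Sentinel’ type table left at 30 days or less simply stays in Defender.

```python
XDR_RETENTION_DAYS = 30  # Defender's own retention, as stated in this post


def defender_table_outcome(table_type, retention_days,
                           needs_streaming_slot=False, free_streaming_slots=0):
    """Describe where a Defender table's data ends up for a requested retention.

    table_type is 'XDR' or 'Sentinel', mirroring the two table types in the post.
    """
    if table_type == "XDR":
        if retention_days != XDR_RETENTION_DAYS:
            raise ValueError("'XDR' tables have a fixed 30-day retention")
        return {"stored_in": ["Defender"], "ingestion_cost": False}
    if table_type != "Sentinel":
        raise ValueError(f"unknown table type: {table_type!r}")
    if retention_days <= XDR_RETENTION_DAYS:
        # Assumption: nothing is forwarded while retention stays within 30 days.
        return {"stored_in": ["Defender"], "ingestion_cost": False}
    if needs_streaming_slot and free_streaming_slots < 1:
        raise RuntimeError("no free Streaming API slot for this table")
    # Longer retention forwards the logs to Sentinel as Analytics data,
    # which is then mirrored to the Data Lake.
    return {
        "stored_in": ["Defender", "Sentinel Analytics", "Data Lake (mirror)"],
        "ingestion_cost": True,
    }


print(defender_table_outcome("Sentinel", 90))
```

The point the sketch makes is that the only native route from Defender into the Data Lake passes through the Analytics tier, and therefore through ingestion costs.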

  6. Microsoft Asset logs

Microsoft designates certain logs as asset logs. Once Data Lake is enabled in your Azure tenant, these logs are automatically sent to the Data Lake. 

The following types of data relate to your Microsoft assets: 

  • Microsoft Entra
  • Microsoft 365
  • Azure Resource Graph 

The logs are only present in Sentinel data lake. 
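As a compact recap of the six groups, the storage locations can be written down as a lookup table. The group keys and the helper function are hypothetical labels chosen for this sketch; the locations themselves are taken from the summaries above, with exported Defender logs listed as a separate entry.

```python
# Where each log group is stored once Sentinel data lake is enabled,
# according to the group summaries in this post.
STORAGE = {
    "supported_analytics": {"Sentinel", "Data Lake"},
    "unsupported_analytics": {"Sentinel"},
    "basic": {"Sentinel"},
    "auxiliary_data_lake": {"Data Lake"},
    "defender": {"Defender"},
    "defender_exported": {"Defender", "Sentinel", "Data Lake"},
    "asset": {"Data Lake"},
}


def groups_stored_in(location):
    """Return the log groups that have a copy in the given location, sorted by name."""
    return sorted(group for group, places in STORAGE.items() if location in places)


print(groups_stored_in("Data Lake"))
print(groups_stored_in("Sentinel"))
```

A table like this is handy when deciding where an analyst should look first for a given data source.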

Data access 

Now that we understand where data is stored, it’s crucial to recognize how access controls have changed – and how they will continue to change according to Microsoft’s documentation. 

The diagram below illustrates where specific Sentinel data sets are accessible both before and after enabling Sentinel Data Lake. Every line depicts an access path that works in my current experience after enabling SDL, while dashed lines mark those that, according to Microsoft, will eventually disappear. Although some documentation mentions this removal, the data remains accessible from those locations for now.

Where different Sentinel data sets can be accessed from

Key updates and changes based on my tests: 

  • Analytics: Supported Analytics tables are now automatically duplicated to Data Lake, which adds an extra access method once Sentinel Data Lake is activated. They are accessible from every portal. 
  • Basic: This log type isn’t available in Data Lake, but it is accessible in Sentinel and through the Advanced Hunting page. 
  • Auxiliary logs: At present, you can query these logs from all three places: Sentinel SIEM, the Advanced Hunting page, and the new Sentinel Data Lake. 
  • Legacy tables: These logs aren’t compatible with Data Lake and won’t appear there at all, but they are available in Sentinel and Advanced Hunting. 

As per Microsoft’s documentation: 

  • Basic: Microsoft states that Basic logs won’t be supported in the Table Management solution, and they won’t be accessible through Data Lake KQL or Advanced Hunting.   –>   However, Basic logs still show up in Advanced Hunting at this time. 
  • Auxiliary: Microsoft notes that: “When a customer has onboarded to both Defender and Microsoft Sentinel onboards to the Data Lake, auxiliary log tables are no longer visible in Microsoft Defender’s Advanced hunting or in the Microsoft Sentinel Azure portal.”   –>   My tests show that auxiliary logs remain fully accessible in the Sentinel portal. They do disappear from the Advanced Hunting Schema tab and code autocomplete stops working for them, but the data is still searchable. 

Contrary to Microsoft’s guides, all log types remain accessible (for now), but the disappearance of Auxiliary logs from the Schema tab signals that changes are imminent. Although you can still access Auxiliary log data for now, it’s important to begin preparing your engineers and analysts to adapt to these upcoming changes and know where to find specific data in the future. 
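One way to prepare for these changes is to keep the observed access paths next to the removals Microsoft has documented and compute what would remain. The sketch below encodes only what this post reports; the dictionary names and the helper are hypothetical, and the portal labels are shorthand for Sentinel SIEM, the Advanced Hunting page, and Sentinel Data Lake.

```python
# Access paths observed in my tests after enabling SDL.
OBSERVED = {
    "Analytics": {"Sentinel", "Advanced Hunting", "Data Lake"},
    "Basic": {"Sentinel", "Advanced Hunting"},
    "Auxiliary": {"Sentinel", "Advanced Hunting", "Data Lake"},
    "Legacy": {"Sentinel", "Advanced Hunting"},
}

# Access paths that, per Microsoft's documentation, should not be available.
DOCUMENTED_REMOVALS = {
    "Basic": {"Advanced Hunting", "Data Lake"},
    "Auxiliary": {"Advanced Hunting", "Sentinel"},
}


def access_outlook(log_type):
    """Split the observed access paths into those at risk and those expected to remain."""
    observed = OBSERVED[log_type]
    removals = DOCUMENTED_REMOVALS.get(log_type, set())
    return {
        "at_risk": sorted(observed & removals),
        "remaining": sorted(observed - removals),
    }


for log_type in OBSERVED:
    print(log_type, access_outlook(log_type))
```

For Auxiliary logs, the result leaves Data Lake as the only access path expected to remain, which is why analysts should get used to querying them there.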

Sentinel Data Lake is still in preview, so you should anticipate frequent changes, ongoing enhancements to existing features, new capabilities, and further updates from Microsoft. Be sure to plan your environment with this evolving landscape in mind.

For those looking to delve deeper into how Sentinel data lake can transform and optimize your logging and monitoring pipeline, BlueVoyant is here to assist. Our team of experts is ready to help you navigate the complexities and unlock the full potential of your data infrastructure.