Understanding XQL Data Sources and Structures in Cortex XDR
Introduction
In the realm of cybersecurity, having a robust data querying capability is essential. Cortex XDR (Extended Detection and Response) offers a powerful querying language known as XQL (Extended Query Language). This article dives into the foundational elements of XQL, focusing on data sources, structure, and syntax. By the end of this guide, you will understand how to utilize Cortex XDR for effective data analysis and incident response.
Understanding XQL Data Sources
Every XQL query operates against specific data sources. In Cortex XDR, data sources are primarily categorized into two types: data sets and presets. Each category offers unique functionalities that enhance query efficiency and accuracy.
What are Data Sets?
Data sets are collections of data stored within the Cortex XDR system. They contain raw events reported by the XDR agent as well as logs from a variety of sources. There are several types of data sets, including:
- System Data Sets: Built-in data sets that come pre-configured with the product. For instance, the xdr_data data set stores endpoint-related data.
- User Data Sets: Custom data sets created by users, often by utilizing the target stage to save the results of specific queries.
- Lookup Data Sets: Data sets created by importing CSV, TSV, or JSON files. These are typically used for referencing and querying additional information.
- Raw Data Sets: Collected data from third-party sources, including network logs from NGFWs (Next-Generation Firewalls) and other external sources.
- Correlation Data Sets: Generated from configured correlation rules within Cortex XDR.
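For illustration, each of these types is queried the same way, by naming the data set in the first stage; `xdr_data` is the built-in endpoint data set, while a lookup or user data set would be referenced by whatever name you gave it:

```
dataset = xdr_data
| limit 10
```

The `limit` stage simply caps the number of returned rows while exploring a data set.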
What are Presets?
Presets, on the other hand, are subsets of data sets. They consist of extracted fields and provide an efficient means of querying by encapsulating only the necessary information. The benefits include:
- Efficiency: By using presets, users can query against a smaller, relevant set of fields, improving the speed and relevance of results.
- Types of Presets:
- Regular Presets: Typically consist of event logs categorized by specific operations like process execution or file operations.
- Story Presets: These combine logs from multiple sources into a unified schema, beneficial for comprehensive analytics. Examples include network story and authentication story.
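As a sketch, a preset is referenced with the `preset` stage instead of the `dataset` stage; `xdr_process` and `network_story` are typical preset names, but verify them against the preset list in your own tenant:

```
preset = xdr_process
| limit 10
```

Swapping in `preset = network_story` would query the stitched network schema in the same way.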
XQL Structure
The XQL structure is integral to understanding how to write efficient queries. When crafting queries within Cortex XDR, the following components are crucial:
Query Development Environment
The XQL coding occurs within a designated development area, often referred to as the code editor. Here you can define your queries, set parameters, and view results.
XQL Syntax
The syntax of XQL is fairly straightforward. You will primarily deal with:
- Fields: These are the specific data points you seek to analyze.
- Filters: Conditions that refine your search to yield more precise results.
- Stages: Different phases where you can shape your query. For instance, defining your data source as a data set or preset.
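A minimal sketch tying these components together, assuming the built-in `xdr_data` data set and two of its documented fields (`agent_hostname` and `action_process_image_name`):

```
dataset = xdr_data
| filter action_process_image_name = "powershell.exe"
| fields agent_hostname, action_process_image_name
| limit 100
```

The `dataset` stage picks the source, `filter` narrows the events, and `fields` keeps only the columns of interest.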
Incorporating Data into Your Queries
To effectively utilize data sources in XQL, consider the following:
- Always specify the data set or preset from which you are querying, unless you are relying on the default data set.
- Utilize the schema viewer in the code editor to reference the fields available in your chosen data set or preset.
Demos and Practical Examples
Having a theoretical foundation is important, but practical implementation drives the learning process. Here are some demo examples:
Example 1: Querying a Data Set
- Open the XQL code editor.
- Type a query referencing the specific data set (e.g., `dataset = xdr_data`).
- Execute the query to view results.
Example 2: Saving Query Results to a User Data Set
- Define your query to select specific data.
- Add a `target type = dataset` directive.
- Execute the query; the results will be saved to the user-defined data set.
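A hedged sketch of such a query; the data set name `my_process_summary` is made up for the example, and the exact `target` stage syntax may vary slightly between product versions, so check the stage reference in the documentation:

```
dataset = xdr_data
| filter action_process_image_name != null
| fields agent_hostname, action_process_image_name
| target type = dataset my_process_summary
```

After execution, `my_process_summary` should appear under Dataset Management with the type user.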
Example 3: Utilizing a Preset
- Begin with a question: “What processes were executed during a specific timeframe?”
- Start your query with a relevant preset (e.g., `preset = xdr_process`).
- Fetch results rapidly due to the focused field selection.
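A sketch of the process question above, assuming the `xdr_process` preset and field names drawn from the larger `xdr_data` schema (verify both in the schema viewer):

```
preset = xdr_process
| filter action_process_image_name = "cmd.exe"
| fields _time, agent_hostname, action_process_image_command_line
```

The time frame itself is set with the time-range picker above the code editor rather than inside the query.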
Conclusion
Understanding XQL's data sources, structure, and syntax is pivotal for effective data analysis in Cortex XDR. By leveraging both data sets and presets, analysts can optimize their queries, improving efficiency and obtaining relevant insights quickly. As you continue to practice with the code editor and experiment with various queries, you'll enhance your skill in navigating the complexities of cybersecurity data analysis. Stay updated with the latest documentation to keep your knowledge current. Happy querying!
Let's start with the second part of our training: the XQL building blocks. In this section we are going to talk about the XQL data sources, the XQL structure, the XQL syntax, and the XQL schema, with demos along the way.

Starting with data sources: as we explained at the very beginning, each query runs against some sort of data source. Cortex XDR provides a variety of data sources that are mainly categorized into data sets and presets, the two main types we saw in the XQL development environment. In general, when we start thinking about a query and building our use case, the first thing to decide is which data source we are going to use: a data set or a preset, as we see here.
After that, we start defining the query, or defining the stages. This is where we shape the query: we start using the fields, setting the filters, and applying the different stages that we will cover in detail. For our current step, defining the data source, we have the two main options, the data set and the preset, and we will cover the data set types as well as the preset types. For now, there are two kinds of preset we can differentiate between: the regular preset, which is essentially a group of fields from a larger data set, and the story preset, which stitches the data set together with other data sources.
Let's look at the data sources and the types of information in each. We have different data types for building queries; this is just an example, and there are many more we can utilize. For instance, we can get data from the XDR agent, which we will cover in detail: information about files, processes, network, and registry, as well as Windows event logs collected natively by the XDR agent. We can also ingest additional Windows event logs, and, at a high level, we can collect logs from other sources such as the NGFWs or from third-party sources; we will get into those in detail. Now let's look at the basic structure of Cortex XDR and the Cortex XDR data layer.
The data layer is where all the data is stored. First is the endpoint data, the data we get from the endpoint itself; that is our first source, the Cortex XDR agent. Parallel to that are the network sources: IoT Security, the Palo Alto Networks next-generation firewalls, Palo Alto Networks Prisma Access, and GlobalProtect, as well as other third-party sources once we start using the Syslog collectors, NetFlow, and so on; we are going to talk about those.
we can get are the custom data sets that we are going to create we can create either a user data sets those are the
one that we use by utilizing the target stage also we can use an imported data to create something called lookup data
set when we import the Json and CSV files this is also something that we will see in details and how we can do
that and how this is going to look like in the system after we do this to expand more on how we collect the data
Let's expand on how we collect the data, especially the third-party data, because the first part, the XDR agent, is easy and straightforward: that is data coming from the agent itself. The NGFWs, IoT Security, and all the other Palo Alto Networks sources are also easy to see in terms of how they are collected. For third-party data, we provide a lot of collection capabilities, starting with the Broker VM. The Broker VM has great data-collection capability; we can ingest that data and start using it in XDR, with a different data set for each type of data we collect. Within the Broker VM we have what we call applets, and each applet does a specific job of collecting a specific type of data. For example, we have a Syslog collector under the Syslog applet for collecting Syslog data, a NetFlow collector, a Windows Event Collector for Windows event logs, an FTP collector, a CSV collector, a database collector, a file and folder collector, and many more. These are continually updated and enhanced with every product update, so please always refer to the documentation for the latest list. That is one of the ways to get third-party data ingested into the XDR data layer.
The other method is the Elasticsearch Filebeat agent. Under Data Collection there is an option called Custom Collectors; the first option is Filebeat, for Filebeat itself, and there is also the HTTP log collector. We can expand on these as we proceed with the training.
Another method is under Data Collection, where there is an option called Collection Integrations. These are the SaaS log collections that have ready-made applets: you just input the configuration and start connecting directly to those applications. These integrations are also continually enhanced, with more added in every product update, so for these as well, please consult the documentation for the updated list. That is another way for third-party data collection.
The other method, which gives you a lot of flexibility and scalability and which we also recommend, is the XDR Collector. Its capabilities are continually enhanced as well; recently, for example, we got an update that gives you ready-made templates, so you can just use a template to collect a specific type of data, such as DHCP logs or Windows event logs, and you can stack those templates. You will find it under XDR Collectors, as you see on the screen, where you will see the options to add an XDR Collector. To begin, you need an installer, which you distribute to the endpoints or the collection points where you need to do the collection. Then you start creating the profiles and policies and, if needed, a group to serve as the target for the policies. Once you configure the profile, you determine which template it is going to use to start collecting the data, for example for Filebeat. After that, the XDR Collector does the job: it collects the data and maintains the Filebeat configuration for us, so there is no more administrative overhead in maintaining the configuration on the collection point itself. The XDR Collector does that for you, and it is easily maintained and administered through the XDR Collectors administration page, where you can see each collector, its status, and so on.
Moving on, let's talk about the XQL structure, starting with how we access XQL. Under Incident Response > Investigation > Query Builder, we go ahead and access the XQL Search button, and we see the XQL development area. You are presented with a white box where you write the XQL query; this is what we call the XQL development area, or the XQL code editor. You will see the predefined time periods, 24 hours, 7 days, and 1 month, or Custom, where you go ahead and define the exact start and end date and time for your search time frame. Then you will see the XQL options, starting with the query results, where you see the main table of results, and the XQL Helper, where you get more help on the specific syntax you are looking for; we will see an example of this. Then there is the Query Library, where you can use the ready-made queries published by our research team, the queries shared by other team members, or the queries you created yourself and either shared or kept in your personal library. Next is the Schema tab, for the specific data set in question, and then the Save As options, whether we are saving as a widget or as a BIOC rule.
Next, let's talk about some differences between data sets and presets. We defined them at a very high level before; now let's take a deeper look into what a data set is and what a preset is. First, the data set: this is the native or custom set of data stored in XDR. Data sets contain the raw XDR events reported by the XDR agent, as well as logs from different sources such as third-party sources, the Palo Alto Networks NGFWs, and Prisma, plus the custom data sets we are going to talk about. If we look at the right side of the screenshot, we see the types of data sets; the type is mainly defined by how the data set is created. The system data sets are the built-in data sets that come out of the box with the product; there is nothing you need to create yourself, they are created for you and ready to be used. The other type is the lookup type: when we import a CSV, TSV, or JSON file, the data set type for that imported file is lookup. When we use the target stage to save the results of a query we wrote into another data set, the data set we save the query results into is a user data set, because we used the target stage for that one. Another type is the raw data set: third-party logs, as well as sources such as the NGFW, Prisma, and IoT Security, are ingested as raw logs. The last type is correlation. When we configure a correlation rule, there are two options: either generate an alert or save the results into a data set. If we choose to save the results into a data set, that data set will have the type correlation, and under the Type column you will see that the specific data set chosen in the correlation rule configuration is saved under type correlation. We will talk about some of these in more detail.
So for this one we are looking at the lookup type. Again, this is when we import a file, a CSV, TSV, or JSON file, into a data set; that data set is going to have the type lookup. In the example here, this is the specific data set we imported. Within the configuration, when we click on Lookup, we get this popup screen; the name we define there is going to be the name of the data set, as we see right there. Then we go ahead and import the file; you can see the file that has been imported, and then we click Add. Once we click Add, we wait for the notification panel to show us a notification, as we see at the bottom of the screen, that says the lookup was uploaded; it gives you the specific name of the data set we just configured and tells you it has been uploaded successfully. If you see an error here, that means there is something wrong with the naming convention of the fields or with the data, so please go fix it; the documentation has a detailed list of what is and is not allowed in the naming, so please make sure the naming follows the standard. After the data set is imported, you will see it right there, and once you get the success message from the notification you can start using the data set. You can go to the XQL code editor area and just type dataset = with the data set name you created; you will see the schema for the data set, you will see the data within it, and you can perform the normal operations you would perform on any data set, such as join and union, and you can append to it as well.
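As an illustrative sketch of using such a lookup in a join, where the lookup name `my_lookup` and its field `bad_ip` are invented for the example, and the `join` stage syntax should be checked against the stage documentation for your version:

```
dataset = xdr_data
| join type = inner (dataset = my_lookup) as lk lk.bad_ip = action_remote_ip
```

This would keep only the endpoint events whose remote IP appears in the imported lookup.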
The second type we are going to talk about here is the user type, which we get when we use the target stage. Here we have a specific query whose results are going to be saved into another data set. If we look at line number 13, it says target type = dataset, then appends to a data set with the given name; you can give it whatever name you want. What does that mean? It means: take the results of this query and save them into a brand-new data set. Once this query runs and executes, you will see a new data set right there whose type, as you will see, is user, with the same exact name you defined in the configuration using the target stage. Again, that data set can then be used for the normal operations you would do on any other data set: you type the data set name, open the schema, and you see the fields right there, one, two, three, four, however many there are. Notice that those fields are the fields you ended up with in your query, the ones you defined and filtered on; whatever your final set of fields was, that is the structure of the schema for the new data set you just saved. So if you go to the data set and click on the schema, or go to the configuration under Dataset Management, right-click the data set, and choose View Schema, you will see the fields you defined in the query. If you need to make any changes, you can go back to the original query, make the changes, and execute it again, and it will append the results to that data set.
Next are the straightforward ones: the system-type data sets. Again, as we mentioned, these are the built-in data sets that come with the product, so there is nothing you need to do in order to see them; they are there for you. You will see them with type system: things like xdr_data, the forensics data sets, the host inventory data set, and the endpoints data set all show as type system.
Now, the third-party logs: this is the raw type. If you see raw under the Type column, it means the data was ingested into XDR from something other than the agent. For example, here we have the Palo Alto Networks NGFW traffic_raw data set. Pausing here for a second to talk about the naming convention for Palo Alto Networks products, especially the NGFW: the name starts with the vendor, Palo Alto Networks, then the product, then the specifics of the product, which is the subtype, then raw; for example, Palo Alto Networks NGFW url, threat, or traffic, followed by _raw. For other vendors, the default naming convention follows vendor_product_raw, so you will see raw at the end of the default name. The same goes for the type: the first part of the naming convention, as you see in this example, is the vendor, followed by the product.
The next type to talk about here is the correlation rule data. If we take a quick look at this example, it is a correlation rule that stored its data into a data set instead of generating alerts, and the name in the configuration for that correlation rule was correlation_port_scan; that is why you see the type correlation here. Whenever that rule executes, the data is added to the correlation-type data set named correlation_port_scan. Similarly to all these data sets, we just go to the XQL code editor, type dataset = followed by the data set name, and we start seeing the data within that specific data set. We will have a quick demo showing an example of this.
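Querying such a correlation-type data set works like any other; `correlation_port_scan` here is simply the example name used in the discussion:

```
dataset = correlation_port_scan
| limit 10
```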
Before we conclude the data set topic, let's have a quick example of two queries that return exactly the same results, but with a line missing from one of them. You can pause the video for a second and try to find what is missing, so we can talk about why that is. As you may have noticed, the dataset line, line number two, exists in the first query, but the second query does not have line two. So we have a query running without a data set? But hold on, weren't we saying that the data set is very important, the first thing in your query you need to think of, and that no query runs without a data set, or a data source? So how is this query running? That is a very good question: you don't need to mention the data set when you are using the default one. Let me show that in the configuration, under Dataset Management. Once you open that page, you will see the data set names and a column that says Default, among other columns; if you do not see the Default column, click on the three dots and make sure the Default column is checked. By default, xdr_data is the default data set that comes out of the box. If you want to change that, you definitely have the option to, if you have a use case for it, but by default it is xdr_data. Going back to our query: if the data set is not mentioned in the query, the system runs your query against the default data set. So just make sure that if you are not running against the default data set, you define the data set in the query; otherwise, the query runs against the default data set.
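To illustrate the default behavior, assuming `xdr_data` is still the default data set and using its documented `agent_hostname` field:

```
dataset = xdr_data
| fields agent_hostname
| limit 5
```

Removing the first line should return the same results, because the query then falls back to the default data set.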
Now let's talk about what a preset is. We talked about data sets and the types of data sets, and we showed some examples; now the preset. Simply put, a preset is a subset of a data set: tables built from fields extracted from other data tables, a group of fields useful for that specific type of preset. The fields within the preset are also available in the larger data set. If we take the xdr_file preset, we see around 58 fields, and those 58 fields are also available in the larger data set, xdr_data, which has around 940 fields. The benefit of using the file preset with 58 fields instead of xdr_data with 940 fields is efficiency for the user: your query runs faster, you get the information you are looking for faster, and you eliminate the information you do not need for that specific operation. If I am looking for process execution, I need to see everything about the action process, the actor process, and the causality process; I do not need to see, for example, something related to network traffic, at least for the time being. So it is a good idea to start with your preset first; if you get the information you need, perfect, and if not, you can move to the larger data set. With presets, we mainly have two types. The regular presets are for things like event logs; they are simply groups of fields from the larger data set: image load, network, process execution, registry, and file operations. The other type is the story preset. If we go to the XQL code editor and write preset =, you will see two story types: the authentication story and the network story. The story-type presets stitch logs and events together into a common schema. For example, the network story contains fields from both the NGFWs and the XDR agents, so if you have a use case that needs stitched logs, the network story can help you, and likewise the authentication story.
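A sketch of a story-preset query; `network_story` is one of the two story presets mentioned above:

```
preset = network_story
| limit 10
```

The returned rows combine NGFW logs and XDR agent network events mapped into one common schema.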