Splunk Knowledge
Notes taken from official Splunk trainings, serving as a quick reference for Splunk Search commands and tricks.
Splunk Brief¶
What Splunk is good for¶
Monitoring, alerting, debugging issues and outages, generating reports and insights.
Its workflow is roughly: Index Data -> Search & Investigate -> Add Knowledge -> Monitor & Alert -> Report & Analyze
Splunk data sources can include: computers, network devices, virtual machines, Internet devices, communication devices, sensors, databases, logs, configurations, messages, call detail records, clickstreams, alerts, metrics, scripts, changes, tickets, and more.
Splunk Modules¶
- Indexer - processes incoming data and stores the results in indexes as events
  - creates files organized in sets of directories by age
  - when searching, Splunk only opens the directories that match the time frame of the search
- Search Heads - allow users to search the data using the Splunk Search Language
  - take a search, divide it up, distribute the work to the indexers, and aggregate the results returned
  - also serve tools like dashboards, alerts, and reports
- Forwarders - Splunk Enterprise components that consume data and forward it to the indexers for processing and storage
  - usually reside on the machines where the data originates and require negligible resources
  - the primary way data is supplied for indexing
For a large number of data sources, it is better to split the Splunk installation into multiple specialized instances: multiple Search Heads and Indexers (which can form clusters), and many Forwarders.
Read More: https://splunkbase.splunk.com/ and http://dev.splunk.com/
Splunk User Types¶
- Admin User - access to install apps, create knowledge objects for all users
- Power User - access to create and share knowledge objects for users of an app and run real-time searches
- User - only see their own knowledge objects and those shared with them
  - knowledge objects created by a User must be shared in-app by a Power User before other Users can see them
Get Data into Splunk¶
- An Admin User can upload data files through the web UI
- Configure Splunk to monitor some files locally on the machine
- Event Logs, File System changes, Active Directory, Network Info
- (Main) Receive data from external Forwarders
- Read more: https://docs.splunk.com/Documentation/Splunk/latest/Data/Usingforwardingagents
How Data is stored¶
Splunk stores data in indexes, which hold data in groups of buckets.
Bucketing¶
Data is stored in Hot buckets as it arrives, moves to Warm buckets as time elapses, and finally to Cold buckets. Eventually it goes to Frozen buckets for deletion or archiving.
Each bucket contains:
- a journal.gz file, where Splunk stores raw event data, composed of many smaller compressed slices (each about 128 KB)
- time-series index (.tsidx) files, which serve as index keys into the journal file
  - Splunk uses these to know which slice to open up and search for events
  - created from raw events: each event is tokenized, and the tokens are written out to a lexicon that exists inside each tsidx file
  - each lexicon term has a posting array that holds the locations of the matching events in the journal file
Bloom Filters¶
A Bloom Filter is created from a bucket's tsidx files when the bucket rolls over from Hot to Warm
- it is a data structure for quick elimination of data that doesn't match a search
- it allows Splunk to AVOID accessing disk unless really necessary
- when creating a Bloom Filter, each term in the bucket's tsidx lexicons is run through a hashing algorithm
- the resulting hash sets the corresponding bits in the Bloom Filter
- when a search is run, Splunk generates a Bloom Filter from the search terms and compares it with the buckets' Bloom Filters, quickly ruling out irrelevant buckets
Segmentation¶
The process of tokenizing search terms at search-time. Two stages:
- splitting events up by finding major/minor breaker characters
  - major breakers are used to isolate words, phrases, terms, and numerical data
    - example major breakers: space \n \r \t [] ! ,
  - minor breakers go through the results of the first pass and break them up further in the second pass
    - example minor breakers: / : . - $
- the tokens become part of the tsidx files' lexicons at index time and are used to build Bloom Filters
Read more from doc page
Inside a Search¶
Take this search as an example:
index=security failed user=root
| timechart count span=1h
| stats avg(count) as HourlyAverage
- Splunk extracts 'failed' and 'root' and creates a Bloom Filter from them
- Splunk then identifies buckets in the security index for the search's time range (e.g. the past 24h)
- Splunk compares the search's Bloom Filter with those of the buckets
- Splunk rules out tsidx files that can't have matches
- Splunk checks the tsidx files from the selected buckets for the extracted terms
- Splunk uses the terms to go through each posting list and obtain seek addresses into the journal file for the raw events
- Splunk extracts search-time fields from the raw events
- Splunk filters the events that contain our search terms 'failed' and 'root'
- The remaining events are the search results
Inspect a Search¶
Within the Job Inspector:
- command.search.index - time to get location info from the tsidx files
- command.search.rawdata - time to extract event data from the gzip journal files
- command.search.kv - time to perform search-time field extractions
- command.search.filter - time to filter out non-matching events
- command.search.log - look for the phrase "base lispy", which contains the Lispy expression created from the query; it is used to build the Bloom Filter and locate terms in the tsidx files
Search and Report App¶
The Search & Report app allows you to search and analyze data through creating knowledge objects, reports, dashboards and more.
Search UI¶
The Events tab displays the events returned for your search and the fields extracted from the events.
The Patterns tab allows you to see patterns in your data.
If your search generates statistics or visuals, they appear in the Statistics and Visualization tabs. Commands that create statistics and visualizations are called transforming commands; they transform data into data tables.
By default, a search job stays active for 10 minutes; after that the search needs to be rerun.
A shared search job stays active for 7 days, and anyone you share it with sees the results from when the search was first run.
From the web UI, use Search & Reporting -> Search History to quickly view recent searches and reuse them if possible.
Also from the web UI, use Activity -> Jobs to view recent search jobs (with data), then use Job -> Inspect Job to view the query performance.
Search Fields¶
Search fields are key=value strings in the search query; they are automatically extracted and available for use by commands to create insights. The key is case-sensitive but the value is not.
Three fields that are ALWAYS available are host, source, and sourcetype. Interesting fields are those that appear in at least 20% of the events.
Among the extracted fields, a denotes a string value and # denotes a numeric value.
In the search, = and != can be used on numeric and string values; > >= < <= can be used on numeric values only.
Tags¶
Tags are Knowledge Objects that allow you to designate descriptive names for key-value pairs.
They are useful when multiple fields in your logs may carry the same value(s) you want to filter on: tag those field-value pairs with the same tag name, and future searches can filter on the tag name instead.
Tags are value-specific and case-sensitive. When you tag a key-value pair you are associating the tag with that specific pair; when the field appears with another value, it is not tagged with this tag name.
Do NOT tag a field that has unbounded possible values.
Steps: in a Splunk search, click on an event to see the list of key=value pairs, and in the Actions dropdown select Edit Tags to add new tags (comma separated).
To search with tags, enter tag=<tag_name>. For example, index=security tag=SF matches all events with any field value tagged SF. To limit a tag to a specific field, use tag::<field>=<tag_name>
Read more about tags in doc page
Event Type¶
An Event Type saves a search's definition so it can be reused in other searches to categorize and highlight matching events. It is good for categorizing events based on search terms. The priority setting affects the display order of returned results.
Steps: after a search, save it as an Event Type and use it in another search with eventtype. For example, index=web sourcetype=access_combined eventtype=purchase_strategy
Knowledge Objects¶
Knowledge Objects are tools that help discover and analyze data; they include data interpretation, classification, enrichment, normalization, and search-time mapping of knowledge (Data Models).
Saved Searches, Reports, Alerts, and Dashboards are all Knowledge Objects.
Field Extractor¶
The Field Extractor provides a GUI to extract fields that persist as Knowledge Objects and are reusable in searches.
It is useful when the information in events is poorly structured (not in a key=value fashion), such as natural-language statements that still contain useful information in a specific pattern.
Extracted fields are specific to a host, source, or sourcetype and are persistent. A field extractor can work in two ways: regex and delimiters.
- regex: uses regular expressions; works well for extracting fields from unstructured data and events
- delimiters: splits on a delimiter such as a comma, space, or any character; use it for data like CSV files
Three ways to access a field extractor:
- Settings -> Fields
- the fields sidebar during a search
- an event's actions menu (easiest)
Read more from the docs page for Splunk regex.
Field Aliases¶
Field Aliases give you a way to normalize data over multiple sources.
For example, when aggregating results from multiple sources or indexes, the data may have fields with different names such as url, uurl, dest_url, service_url, etc. that all refer to the same thing. This tool is handy there; see the sketch after the steps below.
You can create one or more aliases for any extracted field and can apply them to lookups.
- Steps: Settings -> Fields -> Field aliases (add new)
- set destination app
- name (choose a meaningful name for the new field)
- apply to (sourcetype, source, or host) and its matching value
- Map fieldName=newFieldName
- Map multiple fields by adding another field
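As a hedged sketch of the effect (assuming an alias named dest_url has been applied to vendor-specific fields such as url and service_url), searches no longer need to enumerate every variant:
# without the alias, every variant must be listed
index=web (url=* OR service_url=*)
| stats count by clientip
# with the alias applied to each sourcetype, one normalized field name works everywhere
index=web dest_url=*
| stats count by dest_url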
Calculated Field¶
Calculated Field saves frequently used calculation you would do in searches with complex eval statements and it is like a shorthand for a function made of SPL commands on extracted fields.
For example, you might want to save something to do conversion between bytes to megabytes, or encoding some field to sha256, etc.
- Steps: Settings -> Fields -> Calculated fields (add new)
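For instance, a minimal sketch (the bytes field and the megabytes name are assumptions for illustration): a calculated field replaces an eval you would otherwise repeat in every search.
# the repeated calculation a calculated field can replace
index=web sourcetype=access_combined
| eval megabytes=round(bytes/1024/1024, 2)
| stats sum(megabytes) by host
# after saving round(bytes/1024/1024, 2) as a calculated field named megabytes,
# it can be used directly as if it were an extracted field
index=web sourcetype=access_combined
| stats sum(megabytes) by host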
Search Macros¶
Macros are reusable search strings or portions of search strings. It is useful for frequent searches with complicated search syntax and can store entire search strings.
You can pass arguments (fields or constants) to the search.
Steps: Settings -> Advanced Search -> Search macros (add new)
Then use it in the search by surrounding the macro name with backticks. For example: index=sales sourcetype=vendor_sales | stats sum(sale_price) as total_sales by Vendor | `convertUSD`
Within a search, use ctrl-shift-E or cmd-shift-E to preview and expand the macro without running it.
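As a hedged sketch of passing arguments (this macro definition is hypothetical): suppose a macro convertUSD(1) is defined with argument rate and definition eval USD_sales=total_sales*$rate$; it could then be invoked as:
index=sales sourcetype=vendor_sales
| stats sum(sale_price) as total_sales by Vendor
| `convertUSD(0.79)`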
Datasets¶
Datasets are small subsets of data for specific purposes, defined like tables with fields names as columns and fields values as cells.
Lookups¶
Lookups are in fact datasets that allow you to add useful values to your events not included in the indexed data.
Field values from a Lookup are case sensitive by default
Set it up in Splunk UI -> Settings -> Lookups -> Define a lookup table
Use | inputlookup <lookup_definition_name>
to verify lookup is setup correctly.
Additionally, you can populate lookup table with search results or from external script or linux executable. Read more about external lookups and KVstore lookups
Some tips for using lookups:
- in lookup table, keep the key field the first column
- put lookups at the end of the search
- if a lookup usage is very common in search, make it automatic
- use KV store or gzip CSVs for large lookup table
- only put data being used in lookup table
- the outputlookup command can write the results of a search out to a static lookup table (see the sketch below)
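A minimal sketch of the last tip, assuming a hypothetical lookup file named error_clients.csv:
index=web sourcetype=access_combined status>=500
| stats count by clientip
| outputlookup error_clients.csv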
Pivot¶
Pivot allows users to extract knowledge from the data without learning SPL in depth; it is a UI-based tool that helps users create useful searches. It is like a tool created by knowledgeable Splunk users to serve users who don't know much about Splunk.
Data Models¶
Data Models are made of Datasets; they are Knowledge Objects that provide the data structure that drives Pivots. They can also be used to significantly speed up searches, reports, and dashboards.
Data Models must be created by Admins or Power Users and are hierarchically structured datasets: start from a Root Object (a search with an index and sourcetype), then add Child Objects (filters or constraints).
Data Models are good for accelerating searches. Searching a data model is like searching cached events that get updated regularly.
Steps: Settings -> Data Models -> New Data Model
Then search the datamodel with | datamodel keywords
Read more from the doc page
Splunk Search Language (SPL)¶
SPL Components¶
- Search Terms - define what data to return
  - e.g. host=xxx status>399
  - also include Boolean operators (NOT, OR, AND, in order of evaluation precedence)
- Commands - process search results: create charts, compute statistics, and format results
  - e.g. eval in eval foo=1
- Functions - supply the conditions and algorithms Commands use to compute and evaluate results
  - e.g. avg in stats avg(latency) by host
- Arguments - the variables we apply to Functions
  - e.g. latency in stats avg(latency) by host
- Clauses - how we want the results grouped
  - e.g. by in stats avg(latency) by host
Search Terms, Command names, and Function names are NOT case-sensitive, with the exception of the replace command, where the search term must be an exact match.
Search Terms¶
- Keywords, any keywords can be searched on; "AND" boolean is implied for multiple keywords
- Booleans, AND OR NOT
- Phrases, keywords surrounded by ""
- Fields, key=value pair search
- Wildcards, used in keywords and fields to match any chars; inefficient if used at the beginning of a word
- Comparisons, = != < <= > >=
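A hedged example that combines several of these term types (index, fields, and values reused from examples elsewhere in these notes):
index=web sourcetype=access_combined status>=400 action=purchase "Dream Crusher" NOT clientip=10.*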
Subsearch¶
Subsearch is a search enclosed in brackets that passes its results to its outer search as search terms.
The return command can be used in a subsearch to limit how many values are returned, e.g. | return src_ip. The field name is included by default; to return values without the field name, prefix the field with $, e.g. | return 5 $src_ip
Example
index=security sourcetype=linux_secure "accepted"
[search index=security sourcetype=linux_secure "failed password" src_ip!=10.0.0.0/8
| stats count by src_ip
| where count > 10
| fields src_ip]
# when the subsearch returns, Splunk implicitly adds "OR" between each pair of results
# the whole result is then combined with the outer search with an implicit "AND" operator
| dedup src_ip, user
| table src_ip, user
# subsearch with 'return' command
index=sales sourcetype=vendor_sales
[ search index=sales sourcetype=vendor_sales VendorID<=2000 product_name=Mediocre*
| top limit=5 Vendor
| return 5 Vendor ]
| rare limit=1 product_name by Vendor
| fields - percent, count
Some limitations of subsearches:
- default time limit of 60 seconds to execute
- when the time limit is reached, the result is finalized and returned to the outer search
- returns up to 10,000 results by default
- if the outer search is executed in real time, its subsearch is executed over all time
- best to use earliest and latest time modifiers in the subsearch
SPL Commands¶
Transforming vs. Streaming Commands
- transforming commands operate on the entire result set of data
  - execute on the Search Head and wait for the full set of results
  - change event data into results once the set is complete
  - example commands: stats, timechart, chart, top, rare
- streaming commands come in two types
  - centralized (aka "stateful streaming commands")
    - execute on the Search Head
    - apply transformations to each event returned by a search
    - result order depends on the order in which the data came in
    - example commands: transaction, streamstats
  - distributable
    - execute on the Search Head or the Indexers
    - the order of events does not matter
    - when preceded by commands that have to run on a Search Head, ALL will run on the Search Head
    - example commands: rename, eval, fields, regex
Read more from doc page
Streaming Commands¶
append¶
Add results by appending subsearch results to the end of the main search results as additional rows. Does NOT work with real-time searches.
It is useful for drawing data from two searches on the same visualization. However, rows are NOT combined based on the values of shared fields.
Example
index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append
[search index=security sourcetype=linux_secure "failed password" "invalid user"
| timechart count as unknown_users]
# Final result data may look awkward as rows are NOT combined based on the values in some of the fields.
# Fix it with the `first` Function - return the first value of a field by the order it appears in the results.
| timechart first(*) as * # apply first Function on all fields and keep their original name
appendcols¶
Add results of a subsearch to the right of the main search's results. It also attempts to match rows by some common field values, but it is not always possible.
Example
index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| appendcols
[search index=security sourcetype=linux_secure "failed password" "invalid user"
| timechart count as unknown_users]
# "unknown_users" is added as a new column to the outter search's results and matches the _time field
appendpipe¶
Append subpipeline search data as events to your search results. Appended search is NOT run until the command is reached.
appendpipe is NOT an additional search, but merely a pipeline over the previous results: it acts on the data received and appends new events while keeping the data before the pipe. It is useful for adding summary rows to results.
Example
index=network sourcetype=cisco_wsa_squid (usage=Borderline OR usage=Violation)
| stats count by usage, cs_username
| appendpipe
[stats sum(count) as count by usage
| eval cs_username= "TOTAL: ".usage]
| appendpipe
[search cs_username=Total*
| stats sum(count) as count
| eval cs_username = "GRAND TOTAL"]
| sort usage, count
addtotals¶
Compute the SUM of ALL numeric fields for each event (table row) and create a total column (default behavior).
You can override the default behavior and have it calculate column totals instead; see the example.
Example
# adds a column to calculate the totals of all numeric fields per row
index=web sourcetype=access_combined
| chart sum(bytes) over host by file
| addtotals fieldname="Total by host"
# adds a row to calculate the totals of one column
index=web sourcetype=access_combined
| chart sum(bytes) as sum over host
| addtotals col=true label="Total" labelfield=sum
bin¶
Adjust and group numerical values into discrete sets (bins) so that all the items in a bin share the same value.
bin overrides the values of the field provided. Use a copy of the field if you need the original values in subsequent pipes.
Example
index=sales sourcetype=vendor_sales
| stats sum(price) as totalSales by product_name
| bin totalSales span=10 # span is the bin size to group on
| stats list(product_name) by totalSales
| eval totalSales = "$".totalSales
dedup¶
Remove duplicated events from results that share common values from the fields specified.
Example
index=security sourcetype=history* Address_Description="San Francisco"
| dedup Username # can be a single field or multiple fields
| table Username First_Name Last_Name
erex¶
Works like an automatic field extractor, but has the same shortcomings as the regex field extractor in the Splunk UI: it only looks for matches based on the samples provided. Better to use the rex command.
Example
index=security sourcetype=linux_secure "port"
| erex ipaddress examples="74.125.19.106, 88.12.32.208"
| table ipaddress, src_port
eval¶
Calculate and manipulate field values in many powerful ways and also works with multi-value fields. Results can create new fields (case-sensitive) or override existing fields.
Multiple field expressions can be calculated in the same eval command; they are created/altered one after another, so the order of the expressions matters.
Besides the many Functions it supports, eval supports operators including arithmetic, concatenation (using .), and boolean.
Example
index=network sourcetype=cisco_wsa_squid
| stats sum(sc_bytes) as Bytes by usage
| eval bandwidth = round(Bytes/1024/1024, 2)
# eval with if conditions and cases
| eval VendorTerritory=if(VendorId<4000, "North America", "Other")
| eval httpCategory=case(status>=200 AND status<300, "Success", status>=300 AND status<400, "Redirect", status>=400 AND status<500, "Client Error", status>=500, "Server Error", true(), "default catch all other cases")
# eval can be used as a Function with stats command to apply conditions
index=web sourcetype=access_combined
| stats count(eval(status<300)) as "Success", ...
Supported Functions see docs page
eventstats¶
Generates statistics for fields in searched events and saves them as new fields in the results. Each event gets the SAME value for the calculated statistics.
Example
index=web sourcetype=access_combined action=remove
| chart sum(price) as lostSales by product_name
| eventstats avg(lostSales) as averageLoss
| where lostSales > averageLoss
| fields - averageLoss
| sort -lostSales
Supported Functions see docs page
See also streamstats
command.
fieldformat¶
Format values WITHOUT changing the underlying values. In other words, the format only applies when the value is displayed in output, and the field can still participate in calculations as if its value were unchanged.
Example
index=web sourcetype=access_combined product_name=* action=purchase
| stats sum(price) as total_list_price, sum(sale_price) as total_sale_price by product_name
| fieldformat total_list_price = "$" + tostring(total_list_price)
Supported Functions see docs page. Specifically look for Conversion Functions and Text Functions.
fields¶
Include or exclude fields from search results to limit the fields to display and also make search run faster.
Field extraction is one of the most costly parts of searching in Splunk. Eliminating unnecessary fields improves search speed, since field inclusion occurs BEFORE field extraction, while field exclusion happens AFTER field extraction.
Internal fields like _raw and _time are ALWAYS extracted, but they can also be removed from the search results using fields.
Example
index=web sourcetype=access_combined
| fields status clientip # only include fields status and clientip
index=web sourcetype=access_combined
| fields - status clientip # to exclude fields status and clientip
fillnull¶
Replaces any null values in your events. It fills NULL with 0 by default; use value="something else" to change the fill value.
Example
index=sales sourcetype=vendor_sales
| chart sum(price) over product_name by VendorCountry
| fillnull
foreach¶
Run a templated subsearch for each field in a specified list; each row's values go through the foreach template's calculation.
Example
index=web sourcetype=access_combined
| timechart span=1h sum(bytes) as totalBytes by host
| foreach www* [eval '<<FIELD>>' = round('<<FIELD>>'/(1024*1024),2)]
# '<<FIELD>>' is replaced with each matching field name
# assume 'host' gives values www1 www2 www3
join¶
Combine results of a subsearch with the outer search using common fields. It requires both searches to share at least one field in common.
Two types of join:
- inner join (default) - only returns results from the outer search that have matches in the subsearch
- outer/left join - returns all results from the outer search, including those without matches in the subsearch; can be specified with the type argument
Example
index="security" sourcetype=linux_secure src_ip=10.0.0.0/8 "Failed"
| join src_ip
[search index=security sourcetype=winauthentication_security bcg_ip=*
| dedup bcg_workstation
| rename bcg_ip as src_ip
| fields src_ip, bcg_workstation]
| stats count as access_attempts by user, dest, bcg_workstation
| where access_attempts > 40
| sort - access_attempts
See also union
lookup¶
Invoke field value lookups.
Example
# http_status is the name of the lookup definition
# code is one of the columns in the csv, and we have status in our event
# by default, all fields in the lookup table are returned except the input fields
# specify OUTPUT to choose the fields added from the lookup
index=web sourcetype=access_combined NOT status=200
| lookup http_status code as status,
OUTPUT code as "HTTP Code",
description as "HTTP Description"
makemv¶
Create a multi-value field from an existing field and replace that field. It splits its values using some delimiter or regex.
Example
index=sales sourcetype=vendor_sales
| eval account_codes=productId
| makemv delim="-", account_codes
| dedup product_name
| table product_name, productId, account_codes
# same results can be achieved by using regex and specifying capture groups
| makemv tokenizer="([A-Z]{2}|[A-Z]{1}[0-9]{2})", account_codes
multikv¶
Extract field values from data formatted as large and single events of tabular data. Each row becomes an event, header becomes the field names, and tabular data becomes values of the fields.
Example of some data event suitable:
Name Age Occupation
Josh 42 SoftwareEngineer
Francine 35 CEO
Samantha 22 ProjectManager
Example
index=main sourcetype=netstat
| multikv fields LocalAddress State filter LISTEN
| rex field=LocalAddress ":(?P<Listening_Port>\d+)$"
| dedup Listening_Port
| table Listening_Port
mvexpand¶
Create a separate event for each value contained in a multi-value field. It creates as many new events as there are values in the multi-value field and copies all other field values to the new events.
Best to remove unused fields before the mvexpand command.
Example
index=systems sourcetype=system_info
| mvexpand ROOT_USERS
| dedup SYSTEM ROOT_USERS
| stats list(SYSTEM) by ROOT_USERS
rename¶
Rename one or more fields. After the rename, subsequent commands must use the new names; otherwise the operation won't have any effect.
Note that after rename, if a field name contains spaces, you need to use single quotes around the field name to refer to it in subsequent commands.
Example
index=web sourcetype=access* status=200 product_name=*
| table JESSIONID, product_name, price
| rename JESSIONID as "User Session Id",
product_name as "Purchased Game",
price as "Purchase Price"
rex¶
Allows use of regex capture groups to extract values at search time. By default it reads from field=_raw if a field name is not provided.
Example
index=security sourcetype=linux_secure "port"
| rex "(?i) from (?P<ipaddress>[^ ]+)"
| table ipaddress, src_port
index=network sourcetype=cisco_esa
| where isnotnull(mailfrom)
| rex field=mailfrom "@(?P<mail_domain>\S+)"
Better examples can be found at doc page
search¶
Add search terms further down the pipeline. In fact, the portion before the first | is itself an implicit search command.
Example
index=network sourcetype=cisco_wsa_squid usage=Violation
| stats count(usage) as Visits by cs_username
| search Visits > 1
sort¶
Organize the results sorted by some fields.
Example
sourcetype=vendor_sales
| table Vendor product_name sale_price
| sort - sale_price Vendor
# '- ' followed by a space sorts all listed fields in descending order
| sort -sale_price Vendor
# '-sale_price' (no space) sorts only sale_price descending while Vendor is sorted ascending
| sort -sale_price Vendor limit=20
# limit=20 also restricts the output to the first twenty results in sorted order
spath¶
Extracts info from semi-structured data (such as XML or JSON) and stores it in fields. By default it reads from input=_raw if a field name is not provided.
Example
index=systems sourcetype=system_info_xml
| spath path=event.systeminfo.cpu.percent output=CPU
# working with json, values in array specified with {}
index=systems sourcetype=system_info_json
| spath path=CPU_CORES{}.idle output=CPU_IDLE
| stats avg(CPU_IDLE) by SYSTEM
# spath Function
index=hr sourcetype=HR_DB
| eval RFID=spath(Access, "rfid.chip"), Location=spath(Access, "rfid.location")
| table Name, Access, RFID, Location
Better examples see doc page
streamstats¶
Aggregates statistics to your searched events as Splunk sees the events in time in a streaming manner. In other words, it calculates statistics for each event cumulatively at the time the event is seen.
Three important arguments to know:
- current - setting it to 'f' tells Splunk to use only the field values from previous events when performing the statistical function
- window - setting it to N tells Splunk to calculate over only the previous N events at a time; the default 0 means use all previous events
- time_window - a time span such that only events within each span are calculated together; to use this, events must be sorted by time
Example
index=web sourcetype=access_combined action=purchase status=200
| stats sum(price) as orderAmount by JESSIONID
| sort _time
| streamstats avg(orderAmount) as averageOrder current=f window=20
# for each event, calculate the average from the previous 20 events' orderAmount field
table¶
Specify the fields kept in the results and retain the data in a tabular format. Fields appear in the order specified to the command.
Example
index=web sourcetype=access_combined
| table status clientip
transaction¶
A transaction is any group of related events spanning some time, which can come from multiple applications or hosts. Events should be in reverse chronological order before running this command.
The transaction command takes one or more fields to group transactions by, and creates two fields in the raw events: duration (time between the first and last event) and eventcount (number of events).
transaction limits each transaction to 1000 events by default and is an expensive command. Use it only when it has greater value than what you can do with the stats command.
Use the keepevicted=true argument to keep incomplete transactions in the results; a closed_txn field is added to indicate which transactions are incomplete.
Example
index=web sourcetype=access_combined
| transaction clientip startswith=action="addtocart" endswith=action="purchase" maxspan=1h maxpause=30m
# startswith and endswith should be valid search strings
# compare the two searches
# using transaction
index=web sourcetype=access_combined
| transaction JESSIONID
| where duration > 0
| stats avg(duration) as avgd, avg(eventcount) as avgc
| eval "Average Time On Site"=round(avgd, 0), "Average Page Views"=round(avgc, 0)
| table "Average Time On Site", "Average Page Views"
# using stats
index=web sourcetype=access_combined
| fields JESSIONID
| stats range(_time) as duration, count as eventcount by JESSIONID
| where duration > 0
| stats avg(duration) as avgd, avg(eventcount) as avgc
| eval "Average Time On Site"=round(avgd, 0), "Average Page Views"=round(avgc, 0)
| table "Average Time On Site", "Average Page Views"
transaction is a very expensive operation: to build a transaction, all event data is required, and the work is done on the Search Head. Some use cases can also be handled by the stats command.
union¶
Combine search results from two or more datasets into one. Datasets can be: saved searches, inputlookups, data models, subsearches.
Example
| union
datamodel:Buttercup_Games_Online_Sales.successful_purchase,
[search index=sales sourcetype=vendor_sales]
| table sourcetype, product_name
See also join
untable¶
Does the opposite of xyseries: it takes chartable, tabular data and formats it similar to stats output.
Good when you need to further extract from and manipulate values after using commands that result in tabular data.
Syntax | untable row-field, label-field, data-field produces 3-column results
Example
index=sales sourcetype=vendor_sales (VendorID >= 9000)
| chart count as prod_count by product_name, VendorCountry limit=5 useother=f
| untable product_name, VendorCountry, prod_count
| eventstats max(prod_count) as max by VendorCountry
| where prod_count=max
| stats list(product_name), list(prod_count) by VendorCountry
where¶
Filter events to ONLY keep the results that evaluate as true.
Use search rather than where whenever you can, as it is more efficient.
Example
index=network sourcetype=cisco_wsa_squid usage=Violation
| stats count(eval(usage="Personal")) as Personal, count(eval(usage="Business")) as Business by username
| where Personal > Business AND username!="lsagers"
| sort -Personal
Supported Functions see docs page. where shares many of the eval Functions.
xyseries¶
Convert statistical results into a tabular format suitable for visualizations. Useful for additional process after initial statistical data is returned.
Syntax | xyseries row-field, column-field, data-field produces an (n+1)-column result, where n is the number of distinct values in column-field.
Example
index=web sourcetype=access_combined
| bin _time span=1h
| stats sum(bytes) as totalBytes by _time, host
| eval MB = round(totalBytes/(1024*1024),2)
| xyseries _time, host, MB
# transforms
_time|host|MB
-----|----|--
12:00|www1|12
12:00|www2|11
12:00|www3|7
...|...|...
# into
_time|www1|www2|www3
-----|----|----|----
12:00|12|11|7
...|...|...|...
See also untable
command.
Transforming Commands¶
chart¶
Organize the data as a regular or two-way chart table. To get a two-way table, you need to provide three fields used as table row value, table column value, and table data value. A common pattern is chart <function(data_field)> over <row_field> by <column_field>
chart is limited to displaying 10 columns by default; additional values show up as an "other" column in the chart visualization. The 'other' column can be turned off by passing the argument useother=f.
Use limit=5 to control the max number of columns to display, or limit=0 to display all columns.
Example
index=web sourcetype=access_combined status> 299
| chart count over status by host
# if more than one field is supplied to `by` clause without `over` clause
# the first field is used as with a `over` clause
# this has the same effect as above command
index=web sourcetype=access_combined status> 299
| chart count by status, host
Supported Functions see docs page
fieldsummary¶
Calculate summary statistics for fields in your events.
For example, given a search index=web | fieldsummary, it gives insight into the count, distinct_count, is_exact, min/max/mean/stdev, numeric_count, and values for each field.
If field names are provided, the summary is computed only for those fields.
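A minimal sketch; the field list at the end is optional and restricts the summary to those fields:
index=web sourcetype=access_combined
| fieldsummary status clientip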
rare¶
Shows the least common values of a field set. It is the opposite of the top command and accepts the same set of clauses.
Example
index=sales sourcetype=vendor_sales
| rare Vendor limit=5 countfield="Number of Sales" showperc=False useother=True
stats¶
Produce statistics from the search results; Functions are needed to produce the stats.
The eval Function is handy for applying quick, concise conditional filtering to limit which data the statistics are calculated from.
Example
index=sales sourcetype=vendor_sales
| stats count as "Total Sells By Vendors" by product_name, categoryid
# apply conditions using eval Function
index=web sourcetype=access_combined
| stats count(eval(status<300)) as "Success", ...
Supported Functions see docs page
timechart¶
Performs stats aggregations against time; time is always the x-axis. Like chart, except only one value can be supplied to the by clause.
This command determines the time intervals from the time range selected; this can be changed with the span argument.
Example
index=sales sourcetype=vendor_sales
| timechart count by product_name span=1m
Supported Functions see docs page
timewrap¶
Compare the data from timechart over earlier time ranges.
Specify a time period to apply to the result of a timechart command; the results are then displayed as one series per period, with the X axis showing increments of that period and the Y axis showing the values aggregated over each period.
Example
index=sales sourcetype=vendor_sales product_name="Dream Crusher"
| timechart span=1d sum(price) by product_name
| timewrap 7d
| rename _time as Day
| eval Day=strftime(Day, "%A")
top¶
Finds the most common values of given field(s) in a result set and automatically returns count and percent columns. By default it displays the top 10; the number can be set with limit=n, and limit=0 yields all results.
Top Command Clauses: limit=int countfield=string percentfield=string showcount=True/False showperc=True/False showother=True/False otherstr=string
Example
index=sales sourcetype=vendor_sales
| top Vendor product_name limit=5
# top command also supports results grouping by fields
index=sales sourcetype=vendor_sales
| top product_name by Vendor limit=3 countfield="Number of Sales" showperc=False
Generating Commands¶
Commands that generate events.
datamodel¶
Display the structure of data models and search against them.
It accepts the data model name as the first argument and an object name under that model as the second. Add search to enable searching on the data model data.
Example
| datamodel Buttercup_Games_Online_Sales http_request search
| search http_request.action=purchase
inputlookup¶
Return values from a lookup table.
Useful to use it in a subsearch to narrow down the set of events to search based on the values in the lookup table.
Example
index=sales sourcetype=vendor_sales
[| inputlookup API_Tokens
| table VendorID ]
# only one column is taken to use as the filter; its values will be OR-ed together
# add 'NOT' before the subsearch to exclude those vendor IDs
| top Vendor product_name limit=5
makeresults¶
Creates a defined number of search results, good for creating sample data for testing searches or building values to be used in searches.
Example
# must start the search with a pipe
| makeresults
| eval tomorrow=relative_time(now(), "+1d"),
tomorrow=strftime(tomorrow, "%A"),
result=if(tomorrow="Saturday" OR tomorrow="Sunday", "Huzzah!", "Boo!"),
msg=printf("Tomorrow is %s, %s", tomorrow, result)
tstats¶
Get statistical info from tsidx files.
Splunk will fall back to a full search if the search runs outside the summary range of the accelerated data model. Limit the search to the summary range with the summariesonly argument.
Example
| tstats count as "count" from datamodel=linux_server_access where web_server_access.src_ip=10.* web_server_access.action=failure by web_server_access.src_ip, web_server_access.user summariesonly=t span=1d
| where count > 300
| sort -count
Special Commands¶
Splunk has commands to extract geographical info from data and display results in special visualization formats.
gauge¶
Show a field value in a gauge.
Example
index=sales sourcetype=vendor_sales
| stats sum(sale) as total
| gauge total 0 3000 6000 7000
geom¶
Adds fields with geographical data structures matching polygons on a choropleth map visualization.
Example
index=sales sourcetype=vendor_sales
| stats count as Sales by VendorCountry
| geom geo_countries featureIdField=VendorCountry
# the field `VendorCountry` must map back to a country name in the featureCollection
geostats¶
Aggregates geographical data for use on a map visualization.
Example
index=sales sourcetype=vendor_sales
| geostats latfield=VendorLatitude longfield=VendorLongitude count by product_name globallimit=4
iplocation¶
Looks up IP addresses and adds location information to events. Data like city, country, region, latitude, and longitude can be added to events that include external IP addresses. Not all location info may be available; it depends on the IP.
Example
index=web sourcetype=access_combined action=purchase status=200
| iplocation clientip
trendline¶
Compute moving averages of field values to get a good understanding of how the data is trending over time.
Three trend types (used as Functions by the trendline command):
- simple moving average (sma): computes the average of data points over a period of time
- exponential moving average (ema): assigns a heavier weighting to more recent data points
- weighted moving average (wma): assigns a heavier weighting to more recent data points
Example
index=web sourcetype=access_combined
| timechart sum(price) as sales
| trendline wma2(sales) as trend ema10(bars)
# '2' here means do calculation per 2 events
# this number can be between 2 and 10000
SPL Best Practices¶
'key!=value' vs 'NOT key=value'
A small difference: key!=value only includes events where key exists and is not equal to value, while NOT key=value also includes events where the key does not exist at all.
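A short illustration using the status field from earlier examples:
# only events that HAVE a status field with a value other than 200
index=web status!=200
# events whose status is not 200, PLUS events that have no status field at all
index=web NOT status=200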
use the 'IN' operator
Instead of writing (key=value1 OR key=value2 OR key=value3), use the more readable syntax: key IN ("value1", "value2", "value3")
use 'index' 'source' 'host' early
These fields are extracted when data was indexed and stored and won't take time to extract at search time. It is best to use these fields early in the search to limit the data to be further processed.
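For example (the host value is assumed for illustration), the indexed fields go first so the keyword only filters an already narrowed set:
index=web sourcetype=access_combined host=www1 "error"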
use 'inclusion' over 'exclusion'
Searching for exactly "something" is better than searching for "NOT something"
specify time in the query when appropriate
Use relative time (earliest=-2h latest=-1h) or absolute time (earliest=01/08/2018:12:00:00) in the search. Use @ to snap time down to the nearest whole unit, e.g. earliest=-2h@h
Constraining time is the most efficient way to improve search performance.
Splunk stores data in buckets (directories) containing raw data and indexing data, and buckets have maximum size and maximum time span.
Three kinds of searchable buckets: hot, warm, cold. Access speed: hot (read/write, very fast, like memory) > warm (slow, read-only, slower medium) > cold (very slow, read-only, cheap and stable medium).
When a search runs, Splunk looks for the right buckets to open, uncompresses the raw data, and searches the contents.
The order of effectiveness in filtering data: time > index > source > host > sourcetype
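A hedged example tying this together, reusing the security index from earlier examples:
index=security sourcetype=linux_secure "failed password" earliest=-7d@d latest=@d
| timechart span=1d count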
create different indexes to group data
Having specialized indexes helps make Splunk searches faster and more efficient.
be careful when using wildcards
Using a wildcard at the beginning of a word causes Splunk to search all events, which degrades performance.
Using a wildcard in the middle of a string may cause inconsistent results due to the way Splunk indexes data that contains punctuation.
Best to make the search as specific as possible: prefer status=failure over status=fail*, and only use wildcards at the end of a word.
save search as reports using fast mode
Fast mode emphasizes performance and returns only essential data.
Verbose mode emphasizes completeness and returns all fields and event data.
Smart mode weights tradeoffs and returns the best results for the search being run.
use 'fields' command to extract only the fields you need
Doing so as early in the search as possible lets each indexer extract fewer fields and package and return data faster.
consult Job Inspector to know the performance of the searches
It tells you which phase of the search took the most time. Any search that has not expired can be inspected by this tool. Read more about Job Inspector
avoid using subsearches when possible
Compare the following two queries, which return the same results, but the second one is significantly faster:
index=security sourcetype=winauthentication_security (EventCode=4624 OR EventCode=540) NOT
[search sourcetype=history_access
| rename Username as User
| fields User]
| stats count by User
| table User
index=security (sourcetype=winauthentication_security (EventCode=4624 OR EventCode=540)) OR (sourcetype=history_access)
| eval badge_access=if(sourcetype="history_access", 1, 0)
| stats max(badge_access) as badged_in by Username
| where badged_in=0
| fields Username
SPL Tricks¶
let subsearch limit main search's time range
Have the subsearch output a table with a single row and the columns earliest and latest; these values constrain the time range of the outer search, as in the sketch below.
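A minimal sketch (indexes and terms reused from earlier examples); the subsearch's earliest and latest fields become the time range of the outer search:
index=web sourcetype=access_combined action=purchase
    [search index=security sourcetype=linux_secure "failed password"
    | stats earliest(_time) as earliest, latest(_time) as latest]
| stats count by clientip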
use Functions count and list in conjunction make readable results
list returns a list of values of a field as a multi-value result (placed in one cell of the result table), by default up to 100 values
index=web sourcetype=access_combined action=purchase status=200
| stats count by host, product_name
| sort -count
| stats list(product_name) as "Product Name", list(count) as Count, sum(count) as total by host
| sort -total
| fields - total
use Functions strftime, strptime, relative_time to extract/convert time
# convert a time value into its timestamp representation
index=manufacture sourcetype=3dPrinterData
| eval boot_ts=strptime(boot_time, "%b/%d/%y %H:%M:%S"), days_since_boot=round((now()-boot_ts)/86400)
| stats values(days_since_boot) as "uptime_days" by printer_name
# extract parts from the time field
sourcetype=foo
| eval date_hour=strftime(_time, "%H")
| eval date_wday=strftime(_time, "%w")
| search date_hour>=9 date_hour<=18 date_wday>=1 date_wday<=5
# push time forward or backwards from a timestamp
index=manufacture sourcetype=3dPrinterData
| eval boot_ts=strptime(boot_time, "%b/%d/%y %H:%M:%S"), rt=relative_time(boot_ts, "+30d"), reboot=strftime(rt, "%x")
| stats values(reboot) as "day_to_reboot" by printer_name
A list of time format codes can be found here
use Functions lower/upper, substr, replace, len, for text manipulation
index=hr
| eval Organization=lower(Organization), Is_In_Marketing=if(Organization=="marketing", "Yes", "No")
| table Name, Organization, Is_In_Marketing
index=hr
| eval Group=substr(Employee_ID, 1, 2), Location=substr(Employee_ID, -2)
| table Name, Group, Location
index=hr
| eval Employee_Details=replace(Employee_ID, "^([A-Z]+)_(\d+)_([A-Z]+)", "Employee #\2 is a \1 in \3")
create empty column to help display data compared in bar chart
index=security sourcetype=linux_secure "fail*" earliest=-31d@d latest=-1d@d
| timechart count as daily_total
| stats avg(daily_total) as DailyAvg
| appendcols
[search index=security sourcetype=linux_secure "fail*" earliest=-1d@d latest=@d
| stats count as Yesterday]
| eval Averages=""
| stats list(DailyAvg) as DailyAvg, list(Yesterday) as Yesterday by Averages
swap row and column fields using untable and xyseries Commands
index=web sourcetype=access_combined action=purchase
| chart sum(price) by product_name, clientip limit=0
| addtotals
| sort 5 Total
| fields - Total
| untable product_name, clientip, count
| xyseries clientip, product_name, count # it took these commands to swap row and column fields!
| addtotals
| sort 3 -Total
| fields - Total
aggregate for two different time ranges and compare on the same visualization
index=security sourcetype=linux_secure "failed password" earliest=-30d@d latest=@h
| eval Previous24h=relative_time(now(), "-24h@h"), Series=if(_time>=Previous24h, "last24h", "prior")
| timechart count span=1h by Series
| eval Hour=strftime(_time, "%H"), currentHour=strftime(now(), "%H"), offset=case(Hour<=currentHour, Hour-currentHour, Hour>currentHour, (Hour-24)-currentHour)
| stats avg(prior) as "30 Day Average for Hour", sum(last24h) as "Last 24 Hours" by Hour, offset
| sort offset
| fields - offset
| rename Hour as "Hour of Day"
aggregate for certain time window each day
You can do this with stats for more control, achieving what timechart provides and more!
index=web sourcetype=access_combined action=purchase earliest=-3d@d latest=@d date_hour>=0 AND date_hour<6
| bin span=1h _time
| stats sum(price) as hourlySales by _time
| eval Hour=strftime(_time, "%b %d, %I %p"), "Hourly Sales"="$".tostring(hourlySales)
| table Hour, "Hourly Sales"
Reports and Dashboards¶
Reports¶
Reports allow people to easily store and share search results and the queries used to produce them.
When a report is run, a fresh search is run. To save report results and speed up searches, see the Accelerate Reports Doc
Scheduled Reports and Alerts¶
Scheduled Reports can produce weekly/monthly reports and automatically send results via email.
Select a saved report and add a schedule to it. Only an admin can set its priority.
Alerts are based on searches that run on scheduled intervals or in real time (admin only). An alert is triggered when the results of a search meet defined conditions.
Some alerts actions:
- create a log file with the events
- output to a lookup, to append or replace data in a lookup table
- send to a telemetry endpoint, call an endpoint
- trigger scripts, triggers a bash script stored on the machine
- send emails or Slack notifications
- use a webhook
Visualizations¶
Any search that returns statistical information can be presented as a visualization, or chart. Charts can be based on numbers, time, and location. Save visualizations as a report or a dashboard panel.
Creating customized visualizations see doc page
Visualizations that are available by default and are intuitive (bar, line, area, pie, and table charts) are not explained here.
Choropleth Maps¶
Uses colored shading to show differences in numbers across geographical locations. It requires a compressed Keyhole Markup Language (KMZ) file that defines region boundaries.
Splunk ships with two KMZ files: geo_us_states for the US, and geo_countries for the countries of the world. Other KMZ files can be provided and used.
geom is the command to use to show a choropleth map. See the geom command.
Single Value¶
Displays a single number with formatting options to add a caption, color, unit, etc. You can use the timechart command to add a trend indicator and a sparkline for that value.
It can also display gauges, which can take the form of radial, filler, or marker gauges. There are options to format ranges and colors.
gauge is the command to use to enable gauges and pass in ranges. See the gauge command.
Dashboards¶
A Dashboard is a collection of reports or visualizations.
Dashboarding Tricks¶
let panels listen on variables
Use $variable_name$ to refer to a token that can be set by a dropdown input or a drilldown.
'earliest' and 'latest' from time picker
For a time picker input, the earliest and latest times can be accessed via $variable_name$.earliest and $variable_name$.latest
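A hedged sketch of a panel search using tokens (the token names host_tok and time_tok are hypothetical; they would be defined by the dashboard's inputs):
index=web sourcetype=access_combined host=$host_tok$ earliest=$time_tok.earliest$ latest=$time_tok.latest$
| timechart count by status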
Accelerations¶
Splunk allows the creation of summaries of event data: smaller segments of event data that include only the info needed to fulfill the search.
Report acceleration¶
Uses automatically created summaries, accelerates individual reports.
- acceleration summaries are basically stored results, populated every 10 minutes
- searches must be in Fast or Smart mode
- must include a transforming command, with streaming commands before it and non-streaming commands after
- also stored as file chunks alongside the index buckets
- must re-build when associated knowledge objects change
Summary indexing¶
Uses manually created summaries, stored in summary indexes separate from the deployment's other indexes.
- must use 'si'-prefixed commands such as sichart, sistats
- search a Summary Index with index=<summary_index> report=<summary_report_name> (see the sketch below)
- need to pay attention to avoid creating gaps and overlaps in the data
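A hedged sketch, assuming a scheduled report named violations_summary is saved with summary indexing enabled and writes into a summary index called my_summary:
# scheduled search populating the summary index
index=network sourcetype=cisco_wsa_squid usage=Violation
| sistats count by cs_username
# later, search the summary index as described above
index=my_summary report=violations_summary
| stats count by cs_username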
Data model acceleration¶
Accelerates all fields in a data model, easiest and most efficient option.
- adhoc acceleration - summary creation happens automatically on data models that have not been accelerated; the summary files created stay on the search head for as long as the user is actively using the Pivot tool
- persistent acceleration - summary files created are stored alongside the data buckets and exist as long as the data model exists
- data models will become read-only after the acceleration
- requires searches use only streaming commands
The datamodel command allows displaying the structure of data models and searching against them.
Time-Series Index Files (tsidx files) are files created for data model acceleration.
- tsidx components:
  - lexicon - an alphanumerically ordered list of terms found in the data at index time
    - fields extracted at index time show up as key-value pairs in the lexicon
    - Splunk searches the lexicon first and only opens and reads the raw event data matching the terms, using the pointers
  - posting list - an array of pointers that match each term to events in the raw data files
- building of tsidx files ALSO happen when a data model is accelerated
- for persistent acceleration, tsidx files are updated every 5 minutes and outdated ones are removed after 30 minutes
The tstats command is used to get statistical info from tsidx files