Search Command> stats, eventstats and streamstats

Getting started with stats, eventstats and streamstats

When I first joined Splunk, like many newbies I needed direction on where to start. Someone gave me some excellent advice:

“Learn the stats and eval commands.”

Putting eval aside for another blog post, let’s examine the stats command. It never ceases to amaze me how many Splunkers are stuck in the “super grep” stage. They just use Splunk to search (happily I might add) for keywords and phrases over many sources of machine data. Hopefully this will help advance some folks beyond “super grep” as well as assist those who may be new to Splunk.

When you dive into Splunk’s excellent documentation, you will find that the stats command has a couple of siblings — eventstats and streamstats. In this blog post, I will attempt, by means of a simple web log example, to illustrate how the variations on the stats command work, and how they are different. Stats typically gets a lot of use, but I’ll use it to set the stage for eventstats and streamstats which don’t get as much use. Reference documentation links are included at the end of the post. I will take a very basic, step-by-step approach by going through what is happening with the stats command, and then expand on that example to show how stats differs from eventstats and streamstats. In an effort to keep it simple, I’ll limit the data of interest to five (5) events with the head command. (If you’re cool with stats, scroll on down to eventstats or streamstats.)

As the name implies, stats is for statistics. Per the Splunk documentation:

Description:
Calculate aggregate statistics over the dataset, similar to SQL aggregation. If called without a by clause, one row is produced, which represents the aggregation over the entire incoming result set. If called with a by-clause, one row is produced for each distinct value of the by-clause.
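To make the by-clause behavior concrete, here is a minimal Python sketch of what stats does with a sum, first without and then with a by clause. It is only an analogue of the semantics, not how Splunk implements the command; the sample events and the helper name stats_sum are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample events: the field names mirror the web log example,
# but the values are invented for illustration.
events = [
    {"clientip": "10.0.0.1", "bytes": 100},
    {"clientip": "10.0.0.2", "bytes": 250},
    {"clientip": "10.0.0.1", "bytes": 300},
    {"clientip": "10.0.0.2", "bytes": 50},
    {"clientip": "10.0.0.1", "bytes": 200},
]


def stats_sum(events, field, by=None):
    """Rough analogue of `stats sum(<field>) [by <by>]` (hypothetical helper)."""
    if by is None:
        # No by clause: one row that aggregates the whole result set.
        return [{f"sum({field})": sum(e[field] for e in events)}]
    # By clause: one row per distinct value of the by field.
    totals = defaultdict(int)
    for e in events:
        totals[e[by]] += e[field]
    return [{by: key, f"sum({field})": total} for key, total in totals.items()]


overall = stats_sum(events, "bytes")
per_ip = stats_sum(events, "bytes", by="clientip")
print(overall)
print(per_ip)
```

Note that in both cases the original events are gone from the output: only the aggregated rows remain.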

There are also a number of statistical functions at your disposal: avg(), count(), distinct_count(), median(), perc<int>(), stdev(), sum(), and sumsq(), to name a few.

So let’s look at a simple search command that sums up the number of bytes per IP address from some web logs.

To begin, run a simple search of the web logs in Splunk and look at five events and the byte counts associated with the two IP addresses in the clientip field.

sourcetype=access_combined* | head 5

Now, just like stats, there are two values (one for each clientip) for ASimpleSumOfBytes, but they are attached to the original events and can be used in later calculations. Note that your raw data is untouched: the aggregation is just a presentation feature that you get with eventstats.

If I want to show the bytes field for each event along with the summation and the clientip, I can easily create the table that failed with stats. As the following search illustrates, the sum of all the bytes per clientip is included alongside each of the original bytes values:

sourcetype=access_combined* | head 5 | sort _time | eventstats sum(bytes) as ASimpleSumOfBytes by clientip | table bytes, ASimpleSumOfBytes, clientip
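For readers who think in code, the eventstats behavior can be sketched in a few lines of Python. This is a rough analogue under assumed data, not Splunk's implementation: the events and the helper name eventstats_sum are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample events, already sorted by _time.
events = [
    {"_time": 1, "clientip": "10.0.0.1", "bytes": 100},
    {"_time": 2, "clientip": "10.0.0.2", "bytes": 250},
    {"_time": 3, "clientip": "10.0.0.1", "bytes": 300},
    {"_time": 4, "clientip": "10.0.0.2", "bytes": 50},
    {"_time": 5, "clientip": "10.0.0.1", "bytes": 200},
]


def eventstats_sum(events, field, by, as_field):
    """Rough analogue of `eventstats sum(<field>) as <as_field> by <by>`."""
    # First pass: compute the total for each distinct by value.
    totals = defaultdict(int)
    for e in events:
        totals[e[by]] += e[field]
    # Second pass: copy every event and attach its group's total.
    # The input events themselves are left untouched.
    return [{**e, as_field: totals[e[by]]} for e in events]


rows = eventstats_sum(events, "bytes", "clientip", "ASimpleSumOfBytes")
for r in rows:
    print(r["bytes"], r["ASimpleSumOfBytes"], r["clientip"])
```

Every event survives, and each one carries the full total for its clientip, which is what makes the table of bytes next to ASimpleSumOfBytes possible.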

STREAMSTATS

Having the statistics aggregated onto the original events is great, but what if you are interested in what is happening in a streaming manner, that is, as Splunk sees the events in time? Streamstats is your command. To help visualize this, I’ll sort the events in ascending time order using the internal “_time” field, and then try out streamstats:

sourcetype=access_combined* | head 5 | sort _time | streamstats sum(bytes) as ASimpleSumOfBytes by clientip

Like eventstats, streamstats aggregates the statistics to the original data, so all of the original data is accessible for further calculations, should we wish. By including time and the original byte count in the table below, we can better see what is going on with the streamstats command.

sourcetype=access_combined* | head 5 | sort _time | streamstats sum(bytes) as ASimpleSumOfBytes by clientip | table _time, clientip, bytes, ASimpleSumOfBytes

As shown in the screenshot above, instead of a single total for each clientip (as in stats and eventstats), there is a running sum for each event as it is seen in time, each one building on the previous one. Also note that the last running value for each clientip matches the total calculated by stats and eventstats.
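The running-sum behavior can be sketched in Python as follows. Again, this is only an illustration of the semantics under assumed data: the events and the helper name streamstats_sum are hypothetical, and the explicit sort stands in for the sort _time step in the search.

```python
from collections import defaultdict

# Hypothetical sample events, deliberately out of time order.
events = [
    {"_time": 3, "clientip": "10.0.0.1", "bytes": 300},
    {"_time": 1, "clientip": "10.0.0.1", "bytes": 100},
    {"_time": 5, "clientip": "10.0.0.1", "bytes": 200},
    {"_time": 2, "clientip": "10.0.0.2", "bytes": 250},
    {"_time": 4, "clientip": "10.0.0.2", "bytes": 50},
]


def streamstats_sum(events, field, by, as_field):
    """Rough analogue of `streamstats sum(<field>) as <as_field> by <by>`."""
    running = defaultdict(int)
    out = []
    for e in events:
        # Update the running total for this event's group, then attach the
        # value as it stands right now (the current event is included).
        running[e[by]] += e[field]
        out.append({**e, as_field: running[e[by]]})
    return out


# Equivalent of `sort _time` before the streamstats step.
ordered = sorted(events, key=lambda e: e["_time"])
rows = streamstats_sum(ordered, "bytes", "clientip", "ASimpleSumOfBytes")
for r in rows:
    print(r["_time"], r["clientip"], r["bytes"], r["ASimpleSumOfBytes"])
```

The order of the input matters here in a way it does not for stats or eventstats: feed the same events in a different order and the intermediate values change, even though the final value per clientip stays the same.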

The difference here is that the value of the calculated field “ASimpleSumOfBytes” varies from event to event, depending on what Splunk has seen up to that moment. Where is this helpful? As it turns out, lots of the questions that folks have about their data concern what is going on at a specific moment or range in time. Streamstats is extremely useful for this kind of searching and reporting.

Below I have included some links to the Splunk Documentation and Answers communities. Check them out. I hope this has been helpful.

Happy Splunking

References:

About the Splunk search language
http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutthesearchlanguage

Anatomy of a Splunk search
http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutthesearchpipeline#The_anatomy_of_a_search

Answers post to help understand visualizing time in Splunk, related to streamstats
http://answers.splunk.com/answers/105733/streamstats-is-reversed

Splunker finds a cool use for streamstats
http://blogs.splunk.com/2013/10/31/streamstats-example

The stats page in the Splunk docs
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Stats

Functions that work with Stats
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonStatsFunctions

FAQs

What is the difference between Splunk Eventstats and Streamstats?

eventstats computes each aggregate over the entire result set and adds the final value to every event. streamstats computes the statistic cumulatively, so each event carries the running value at the point where it was processed. That makes eventstats a good fit for comparing events against an overall total, and streamstats a good fit for running totals and other questions about what was happening at a given point in time.

What is the eventstats command in Splunk?

The eventstats command looks for events that contain the field that you want to use to generate the aggregation. The command creates a new field in every event and places the aggregation in that field. The aggregation is added to every event, even events that were not used to generate the aggregation.

How do I get stats in Splunk?

The SPL2 stats command calculates aggregate statistics, such as average, count, and sum, over the incoming search results set. This is similar to SQL aggregation. If the stats command is used without a BY clause, only one row is returned, which is the aggregation over the entire incoming result set.

How to use top command in Splunk?

Top Values for a Field by a Field

Next, we can also include another field as part of this top command's by clause to display the result of field1 for each set of field2. In the below search, we find top 3 productids for each file name.
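As a rough illustration of what a top command with a by clause computes, here is a small Python sketch. The events, the field names file and productid, and the helper name top_by are assumptions made for the example, and the sketch returns counts only, whereas the real command also reports a percentage.

```python
from collections import Counter, defaultdict

# Hypothetical events: each one records which product was requested via which file.
events = (
    [{"file": "cart.do", "productid": p} for p in "AAAABBBCCD"]
    + [{"file": "product.screen", "productid": p} for p in ["X", "X", "Y"]]
)


def top_by(events, field, by, limit=3):
    """Most common values of `field` within each distinct value of `by`."""
    counts = defaultdict(Counter)
    for e in events:
        counts[e[by]][e[field]] += 1
    # Keep only the `limit` most frequent values per group.
    return {group: counter.most_common(limit) for group, counter in counts.items()}


top3 = top_by(events, "productid", by="file", limit=3)
for group, values in top3.items():
    print(group, values)
```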

What are the 4 types of searches in Splunk by performance?

How search types affect Splunk Enterprise performance

Search type    Reference indexer throughput              Performance impact
Dense          Up to 50,000 matching events per second   CPU-bound
Sparse         Up to 5,000 matching events per second    CPU-bound
Super-sparse   Up to 2 seconds per index bucket          I/O-bound
Rare           From 10 to 50 index buckets per second    I/O-bound

What is the function of Streamstats in Splunk?

The SPL2 streamstats command adds a cumulative statistical value to each search result as each result is processed. For example, you can calculate the running total for a particular field, or compare a value in a search result with the cumulative value, such as a running average.
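Comparing a value with a running average can be sketched in Python like this. The byte counts and the helper name running_average are made up for illustration, and the current value is counted in the average, which is an assumption about how the comparison is set up rather than a statement about every streamstats configuration.

```python
# Hypothetical byte counts, in the order the results are processed.
byte_counts = [100, 250, 300, 50, 200]


def running_average(values):
    """Yield (value, running_avg) pairs; the current value is part of the average."""
    total = 0
    for count, value in enumerate(values, start=1):
        total += value
        yield value, total / count


results = [
    {"bytes": value, "running_avg": avg, "above_avg": value > avg}
    for value, avg in running_average(byte_counts)
]
for row in results:
    print(row)
```

Each row now carries both its own value and the average of everything seen so far, so a simple comparison flags the results that are above the running average.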

What is the limit of eventstats?

By default, eventstats can aggregate up to 50,000 events at a time. You can change this limit with the MaxNoOfAggregatedEvents parameter.

What is the difference between stats and transaction commands in Splunk?

stats provides the aggregation. transaction groups related events into a single transaction, for example when you perform 10 steps as part of one transaction.

What is the difference between stats and chart command in Splunk?

Note that you can specify any number of "group by" fields to the stats command, whereas the chart/timechart command can only have one "group by" (with timechart it is always _time) and one "split by". This is why our first example was able to incorporate the "host" field easily whereas the second example did not.

How do I search data in Splunk?

To search on a keyword, select the Keyword tab, type the keyword or phrase you want to search on, then press Enter. If you want to search on a field, select the Fields tab, enter the field name, then press Enter. To continue adding keywords or fields to the search, select Add Filter.

What does the stats command do?

The stats command is used to calculate summary statistics on the results of a search or the events retrieved from an index. The stats command works on the search results as a whole and returns only the fields that you specify.

What are eventstats in Splunk?

The SPL2 eventstats command generates summary statistics from fields in your events and saves those statistics into a new field. The eventstats command places the generated statistics in a new field that is added to the original raw events.

What is the rare command in Splunk?

Rare Command Syntax:

Where: <field>: Specifies the field to analyze. <limit> (optional): Sets the maximum number of results to display (default is 10). <by> (optional): Specifies a secondary field to further refine the analysis.

What is the most efficient way to limit search results returned in Splunk?

One of the most effective ways to limit the data that is pulled off from disk is to limit the time range. Use the time range picker or specify time modifiers in your search to identify the smallest window of time necessary for your search.

What is the difference between events and statistics in Splunk?

The stats command calculates aggregate statistics over the results set, such as average, count, and sum, and returns only the aggregated rows. With the eventstats command, the aggregation results are added inline to each event, and added only if the aggregation is pertinent to that event.

What are the three types of Splunk authentication?

About user authentication

Scheme                                         Splunk platform types
Native Splunk authentication                   all
Lightweight Directory Access Protocol (LDAP)   all
Security Assertion Markup Language (SAML)      all
Multi-factor authentication                    Splunk Enterprise

What are the different types of indexes in Splunk?

There are two types of indexes: events indexes and metrics indexes. Events indexes are the default type of index. They can hold any type of data.

What is a Splunk streaming command?

A streaming command applies a transformation to each event returned by a search. For example, the rex command is streaming because it extracts and adds fields to events at search time.
