In this article by Josh Diakun, Paul R Johnson, and Derek Mock, authors of the book Splunk Operational Intelligence Cookbook – Second Edition, we will cover the basic ways to search data in Splunk and how to make raw event data readable.

The ability to search machine data is one of Splunk’s core functions, and it should come as no surprise that many other features and functions of Splunk are heavily driven by searches. Everything from basic reports and dashboards to data models and fully featured Splunk applications is powered by Splunk searches behind the scenes.

Splunk has its own search language known as the Search Processing Language (SPL). SPL contains hundreds of search commands, most of which also have several functions, arguments, and clauses. While a basic understanding of SPL is required in order to search your data effectively in Splunk, you are not expected to know all the commands! Even the most seasoned ninjas do not know all the commands and regularly refer to the Splunk manuals, website, or Splunk Answers (http://answers.splunk.com).

To get you on your way with SPL, be sure to check out the search command cheat sheet and download the handy quick reference guide available at http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SplunkEnterpriseQuickReferenceGuide.

Searching

Searches in Splunk usually start with a base search, followed by a number of commands that are delimited by one or more pipe (|) characters. The result of a command or search to the left of the pipe is used as the input for the next command to the right of the pipe. Multiple pipes are often found in a Splunk search to continually refine data results as needed. As we go through this article, this concept will become very familiar to you.
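As a minimal sketch of this piping concept, the following search chains three commands after a base search. It assumes the access_combined web access data used later in this article, with status and uri_path fields extracted; the threshold of 500 and the limit of 5 are illustrative choices only.

```spl
index=main sourcetype=access_combined status>=500
| stats count by uri_path
| sort - count
| head 5
```

Reading left to right (or top to bottom): the base search retrieves the server error events, stats counts them per URI path, sort orders the counts from highest to lowest, and head keeps the first five rows. Each command only sees the output of the one before it.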

Splunk allows you to search for anything that might be found in your log data. For example, the most basic search in Splunk might be a search for a keyword such as error or an IP address such as 10.10.12.150. However, searching for a single word or IP over the terabytes of data that might potentially be in Splunk is not very efficient. Therefore, we can use the SPL and a number of Splunk commands to really refine our searches. The more refined and granular the search, the faster the time to run and the quicker you get to the data you are looking for!

When searching in Splunk, try to filter as much as possible before the first pipe (|) character, as this will save CPU and disk I/O. Also, pick your time range wisely. Often, it helps to run the search over a small time range when testing it and then extend the range once the search provides what you need.
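To illustrate the advice about filtering before the first pipe, here are two searches that should return the same rows. This is a sketch that assumes the access_combined data and the status field used elsewhere in this article; the value 404 is just an example.

```spl
index=main sourcetype=access_combined | search status=404 | table _time, uri_path, status

index=main sourcetype=access_combined status=404 | table _time, uri_path, status
```

The first search asks for every access_combined event and filters afterwards, whereas the second places the status filter in the base search so that fewer events need to be passed along the pipeline. The second form is generally the one to prefer.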

Boolean operators

There are three Boolean operators available in Splunk: AND, OR, and NOT. Case sensitivity is important here, and these operators must be in uppercase to be recognized by Splunk. The AND operator is implied by default and is not needed, but does no harm if used.

For example, searching for error OR success would return all the events that contain either the word error or the word success. Searching for error success would return all the events that contain both the words error and success. Another way to write this is error AND success. Searching web access logs for error OR success NOT mozilla would return all the events that contain either the word error or success, but not those events that also contain the word mozilla.
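The searches described above can be written out as follows, one search per line. The last line adds parentheses, which are not in the original example but make the intended grouping explicit.

```spl
error success
error AND success
error OR success
(error OR success) NOT mozilla
```

The first two lines are equivalent, because AND is implied between terms. When mixing operators, grouping with parentheses is a reasonable habit so that the search does not depend on the reader remembering the evaluation order.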

Common commands

There are many commands that you will likely use on a daily basis when searching data within Splunk. These common commands are outlined in the following table:

Command          Description
chart/timechart  This command outputs results in a tabular and/or time-based output for use by Splunk charts.
dedup            This command de-duplicates results based upon specified fields, keeping the most recent match.
eval             This command evaluates new or existing fields and values. There are many different functions available for eval.
fields           This command specifies the fields to keep or remove in search results.
head             This command keeps the first X (as specified) rows of results.
lookup           This command looks up fields against an external source or list, to return additional field values.
rare             This command identifies the least common values of a field.
rename           This command renames the fields.
replace          This command replaces the values of fields with another value.
search           This command permits subsequent searching and filtering of results.
sort             This command sorts results in either ascending or descending order.
stats            This command performs statistical operations on the results. There are many different functions available for stats.
table            This command formats the results into a tabular output.
tail             This command keeps only the last X (as specified) rows of results.
top              This command identifies the most common values of a field.
transaction      This command merges events into a single event based upon a common transaction identifier.
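A few of these commands are sketched below, one search per group of lines. The field names (uri_path, status, JSESSIONID, useragent) are borrowed from the recipe later in this article and are assumed to be extracted for the access_combined sourcetype; the limits and the requests field name are illustrative.

```spl
index=main sourcetype=access_combined | top limit=5 uri_path

index=main sourcetype=access_combined
| stats count by status
| sort - count
| rename count AS requests

index=main sourcetype=access_combined
| dedup JSESSIONID
| head 10
| table _time, JSESSIONID, useragent
```

The first search lists the five most common URI paths. The second counts events per status code, sorts the counts in descending order, and renames the count column. The third keeps one event per session ID and tabulates the first ten.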

Time modifiers

The drop-down time range picker in the Graphical User Interface (GUI) to the right of the Splunk search bar allows users to select from a number of different preset and custom time ranges. However, in addition to using the GUI, you can also specify time ranges directly in your search string using the earliest and latest time modifiers. When a time modifier is used in this way, it automatically overrides any time range that might be set in the GUI time range picker.

The earliest and latest time modifiers can accept a number of different time units: seconds (s), minutes (m), hours (h), days (d), weeks (w), months (mon), quarters (q), and years (y). Time modifiers can also make use of the @ symbol to round down and snap to a specified time.

For example, sourcetype=access_combined earliest=-1d@d latest=-1h searches all the access_combined events from midnight yesterday until one hour ago. The snap (@d) rounds down to the start of the day, so if it were 12 p.m. now, the search would run from midnight a day and a half ago until 11 a.m. today.
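A few more combinations of the earliest and latest modifiers are sketched here, one search per line. The first is the example from the text; the other two are illustrative additions built from the same units and snap syntax.

```spl
sourcetype=access_combined earliest=-1d@d latest=-1h
sourcetype=access_combined earliest=-7d@d latest=@d
sourcetype=access_combined earliest=-15m latest=now
```

The second search covers the seven complete days before today, because both ends snap to midnight. The third covers the last fifteen minutes up to the current time.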

Working with fields

Fields in Splunk can be thought of as keywords that have one or more values. These fields are fully searchable by Splunk. At a minimum, every data source that comes into Splunk will have the source, host, index, and sourcetype fields, but some sources might have hundreds of additional fields. If the raw log data contains key-value pairs or is in a structured format such as JSON or XML, then Splunk will automatically extract the fields and make them searchable. Splunk can also be told how to extract fields from the raw log data in the backend props.conf and transforms.conf configuration files.

Searching for specific field values is simple. For example, sourcetype=access_combined status!=200 will search for events that have a sourcetype field value of access_combined and a status field with a value other than 200.
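Some further field searches in the same style are sketched below, one per line. They assume the status and method fields are extracted for the access_combined sourcetype; the specific values are examples only.

```spl
sourcetype=access_combined status=404
sourcetype=access_combined status!=200
sourcetype=access_combined status>=500
sourcetype=access_combined status=5* method=POST
sourcetype=access_combined NOT status=200
```

Note the difference between the second and last lines: status!=200 only matches events that actually have a status field, whereas NOT status=200 also matches events where the status field is missing altogether.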

Splunk Enterprise ships with a number of built-in, pre-trained sourcetypes that might work out of the box with common data sources. These are listed at http://docs.splunk.com/Documentation/Splunk/latest/Data/Listofpretrainedsourcetypes.

In addition, Technical Add-Ons (TAs), which contain event types and field extractions for many other common data sources such as Windows events, are available from the Splunk app store at https://splunkbase.splunk.com.

Saving searches

Once you have written a nice search in Splunk, you may wish to save the search so that you can use it again at a later date or use it for a dashboard. Saved searches in Splunk are known as Reports. To save a search in Splunk, you simply click on the Save As button on the top right-hand side of the main search bar and select Report.

Making raw event data readable

When a basic search is executed in Splunk from the search bar, the search results are displayed in a raw event format by default. To many users, this raw event information is not particularly readable, and valuable information is often clouded by other less valuable data within the event. Additionally, if the events span several lines, only a few events can be seen on the screen at any one time.

In this recipe, we will write a Splunk search to demonstrate how we can leverage Splunk commands to make raw event data readable, tabulating events and displaying only the fields we are interested in.

Getting ready

You should be familiar with the Splunk search bar and search results area.

How to do it…

Follow the given steps to search and tabulate the selected event data:

  1. Log in to your Splunk server.
  2. Select the Search & Reporting application from the drop-down menu located in the top left-hand side of the screen.

  3. Set the time range picker to Last 24 hours and type the following search into the Splunk search bar:
    index=main sourcetype=access_combined

    Then, click on Search or hit Enter.

  4. Splunk will return the results of the search and display the raw search events under the search bar.
  5. Let’s rerun the search, but this time we will add the table command as follows:
    index=main sourcetype=access_combined | table _time, referer_domain, method, uri_path, status, JSESSIONID, useragent
    
  6. Splunk will now return the same number of events, but instead of presenting the raw events to you, the data will be in a nicely formatted table, displaying only the fields we specified. This is much easier to read!

  7. Save this search by clicking on Save As and then on Report. Give the report the name cp02_tabulated_webaccess_logs and click on Save. On the next screen, click on Continue Editing to return to the search.

How it works…

Let’s break down the search piece by piece:

Search fragment: index=main
Description: All the data in Splunk is held in one or more indexes. While not strictly necessary, it is a good practice to specify the index(es) to search, as this will ensure a more precise search.

Search fragment: sourcetype=access_combined
Description: This tells Splunk to search only the data associated with the access_combined sourcetype, which, in our case, is the web access logs.

Search fragment: | table _time, referer_domain, method, uri_path, status, JSESSIONID, useragent
Description: Using the table command, we take the result of our search to the left of the pipe and tell Splunk to return the data in a tabular format. Splunk will only display the fields specified after the table command in the table of results.

In this recipe, you used the table command. The table command can have a noticeable performance impact on large searches. It should be used towards the end of a search, once all the other processing on the data by the other Splunk commands has been performed.

The stats command is more efficient than the table command and should be used in place of table where possible. However, be aware that stats and table are two very different commands: stats calculates statistics over the events, whereas table simply lists the specified fields for each event.
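As a rough illustration of stats, the following searches summarize the same web access data instead of listing it event by event. The field names come from the recipe above; the requests and sessions names are hypothetical labels chosen for this sketch.

```spl
index=main sourcetype=access_combined | stats count by status

index=main sourcetype=access_combined
| stats count AS requests, dc(JSESSIONID) AS sessions by uri_path
```

The first search returns one row per status code with the number of matching events. The second returns one row per URI path with the number of requests and the distinct count of session IDs. Neither replaces the table search from the recipe, which shows individual events; stats is the better fit only when a summary is what you need.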

There’s more…

The table command is very useful in situations where we wish to present data in a readable format. Additionally, tabulated data in Splunk can be downloaded as a CSV file, which many users find useful for offline processing in spreadsheet software or for sending to others. There are some other ways we can leverage the table command to make our raw event data readable.

Tabulating every field

Often, there are situations where we want to present every field within the data in a tabular format, without having to specify each field one by one. To do this, we simply use a wildcard (*) character as follows:

index=main sourcetype=access_combined | table *

Removing fields, then tabulating everything else

While tabulating every field using the wildcard (*) character is useful, you will notice that there are a number of Splunk internal fields, such as _raw, that appear in the table. We can use the fields command before the table command to remove the fields as follows:

index=main sourcetype=access_combined | fields - sourcetype, index, _raw, source, date*, linecount, punct, host, time*, eventtype | table *

If we do not include the minus (-) character after the fields command, Splunk will keep the specified fields and remove all the other fields.
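A minimal sketch of the keep form follows, assuming the status, uri_path, and method fields from the recipe. The second fields command is an addition for illustration: internal fields such as _raw and _time are retained by the fields command by default, so _raw is removed explicitly here.

```spl
index=main sourcetype=access_combined
| fields status, uri_path, method
| fields - _raw
| table *
```

This is the opposite approach to the previous search: rather than listing everything we want to drop, we list the few fields we want to keep and then tabulate whatever remains.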

Summary

In this article, along with an introduction to searching in Splunk, we covered how to make raw event data readable.
