Data models enable you to create Splunk reports and dashboards without having to write Splunk searches. Typically, data models are designed by those who understand the format and semantics of the data, and the manner in which users may expect to work with that data. In building a typical data model, knowledge managers use knowledge object types (such as lookups, transactions, search-time field extractions, and calculated fields).
Today we are going to learn how to create a Splunk data model and how to describe that model with various fields and lookup attributes.
This article is an excerpt from a book written by James D. Miller titled Implementing Splunk 7 – Third Edition.
Creating a data model
So now that we have a general idea of what a Splunk data model is, let’s go ahead and create one. Before we can get started, we need to verify that our user ID is set up with the proper access required to create a data model. By default, only users with an admin or power role can create data models. For other users, the ability to create a data model depends on whether their roles have write access to an app.
To begin (once you have verified that you do have access to create a data model), you can click on Settings and then on Data models (under KNOWLEDGE):
This takes you to the Data Models (management) page, shown in the next screenshot. This is where a list of data models is displayed. From here, you can manage permissions, acceleration, cloning, and removal of existing data models. You can also use this page to upload a data model or create new data models, using the Upload Data Model and New Data Model buttons in the upper-right corner, respectively.
Since this is a new data model, you can click on the button labeled New Data Model. This will open the New Data Model dialog box (shown in the following image). We can fill in the required information in this dialog box:
Filling in the new data model dialog
There are four fields to fill in to describe your new Splunk data model (Title, ID, App, and Description):
- Title: Here you must enter a Title for your data model. This field accepts any character, as well as spaces. The value you enter here is what will appear on the data model listing page.
- ID: This is an optional field. It gets prepopulated with what you entered for your data model title (with any spaces replaced with underscores). Take a minute to make sure you have a good one, since once you enter the data model ID, you can’t change it.
- App: Here you select (from a drop-down list) the Splunk app that your data model will serve.
- Description: The description is also an optional field, but I recommend adding something descriptive to later identify your data model.
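To make the ID prepopulation concrete, here is a minimal sketch of the rule described above (spaces in the title replaced with underscores). The function name default_model_id is hypothetical and this is not Splunk's own code; it only mirrors the behavior described in this article.

```python
def default_model_id(title: str) -> str:
    """Suggest a data model ID from its title.

    Assumption: the only transformation is the one described in the
    article, replacing spaces with underscores.
    """
    return title.replace(" ", "_")


suggested_id = default_model_id("Aviation Games")
print(suggested_id)  # Aviation_Games
```

Since the ID can't be changed later, it is worth checking the suggested value before you move on.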
Once you have filled in these fields, you can click on the button labeled Create. This opens the data model (in our example, Aviation Games) in the Splunk Edit Objects page as shown in the following screenshot:
The next step in defining a data model is to add the first object. As we have already stated, data models are typically composed of object hierarchies built on root event objects. Each root event object represents a set of data that is defined by a constraint, which is a simple search that filters out events that are not relevant to the object.
Getting back to our example, let’s create an object for our data model to track purchase requests on our Aviation Games website.
To define our first event-based object, click on Add Dataset (as shown in the following screenshot):
Our data model’s first object can be either a Root Event or a Root Search. We’re going to add a Root Event, so select Root Event. This will take you to the Add Event Dataset editor:
Our example dataset will expose events that contain the term error, which represents processing errors that have occurred within our data source. So, for Dataset Name, we will enter Processing Errors.
The Dataset ID will automatically populate when you type in the Dataset Name (you can edit it if you want to change it). For our object’s constraint, we’ll enter sourcetype=tm1* error. This constraint defines the events that will be reported on (all events that contain the term error and whose source type starts with tm1). After providing Constraints for the event-based object, you can click on Preview to test whether the constraints you’ve supplied return the kind of events that you want.
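If it helps to picture what the constraint does, here is a rough Python sketch of the filter it expresses. This is only an approximation under stated assumptions: the sample events are made up, and a plain substring check stands in for Splunk's real term matching.

```python
from fnmatch import fnmatchcase


def matches_constraint(event: dict) -> bool:
    """Rough stand-in for the constraint: sourcetype=tm1* error."""
    # The trailing * is a wildcard, so any source type starting with tm1 matches.
    sourcetype_ok = fnmatchcase(event.get("sourcetype", "").lower(), "tm1*")
    # Simplification: a case-insensitive substring test instead of Splunk's
    # token-based term matching.
    has_error = "error" in event.get("_raw", "").lower()
    return sourcetype_ok and has_error


# Hypothetical sample events (not taken from the article's data).
events = [
    {"sourcetype": "tm1server", "_raw": "12:01 ERROR dimension update failed"},
    {"sourcetype": "tm1server", "_raw": "12:02 INFO login ok"},
    {"sourcetype": "access_combined", "_raw": "GET /error.html 404"},
]

kept = [e for e in events if matches_constraint(e)]
for e in kept:
    print(e["_raw"])
```

Only the first sample event survives: the second has the right source type but no error term, and the third mentions error but comes from a different source type.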
The following screenshot depicts the preview of the constraints given in this example:
After reviewing the output, click on Save. The list of attributes for our root object is displayed: host, source, sourcetype, and _time. If you want to add child objects to client and server errors, you need to edit the attributes list to include additional attributes:
Editing fields (attributes)
Let’s add an auto-extracted attribute, as mentioned earlier in this chapter, to our data model. Remember, auto-extracted attributes are derived by Splunk at search time. To start, click on Add Field:
Next, select Auto-Extracted. The Add Auto-Extracted Field window opens:
You can scroll through the list of automatically extracted fields and check the fields that you want to include. Since my data model example deals with errors that occurred, I’ve selected date_mday, date_month, and date_year.
Notice that to the right of the field list, you have the opportunity to rename each of the fields that you selected and to set its type. Rename is self-explanatory, but for Type, Splunk allows you to select String, Number, Boolean, or IPv4 and indicate if the attribute is Required, Optional, Hidden, or Hidden & Required. Optional means that the attribute doesn’t have to appear in every event represented by the object. The attribute may appear in some of the object events and not others.
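Here is a small sketch of how you might think about the type and flag settings. The Field class and the passes_required helper are hypothetical names of my own, and the code assumes that a Required field excludes events that lack it, which is the counterpart of the Optional behavior described above rather than something spelled out in this article.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    type: str = "String"    # String, Number, Boolean, or IPv4
    flag: str = "Optional"  # Required, Optional, Hidden, or Hidden & Required


def passes_required(event: dict, fields: list) -> bool:
    """Assumption: an event is kept only if it has every Required field."""
    required = [f.name for f in fields if "Required" in f.flag]
    return all(name in event for name in required)


fields = [
    Field("date_year", type="Number", flag="Required"),
    Field("date_month"),  # Optional: may be missing from some events
    Field("date_mday", type="Number"),
]

# Hypothetical events for illustration.
with_year = {"date_year": 2018, "date_mday": 4}
without_year = {"date_month": "march", "date_mday": 4}

print(passes_required(with_year, fields))
print(passes_required(without_year, fields))
```

The first event is kept even though date_month is missing, because that field is Optional; the second is dropped because it lacks the Required date_year.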
Once you have reviewed your selected field types, click on Save:
Lookup attributes
Let’s discuss lookup attributes now. Splunk can use the existing lookup definitions to match the values of an attribute that you select to values of a field in the specified lookup table. It then returns the corresponding field/value combinations and applies them to your object as (lookup) attributes.
Once again, if you click on Add Field and select Lookup, Splunk opens the Add Fields with a Lookup page (shown in the following screenshot) where you can select from your currently defined lookup definitions. For this example, we select dnslookup:
The dnslookup converts clienthost to clientip. We can configure a lookup attribute using this lookup to add that result to the Processing Errors object.
Under Input, choose clienthost for Field in Lookup and then choose the corresponding Field in Dataset. Field in Lookup is the field to be used in the lookup table. Field in Dataset is the name of the field used in the event data. In our simple example, Splunk will match the lookup field clienthost with the dataset field host:
Under Output, I have selected host as the output field to be matched with the lookup. You can provide a Display Name for the selected field. This display name is the name used for the field in your events. I simply typed AviationLookupName for my display name (see the following screenshot):
Again, Splunk allows you to click on Preview to review the fields that you want to add. You can use the tabs to view the Events in a table, or view the values of each of the fields that you selected in Output. For example, the following screenshot shows the values of AviationLookupName:
Finally, we can click on Save:
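Conceptually, a lookup attribute works like the sketch below: take a field from the event, find the row in the lookup whose input field matches it, and add the chosen output value to the event under the display name. The apply_lookup function, the static table, and the sample values are all hypothetical. The table here only stands in for dnslookup, and clientip is used as the output purely for illustration.

```python
def apply_lookup(event, table, field_in_dataset, field_in_lookup,
                 output_field, display_name):
    """Return a copy of the event, enriched if a lookup row matches."""
    value = event.get(field_in_dataset)
    for row in table:
        if row.get(field_in_lookup) == value:
            # Add the output value under the chosen display name.
            return {**event, display_name: row[output_field]}
    return dict(event)  # no match: the event is left unchanged


# Hypothetical stand-in for the lookup table.
table = [
    {"clienthost": "web01.example.com", "clientip": "192.0.2.10"},
    {"clienthost": "web02.example.com", "clientip": "192.0.2.11"},
]

event = {"host": "web01.example.com", "_raw": "ERROR dimension update failed"}
enriched = apply_lookup(event, table,
                        field_in_dataset="host",
                        field_in_lookup="clienthost",
                        output_field="clientip",
                        display_name="AviationLookupName")
print(enriched["AviationLookupName"])
```

Events whose host has no matching row simply come back without the AviationLookupName field.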
Adding a child object to our model
We have just added a root (or parent) object to our data model. The next step is to add some children. Although a child object inherits all the constraints and attributes from its parent, when you create a child, you will give it additional constraints with the intention of further filtering the dataset that the object represents.
To add a child object to our data model, click on Add Dataset and select Child:
Splunk then opens the editor window, Add Child Dataset (shown in the following screenshot):
On this page, follow these steps:
- Enter the Object Name: Dimensional Errors.
- Leave the Object ID as it is: Dimensional_Errors.
- Under Inherit From, select Processing Errors. This means that this child object will inherit all the attributes from the parent object, Processing Errors.
- Add the Additional Constraints, dimension. This means that the data model’s search for the events in this object, when expanded, will look something like sourcetype=tm1* error dimension.
- Finally, click on Save to save your changes:
Following the previously outlined steps, you can add more objects, each continuing to filter the results until you have the results that you need.
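The way a child's constraints stack on top of its parent's can be sketched in a few lines. The Dataset class and its expanded_search method are hypothetical names used only for illustration; the sketch assumes the expanded search is simply the parent's search followed by the child's additional constraints, as in the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    name: str
    constraint: str
    parent: Optional["Dataset"] = None

    def expanded_search(self) -> str:
        """Parent constraints first, then this dataset's own constraint."""
        if self.parent is None:
            return self.constraint
        return f"{self.parent.expanded_search()} {self.constraint}"


root = Dataset("Processing Errors", "sourcetype=tm1* error")
child = Dataset("Dimensional Errors", "dimension", parent=root)

print(child.expanded_search())  # sourcetype=tm1* error dimension
```

Each further child adds its own terms to the end of the chain, which is why every level narrows the set of events a little more.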
With this, we learned how to create a data model, describe it with fields and lookup attributes, and add child objects to it. If you found this tutorial useful, do check out the book Implementing Splunk 7 – Third Edition and start transforming machine-generated data into valuable and actionable business insights.