Splunk provides binary distributions for Windows and a variety of Unix operating systems. For all Unix operating systems, a compressed .tar file is provided. For some platforms, packages are also provided.
This article is an excerpt taken from the book Implementing Splunk 7 – Third Edition written by James Miller. This book covers the new modules of Splunk: Splunk Cloud and the Machine Learning Toolkit to ease data usage and more.
In this tutorial, you will learn how to deploy the Splunk binary on your systems and how to set up configuration distribution in Splunk.
If your organization uses packages, such as deb or rpm, you should be able to use the provided packages in your normal deployment process. Otherwise, installation starts by unpacking the provided tar to the location of your choice.
The process is the same, whether you are installing the full version of Splunk or the Splunk universal forwarder.
The typical installation process involves the following steps:
- Installing the binary
- Adding a base configuration
- Configuring Splunk to launch at boot
- Restarting Splunk
Having worked with many different companies over the years, I can honestly say that none of them used the same product or even methodology for deploying software. Splunk takes a hands-off approach to fit in as easily as possible into customer workflows.
Deploying from a tar file
To deploy from a tar file, the command depends on your version of tar. With a modern version of tar, you can run the following command:
tar xvzf splunk-7.0.x-xxx-Linux-xxx.tgz
Older versions may not handle gzip files directly, so you may have to run the following command:
gunzip -c splunk-7.0.x-xxx-Linux-xxx.tgz | tar xvf -
This will expand into the current directory. To expand into a specific directory, you can usually add -C, depending on your version of tar, as follows:
tar -C /opt/ -xvzf splunk-7.0.x-xxx-Linux-xxx.tgz
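The two variations above can be folded into one helper that works whether or not your tar supports -z or -C. This is only a sketch: unpack_splunk is a hypothetical function name, and /opt is an assumed default target.

```shell
# Sketch: unpack a Splunk tarball into a target directory.
# unpack_splunk is a hypothetical helper, not something Splunk ships.
unpack_splunk() {
    tarball=$1
    dest=${2:-/opt}          # assumed default; pass your own location

    mkdir -p "$dest" || return 1

    # gunzip streams the archive to tar, so this works even when tar
    # lacks -z; the subshell cd avoids relying on -C.
    gunzip -c "$tarball" | (cd "$dest" && tar xf -)
}
```

For example, unpack_splunk splunk-7.0.x-xxx-Linux-xxx.tgz /opt would leave the installation in /opt/splunk.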
Deploying using msiexec
On Windows, it is possible to deploy Splunk using msiexec. This makes it much easier to automate deployment on a large number of machines. To install silently, you can use the combination of AGREETOLICENSE and /quiet, as follows:
msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes /quiet
If you plan to use a deployment server, you can specify the following value:
msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes DEPLOYMENT_SERVER="deployment_server_name:8089" /quiet
Or, if you plan to overlay an app that contains deploymentclient.conf, you can forgo starting Splunk until that app has been copied into place, as follows:
msiexec.exe /i splunk-xxx.msi AGREETOLICENSE=Yes LAUNCHSPLUNK=0 /quiet
There are options available to start reading data immediately, but I would advise deploying input configurations to your servers, instead of enabling inputs via installation arguments.
Adding a base configuration
If you are using the Splunk deployment server, this is the time to set up deploymentclient.conf. This can be accomplished in several ways, as follows:
- On the command line, by running the following code:
$SPLUNK_HOME/bin/splunk set deploy-poll deployment_server_name:8089
- By placing a deploymentclient.conf in:
- By placing an app containing deploymentclient.conf in:
The third option is what I would recommend because it allows overriding this configuration, via a deployment server, at a later time. We will work through an example later in the Using Splunk deployment server section.
If you are deploying configurations in some other way, for instance with Puppet, be sure to restart the Splunk forwarder processes after deploying the new configuration.
Configuring Splunk to launch at boot
On Windows machines, Splunk is installed as a service that will start after installation and on reboot.
On Unix hosts, the Splunk command line provides a way to create startup scripts appropriate for the operating system that you are using. The command looks like this:
$SPLUNK_HOME/bin/splunk enable boot-start
To run Splunk as another user, provide the flag -user, as follows:
$SPLUNK_HOME/bin/splunk enable boot-start -user splunkuser
The startup command must still be run as root, but the startup script will be modified to run as the user provided.
If you do not run Splunk as root, and you shouldn’t if you can avoid it, be sure that the Splunk installation and data directories are owned by the user specified in the enable boot-start command. You can ensure this by using chown, such as in chown -R splunkuser $SPLUNK_HOME
On Linux, you could then start Splunk using service splunk start.
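The ownership change and the boot-start command are usually run together, so a small wrapper can keep them in step. The following is a sketch under stated assumptions: run and enable_boot_start are hypothetical helper names, DRY_RUN is an illustrative variable, and /opt/splunk is an assumed default for SPLUNK_HOME.

```shell
# run: execute a command, or only print it when DRY_RUN=1.
# (Hypothetical helper so the steps can be reviewed before running as root.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# enable_boot_start USER: hand the installation to USER, then create the
# startup script so that Splunk runs as that user. Must be run as root.
enable_boot_start() {
    user=$1
    home=${SPLUNK_HOME:-/opt/splunk}   # assumed default location

    run chown -R "$user" "$home" &&
    run "$home/bin/splunk" enable boot-start -user "$user"
}
```

Calling enable_boot_start splunkuser with DRY_RUN=1 set prints the two commands without executing them, which is a cheap way to check the paths first.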
Configuration distribution in Splunk
As we have covered in some depth, configurations in Splunk are simply directories of plain text files. Distribution essentially consists of copying these configurations to the appropriate machines and restarting the instances. You can either use your own system for distribution, such as Puppet or simply a set of scripts, or use the deployment server included with Splunk.
Using your own deployment system
The advantage of using your own system is that you already know how to use it.
Assuming that you have normalized your apps, as described in the section Using apps to organize configuration, deploying apps to a forwarder or indexer consists of the following steps:
- Set aside the existing apps at $SPLUNK_HOME/etc/apps/.
- Copy the apps into $SPLUNK_HOME/etc/apps/.
- Restart the Splunk forwarder. Note that this needs to be done as the user that is running Splunk, either by calling the service script or calling su. On Windows, restart the splunkd service.
Assuming that you already have a system for managing configurations, that’s it.
If you are deploying configurations to indexers, be sure to only deploy the configurations when downtime is acceptable, as you will need to restart the indexers to load the new configurations, ideally in a rolling manner.
Do not deploy configurations until you are ready to restart, as some (but not all) configurations will take effect immediately.
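If your own system is simply a set of scripts, the three steps above might look like the following sketch. The names deploy_apps, BACKUP_ROOT, and SKIP_RESTART are hypothetical, /opt/splunk and /var/tmp are assumed defaults, and "set aside" is interpreted here as taking a timestamped copy rather than moving the directory away.

```shell
# deploy_apps SRC_DIR: copy every app under SRC_DIR into
# $SPLUNK_HOME/etc/apps/ and restart Splunk.
# Run this as the user that runs Splunk.
deploy_apps() {
    src=$1
    home=${SPLUNK_HOME:-/opt/splunk}
    apps="$home/etc/apps"
    backup="${BACKUP_ROOT:-/var/tmp}/splunk-apps.$(date +%Y%m%d%H%M%S)"

    # 1. Set aside the existing apps (here: a timestamped copy).
    cp -Rp "$apps" "$backup" || return 1

    # 2. Copy the new apps into place, preserving permissions.
    cp -Rp "$src"/. "$apps"/ || return 1

    # 3. Restart Splunk so that the new configuration is loaded.
    if [ "${SKIP_RESTART:-0}" != "1" ]; then
        "$home/bin/splunk" restart
    fi
}
```

Setting SKIP_RESTART=1 lets you stage the files and postpone the restart, although, as noted above, some configurations will still take effect immediately.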
Using the Splunk deployment server
If you do not have a system for managing configurations, you can use the deployment server included with Splunk.
Some advantages of the included deployment server are as follows:
- Everything you need is included in your Splunk installation
- It will restart forwarder instances properly when new app versions are deployed
- It is intelligent enough not to restart when unnecessary
- It will remove apps that should no longer be installed on a machine
- It will ignore apps that are not managed
- The logs for the deployment client and server are accessible in Splunk itself
Some disadvantages of the included deployment server are:
- As of Splunk 4.3, there are issues with scale beyond a few hundred deployment clients, at which point tuning is required (although one option is to run multiple deployment server instances)
- The configuration is complicated and prone to typos
With these caveats out of the way, let’s set up a deployment server for the apps that we laid out before.
Step 1 – deciding where your deployment server will run
For a small installation with fewer than a few dozen forwarders, your main Splunk instance can run the deployment server without any issue. For more than a few dozen forwarders, a separate instance of Splunk makes sense.
Ideally, this instance would run on its own machine. The requirements for this machine are not large, perhaps 4 gigabytes of RAM and two processors, or possibly less. A virtual machine would be fine.
Define a DNS entry for your deployment server, if at all possible. This will make moving your deployment server later much simpler.
If you do not have access to another machine, you could run another copy of Splunk on the same machine that is running some other part of your Splunk deployment. To accomplish this, follow these steps:
- Install Splunk in another directory, perhaps /opt/splunk-deploy/splunk/.
- Start this instance of Splunk by using /opt/splunk-deploy/splunk/bin/splunk start. When prompted, choose port numbers other than the defaults and note what they are. I would suggest one number higher than each default: 8090 and 8001.
- Unfortunately, if you run splunk enable boot-start in this new instance, the existing startup script will be overwritten. To accommodate both instances, you will need to either edit the existing startup script, or rename the existing script so that it is not overwritten.
Step 2 – defining your deploymentclient.conf configuration
Using the address of our new deployment server, ideally a DNS entry, we will build an app named deploymentclient-yourcompanyname. This app will have to be installed manually on forwarders but can then be managed by the deployment server.
This app should look somewhat like this:
deploymentclient-yourcompanyname
    local/deploymentclient.conf

[deployment-client]

[target-broker:deploymentServer]
targetUri=deploymentserver.foo.com:8089
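Since this app has to be placed on every forwarder by hand, it can help to generate it from a script. The sketch below writes the same two stanzas shown above; make_deploymentclient_app is a hypothetical helper name, and the target address is whatever your deployment server's DNS entry and management port happen to be.

```shell
# make_deploymentclient_app APP_DIR TARGET
# Creates APP_DIR/local/deploymentclient.conf pointing at TARGET (host:port).
make_deploymentclient_app() {
    app_dir=$1
    target=$2

    mkdir -p "$app_dir/local" || return 1

    cat > "$app_dir/local/deploymentclient.conf" <<EOF
[deployment-client]

[target-broker:deploymentServer]
targetUri=$target
EOF
}
```

For example, make_deploymentclient_app ./deploymentclient-yourcompanyname deploymentserver.foo.com:8089 produces a directory that can be copied into $SPLUNK_HOME/etc/apps/ on each forwarder.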
Step 3 – defining our machine types and locations
Starting with what we defined in the Separate configurations by purpose section, we have, in the locations west and east, the following machine types:
- Splunk indexers
- db servers
- Web servers
- App servers
Step 4 – normalizing our configurations into apps appropriately
Let’s use the apps that we defined in the section Separate configurations by purpose plus the deployment client app that we created in the Step 2 – defining your deploymentclient.conf configuration section. These apps will live in $SPLUNK_HOME/etc/deployment-apps/ on your deployment server.
Step 5 – mapping these apps to deployment clients in serverclass.conf
To get started, I always start with example 2 from $SPLUNK_HOME/etc/system/README/serverclass.conf.example:
[global]

[serverClass:AppsForOps]
whitelist.0=*.ops.yourcompany.com

[serverClass:AppsForOps:app:unix]

[serverClass:AppsForOps:app:SplunkLightForwarder]
Let’s assume that we have the machines mentioned next. It is very rare for an organization of any size to have consistently named hosts, so I threw in a couple of rogue hosts at the bottom, as follows:
spl-idx-west01
spl-idx-west02
spl-idx-east01
spl-idx-east02
app-east01
app-east02
app-west01
app-west02
web-east01
web-east02
web-west01
web-west02
db-east01
db-east02
db-west01
db-west02
qa01
homer-simpson
The structure of serverclass.conf is essentially as follows:
[serverClass:<className>]
#options that should be applied to all apps in this class
[serverClass:<className>:app:<appName>]
#options that should be applied only to this app in this serverclass
Please note that:
- <className> is an arbitrary name of your choosing.
- <appName> is the name of a directory in $SPLUNK_HOME/etc/deployment-apps/.
- The order of stanzas does not matter. Be sure to update <className> if you copy an :app: stanza. This is, by far, the easiest mistake to make.
It is important that configuration changes do not trigger a restart of indexers.
Let’s apply this to our hosts, as follows:
[global]
restartSplunkd = True
#by default trigger a splunk restart on configuration change

####INDEXERS
##handle indexers specially, making sure they do not restart
[serverClass:indexers]
whitelist.0=spl-idx-*
restartSplunkd = False
[serverClass:indexers:app:indexerbase]
[serverClass:indexers:app:deploymentclient-yourcompanyname]
[serverClass:indexers:app:props-web]
[serverClass:indexers:app:props-app]
[serverClass:indexers:app:props-db]

#send props-west only to west indexers
[serverClass:indexers-west]
whitelist.0=spl-idx-west*
restartSplunkd = False
[serverClass:indexers-west:app:props-west]

#send props-east only to east indexers
[serverClass:indexers-east]
whitelist.0=spl-idx-east*
restartSplunkd = False
[serverClass:indexers-east:app:props-east]

####FORWARDERS
#send event parsing props apps everywhere
#blacklist indexers to prevent unintended restart
[serverClass:props]
whitelist.0=*
blacklist.0=spl-idx-*
[serverClass:props:app:props-web]
[serverClass:props:app:props-app]
[serverClass:props:app:props-db]

#send props-west only to west datacenter servers
#blacklist indexers to prevent unintended restart
[serverClass:west]
whitelist.0=*-west*
whitelist.1=qa01
blacklist.0=spl-idx-*
[serverClass:west:app:props-west]
[serverClass:west:app:deploymentclient-yourcompanyname]

#send props-east only to east datacenter servers
#blacklist indexers to prevent unintended restart
[serverClass:east]
whitelist.0=*-east*
whitelist.1=homer-simpson
blacklist.0=spl-idx-*
[serverClass:east:app:props-east]
[serverClass:east:app:deploymentclient-yourcompanyname]

#define our appserver inputs
[serverClass:appservers]
whitelist.0=app-*
whitelist.1=qa01
whitelist.2=homer-simpson
[serverClass:appservers:app:inputs-app]

#define our webserver inputs
[serverClass:webservers]
whitelist.0=web-*
whitelist.1=qa01
whitelist.2=homer-simpson
[serverClass:webservers:app:inputs-web]

#define our dbserver inputs
[serverClass:dbservers]
whitelist.0=db-*
whitelist.1=qa01
[serverClass:dbservers:app:inputs-db]

#define our west coast forwarders
[serverClass:fwd-west]
whitelist.0=app-west*
whitelist.1=web-west*
whitelist.2=db-west*
whitelist.3=qa01
[serverClass:fwd-west:app:outputs-west]

#define our east coast forwarders
[serverClass:fwd-east]
whitelist.0=app-east*
whitelist.1=web-east*
whitelist.2=db-east*
whitelist.3=homer-simpson
[serverClass:fwd-east:app:outputs-east]
You should organize the patterns and classes in a way that makes sense to your organization and data centers, but I would encourage you to keep it as simple as possible. I would strongly suggest opting for more lines than more complicated logic.
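Before rolling out a file like this, it can be useful to reason about which hosts a class will select. The sketch below approximates the matching with shell glob patterns, with a blacklist match always excluding the host. It is only an approximation of how the deployment server evaluates patterns, and in_class is a hypothetical helper name.

```shell
# in_class HOST WHITELIST_PATTERN... -- BLACKLIST_PATTERN...
# Succeeds when HOST matches at least one whitelist pattern and no
# blacklist pattern. Quote the patterns so the shell does not expand them.
in_class() {
    host=$1
    shift
    listed=1          # 1 = not whitelisted yet (shell "false")
    mode=white
    for pat in "$@"; do
        if [ "$pat" = "--" ]; then
            mode=black
            continue
        fi
        # An unquoted $pat on the pattern side of case is treated as a glob.
        case $host in
            $pat)
                if [ "$mode" = black ]; then
                    return 1      # a blacklist match excludes the host
                fi
                listed=0
                ;;
        esac
    done
    return $listed
}
```

Using the patterns of the west class from the listing above, in_class web-west01 '*-west*' qa01 -- 'spl-idx-*' succeeds, while the same call for spl-idx-west01 fails because the blacklist pattern matches.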
A few more things to note about the format of serverclass.conf:
- The number following whitelist and blacklist must be sequential, starting with zero. For instance, in the following example, whitelist.3 will not be processed, since whitelist.2 is commented:
[serverClass:foo]
whitelist.0=a*
whitelist.1=b*
# whitelist.2=c*
whitelist.3=d*
- whitelist.x and blacklist.x are tested against these values in the following order:
- clientName as defined in deploymentclient.conf: This is not commonly used but is useful when running multiple Splunk instances on the same machine, or when the DNS is completely unreliable.
- IP address: There is no CIDR matching, but you can use string patterns.
- Reverse DNS: This is the value returned by the DNS for an IP address.
If your reverse DNS is not up to date, this can cause you problems, as this value is tested before the value of hostname, as provided by the host itself. If you suspect this, try running ping against the machine's IP address, or something similar, to see what the DNS is reporting.
- Hostname as provided by forwarder: This is always tested after reverse DNS, so be sure your reverse DNS is up to date.
- When copying :app: lines, be very careful to update the class name appropriately! This really is the most common mistake made in serverclass.conf.
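The sequential-numbering rule in the first note is easy to break when commenting lines out, and it can be checked mechanically. The following sketch uses awk to flag any whitelist.N or blacklist.N entry that does not follow on from the previous one within its stanza. check_list_numbering is a hypothetical helper name, and the check only covers lines that begin with whitelist. or blacklist. in the first column.

```shell
# check_list_numbering FILE: report whitelist.N / blacklist.N entries that
# are out of sequence within a stanza. Exits non-zero if any are found.
check_list_numbering() {
    awk '
        /^[ \t]*#/ { next }                 # ignore commented-out lines
        /^\[/      { stanza = $0; next }    # remember the current stanza
        match($0, /^(whitelist|blacklist)\.[0-9]+/) {
            key = substr($0, 1, RLENGTH)    # for example "whitelist.3"
            split(key, parts, ".")
            kind = stanza " " parts[1]      # counter per stanza and list type
            n = parts[2] + 0
            if (n != expected[kind] + 0) {
                printf "%s: %s is out of sequence (expected %s.%d)\n",
                       stanza, key, parts[1], expected[kind]
                bad = 1
            } else {
                expected[kind] = n + 1
            }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Run against the [serverClass:foo] example above, it reports that whitelist.3 is out of sequence and that whitelist.2 was expected.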
Step 6 – restarting the deployment server
If serverclass.conf did not previously exist, a restart of the Splunk instance that is running the deployment server is required to activate the deployment server. After the deployment server is loaded, you can use the following command:
$SPLUNK_HOME/bin/splunk reload deploy-server
This command should be enough to pick up any changes in serverclass.conf and in etc/deployment-apps.
Step 7 – installing deploymentclient.conf
Now that we have a running deployment server, we need to set up the clients to call home. On each machine that will be running the deployment client, the procedure is essentially as follows:
- Copy the deploymentclient-yourcompanyname app to $SPLUNK_HOME/etc/apps/
- Restart Splunk
If everything is configured correctly, you should see the appropriate apps appear in $SPLUNK_HOME/etc/apps/, within a few minutes. To see what is happening, look at the log $SPLUNK_HOME/var/log/splunk/splunkd.log.
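Rather than reading the whole log, you can filter it for deployment-related messages. This is a minimal sketch: show_deploy_log is a hypothetical helper name, /opt/splunk is an assumed default for SPLUNK_HOME, and the search string deploy is an assumption about how the relevant messages are labelled, so adjust it to what your log actually contains.

```shell
# show_deploy_log [N]: print the last N (default 20) lines of splunkd.log
# that mention "deploy", ignoring case.
show_deploy_log() {
    count=${1:-20}
    log="${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/splunkd.log"

    grep -i 'deploy' "$log" | tail -n "$count"
}
```

Run it on the forwarder a few minutes after the restart; an empty result suggests that the client has not yet tried to contact the deployment server.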
If you have problems, enable debugging on either the client or the server by editing $SPLUNK_HOME/etc/log.cfg, followed by a restart. Look for the following lines:
Once found, change them to the following lines and restart Splunk:
After restarting Splunk, you will see the complete conversation in $SPLUNK_HOME/var/log/splunk/splunkd.log. Be sure to change the setting back once you no longer need the verbose logging!
To summarize, we learned how to deploy a binary and set up configuration distribution in Splunk. If you’ve enjoyed this excerpt, head over to the book, Implementing Splunk 7 – Third Edition to learn how to use the Machine Learning Toolkit and best practices and tips to help you implement Splunk services effectively and efficiently.