Using the advanced host configuration

Opsview offers several advanced options for defining hosts, and we have already seen some of these, such as exceptions, the SNMP setup, and host attributes; however, there is more.

The first one is the Other Addresses field that we can use to assign multiple IP addresses or hostnames to our host, and the second is the parent field that allows us to set up a structure that we can use to detect network outages.

Monitoring multi-homed hosts

Without the Other Addresses field, we would have to create unique hosts for each interface that we would like to monitor (in some cases, this can be a plus). This means that we need to maintain different configurations for each IP.

If your hosts are set up in an environment with a separate management and production network for instance, using Other Addresses allows you to keep all the checks within a single host, giving you a single view of that host.

To use this, go to settings | Basic | Hosts, select the host you want to edit, and fill in any additional addresses that your host may have, as shown in the following screenshot (for clarity, we have used IPs, but using hostnames is perfectly fine):

To use these addresses in our service checks, Opsview (as with host attributes) makes them available through macros; for the first address, use the macro $ADDRESS1$, for the second use $ADDRESS2$, and so on.

Remember the service check that we discussed while looking at host attributes. Well, here’s the same service check, but now we want it to run against the third IP address. So, we input the correct macro and we are done.

check_mysql -H $ADDRESS3$ -u %MYSQLCREDENTIALS:1% -p %MYSQLCREDENTIALS:2% -d %MYSQLCREDENTIALS%
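The substitution described above can be pictured with a small Python sketch. It is only an illustration of the idea, not Opsview's implementation: the function name expand_address_macros and the sample addresses are hypothetical, and the sketch assumes that $ADDRESSn$ refers to the n-th entry of the Other Addresses field, counting from 1.

```python
import re


def expand_address_macros(command, other_addresses):
    """Replace each $ADDRESSn$ with the n-th (1-based) entry of other_addresses."""

    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(other_addresses):
            raise ValueError(f"no address configured for $ADDRESS{index}$")
        return other_addresses[index - 1]

    return re.sub(r"\$ADDRESS(\d+)\$", substitute, command)


# Hypothetical contents of the Other Addresses field.
addresses = ["10.0.0.5", "10.0.1.5", "10.0.2.5"]
expanded = expand_address_macros("check_mysql -H $ADDRESS3$", addresses)
print(expanded)
```

With three addresses configured, $ADDRESS3$ resolves to the last one; referring to an address that has not been filled in raises an error in this sketch rather than silently producing an empty argument.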

Using parenting for network outage detection

Let’s assume that our Opsview server is connected to switchA. RouterA is also connected to switchA. So, if switchA fails, RouterA becomes unreachable from the Opsview server’s point of view.

Unreachable simply means that there is another issue that prevents us from determining the real state of our host.

You can turn the notifications on and off for the Unreachable field in your Notification profile. Turning it off will prevent a lot of notifications during network outages.

To set up this relationship, we first need to edit the host switchA from settings | Basic | Hosts and define Opsview as its parent, as shown in the following screenshot:

Once submitted, we then edit the host, RouterA, and add switchA as its parent, as shown in the following screenshot:

After applying the changes through the settings menu, we can see our newly set up relations from the Monitoring | Status Detail | Network screen, as shown in the following screenshot:

Using parenting can really help while trying to figure out why a certain host (or group of hosts) has gone DOWN by letting Opsview detect if there is an issue with the network. It can also save you loads of notifications (which can be overwhelming if, for instance, a data center switch connecting hundreds of hosts fails).
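The difference between DOWN and UNREACHABLE can be sketched in a few lines of Python. This is a simplified model of the parent logic and not Opsview's actual code: the function classify is hypothetical, and it assumes that a failing host is reported as DOWN when at least one of its parents is UP, and as UNREACHABLE otherwise.

```python
def classify(host, parents, failed):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for host.

    parents: dict mapping a host to the list of its parents
    failed:  set of hosts that currently do not answer their checks
    """
    if host not in failed:
        return "UP"
    host_parents = parents.get(host, [])
    # A host without parents sits next to the monitoring server,
    # so nothing in between can hide its real state.
    if not host_parents:
        return "DOWN"
    # If any parent is UP, the path is fine and the host itself is DOWN.
    if any(classify(p, parents, failed) == "UP" for p in host_parents):
        return "DOWN"
    return "UNREACHABLE"


# The relationships from the example: Opsview -> switchA -> RouterA.
parents = {"switchA": ["opsview"], "RouterA": ["switchA"]}

# switchA fails, so RouterA stops answering as well.
failed = {"switchA", "RouterA"}
print(classify("switchA", parents, failed))  # the real problem
print(classify("RouterA", parents, failed))  # hidden behind switchA
```

In this model only switchA is reported as DOWN, which is why switching off notifications for the Unreachable state cuts the noise during an outage.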

The fastest and easiest way of determining the path from Opsview to any host is to simply run traceroute from the Opsview server and duplicate the path in Opsview.
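As a rough illustration, the hops reported by traceroute can be turned into parent relationships as follows. The function parents_from_path and the hop names are hypothetical, and the sketch assumes that the hops are listed in order from the Opsview server towards the target and that each hop already exists as a host in Opsview.

```python
def parents_from_path(hops, target, server="opsview"):
    """Map every host on the path to the hop that precedes it."""
    chain = [server] + list(hops) + [target]
    return {child: parent for parent, child in zip(chain, chain[1:])}


# Hypothetical traceroute result: two hops between Opsview and the target.
relations = parents_from_path(["gateway1", "core-router"], "webserver1")
for child, parent in relations.items():
    print(f"{child}: parent is {parent}")
```

Each entry in the result corresponds to one parent field that would have to be filled in on the host's configuration screen.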

Autodiscovery

The first feature we will look at is the autodiscovery tool that allows you to scan your network for hosts that could be monitored by Opsview.

There are two types of scans available from the feature: a basic network scan and a special VMware scan (more scan types may be added in the future).

Firewalls

If your network is protected (and divided) by firewalls, you might have trouble running scans (or get fewer results than you expected); this is caused by firewalls detecting the scan and marking it as unwanted traffic.

If you experience issues, make sure that you check the firewall and verify that it is allowing traffic from the Opsview server.

Network scan

To start a scan, navigate to settings | Basic | Auto-Discovery and click on the Network Scan button to bring up the configuration menu for our new scan.

There are two sections that can be used to perform our scan. The first section covers the basics such as a label for our scan, the IP addresses (you can use network statements such as 192.168.1.0/24), and a number of default settings, as shown in the following screenshot:
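To get a feel for how many addresses a network statement covers before starting a scan, Python's standard ipaddress module can expand it. This is only a sketch for estimating the size of a scan; whether Opsview itself skips the network and broadcast addresses is not stated here, so treat the count as an approximation.

```python
import ipaddress

# The network statement used as an example above.
network = ipaddress.ip_network("192.168.1.0/24")

# hosts() leaves out the network and broadcast addresses.
hosts = list(network.hosts())
print(f"{len(hosts)} addresses, from {hosts[0]} to {hosts[-1]}")
```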

The second section (Detection Mapping) covers the detection mechanisms which include detection of common network services (such as SMTP, WWW, and so on), agent-based detection, and agentless detection using WMI, SNMP, and VMware vSphere host detection, as shown in the following screenshot:

Now, click on Save and you will see that a new job is created using our settings, and by clicking on our job, we can start the scan, edit, clone, or delete it.

Once you click on Save, an estimated time for completion is shown; it depends on the number of IPs to scan and the number of services enabled in the detection section.

Once our job has finished, we can view the results by double-clicking on our job name, and we will be presented with a list of detected hosts and services.

You can edit each host or edit in bulk and then import them into Opsview by selecting the hosts and clicking on Import into Opsview. You can then see your new hosts on the settings | Basic | Hosts screen.

VMware scan

A VMware scan is a quick way of finding VM guests running in your virtual environment and importing them into Opsview.

Before we can run a VMware scan, there are some requirements we need to look at.

First, we must have the VMware vSphere SDK for Perl installed on our Opsview server (which you can download from the VMware website: https://my.vmware.com/web/vmware/details?downloadGroup=SDKPERL550&productId=353).

Next, each VM Guest should be running the VMware tools as we will use these to gather IP information about our guests from our VMware vSphere Hosts.

And finally, we need our VMware hosts, which we can find by running a network scan with the detection mapping for VMware enabled.

The VMware mapping is based on using the vSphere SDK to communicate with the VMware API; so make sure that the Opsview server can communicate with your vSphere servers using HTTPS.

Shown in the following screenshot is an example of a detected VMware vSphere host:

Click on VMware, enter the VMware credentials for the host, save it, and click on VMware scan from the Scan management tab to start a scan for VMs running on this host (or hosts if multiple vSphere hosts were detected).

This uses the same principles as the VMware detection mapping and uses the VMware API to communicate with vSphere.

Additionally, Opsview will automatically set the parent of the detected guests as the VMware host; so, you do not have to manually add the parent.

Once completed, select and import the VMs after which you can view the new hosts from the settings | Basic | Hosts list.

SNMP traps

Most network devices are capable of sending out SNMP traps when certain events occur (this is not restricted to network equipment though).

Using the SNMP trap receiver in Opsview allows you to catch these events and process them in Opsview.

Please note that setting up and using SNMP traps is a complex task, and being familiar with the command line and SNMP tools in general is highly recommended.

Traps received by Opsview are evaluated based on the originating host, and if this host has an SNMP trap service check assigned to it, it will be evaluated based on the rules of the service check (a host can have multiple SNMP trap service checks, and each service check can have multiple rules).

To use SNMP traps in Opsview, we will need to configure our system so that any incoming traps are forwarded to Opsview.

Configuration

Depending on your operating system, you will need to install the SNMPD packages that are required for SNMP traps.

On Debian/Ubuntu-based systems, run apt-get install snmpd to install the required packages. Once installed, we need to configure the following items so that we can use device-specific MIBs.

Add the following line to your snmp.conf file (file and location might vary depending on the operating system) to add the /usr/local/nagios/snmp/load directory (this is the directory where we can add device-specific MIBs to be used with Opsview).

mibdirs +/usr/local/nagios/snmp/load

Next, we need to set up the trap receiver by adding these lines to the /etc/default/snmpd configuration file:

TRAPDRUN=yes
TRAPDOPTS='-t -m ALL -M /usr/share/snmp/mibs:/usr/local/nagios/snmp/load -p /var/run/snmptrapd.pid'
SNMPDOPTS='-u nagios -Lsd -Lf /dev/null -p/var/run/snmpd.pid'

Now, we need to configure the SNMP trap daemon to forward any events received to Opsview; for this, we edit the snmptrapd.conf configuration file and add:

traphandle default /usr/local/nagios/bin/snmptrap2nagios

And finally, we need to allow the user running Opsview (the nagios user) to be able to restart SNMP and the SNMP trap daemons if we load new MIBs and so on by adding the following line to our sudoers file using the visudo command:

nagios ALL=NOPASSWD: /usr/local/nagios/bin/snmpd reload

Once completed, we can now start using the SNMP traps in Opsview.

SNMP trap service check

Opsview Pro comes with a number of predefined SNMP trap service checks and host templates that cover the basics, but you can create your own checks with your own rules, as we will see here.

Once configured, we can start creating SNMP trap service checks; only now we will have some additional options, as shown in the following screenshot:

By clicking on the Edit rules link at the bottom of the screen, we can edit any rules assigned to this check.

The rules form the basis of the SNMP trap system, and you can create multiple rules within one service check (the processing will stop when a rule that is considered TRUE is found and any subsequent rules are then skipped).
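The first-match behavior described above can be sketched as follows. The rule names, conditions, and resulting states are made up for illustration and are not taken from a real Opsview service check; only the stop-at-the-first-TRUE-rule logic comes from the text.

```python
def first_matching_rule(rules, trap):
    """Return (name, state) of the first rule whose condition is true, else None."""
    for name, condition, state in rules:
        if condition(trap):
            return name, state  # processing stops here; later rules are skipped
    return None


# Hypothetical rule set, evaluated from top to bottom.
rules = [
    ("link down", lambda t: t["TRAPNAME"] == "IF-MIB::linkDown", "CRITICAL"),
    ("link up", lambda t: t["TRAPNAME"] == "IF-MIB::linkUp", "OK"),
    ("catch-all", lambda t: True, "WARNING"),
]

result = first_matching_rule(rules, {"TRAPNAME": "IF-MIB::linkUp"})
print(result)
```

Because evaluation stops at the first hit, the order of the rules matters: a catch-all rule placed at the top would shadow every rule below it.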

The following screenshot shows an example of a group of rules that will be checked against (taken from the SNMP Trap – Link State service check):

Exceptions

If for some reason the devices are sending traps that Opsview can’t process (due to lack of rules or missing MIBs), they will end up in the SNMP Trap Exceptions section located at settings | Advanced | SNMP Traps.

From this menu, you can view traps that have failed to match and view debug information that can be used to fix any rules or help in creating rules.

Rules

Making rules for SNMP traps works by using lines, values, and tags to perform matching. So let’s have a look at an example SNMP trap to see how we can create rules.

Here’s an example trap from a Cisco network device. This information will be similar to what you will find in the SNMP Trap Exceptions section. For clarity, we have added line numbers.

1: cisco2611.lon.altinity
2: 192.168.10.20
3: SNMPv2-MIB::sysUpTime.0 9:16:47:53.80
4: SNMPv2-MIB::snmpTrapOID.0 IF-MIB::linkUp
5: IF-MIB::ifIndex.2 2
6: IF-MIB::ifDescr.2 Serial0/0
7: IF-MIB::ifType.2 ppp
8: SNMPv2-SMI::enterprises.9.2.2.1.1.20.2 "PPP LCP Open"
9: SNMP-COMMUNITY-MIB::snmpTrapAddress.0 192.168.10.20
10: SNMP-COMMUNITY-MIB::snmpTrapCommunity.0 "public"
11: SNMPv2-MIB::snmpTrapEnterprise.0 SNMPv2-SMI::enterprises.9.1.186

The first two lines are host information and used to map the trap to our host; we can use the remaining lines in our rules using the following macros.

${TRAPNAME} will map to the value of snmpTrapOID.0 on the fourth line, which in this case is IF-MIB::linkUp.

${Px} will map to the parameter on line x; so, for instance, ${P7} will map to the parameter (ifType) on the seventh line (any trailing numbers such as .2 are ignored).

${Vx} will map to the value on line x, so ${V6} will map to Serial0/0.

You can also directly call the value of an OID by placing it in your rule; so, for instance, ${SNMP-COMMUNITY-MIB::snmpTrapCommunity} will map to public.
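To make the mapping concrete, here is a Python sketch that derives these macros from the example trap. It is an illustration under assumptions, not the parser Opsview uses: the function build_macros is hypothetical, it treats everything before the first space as the OID, and it drops a single trailing numeric index (such as .2) from the parameter name.

```python
import re

# The example trap, one entry per line (line 1 is the first entry).
TRAP_LINES = [
    "cisco2611.lon.altinity",
    "192.168.10.20",
    "SNMPv2-MIB::sysUpTime.0 9:16:47:53.80",
    "SNMPv2-MIB::snmpTrapOID.0 IF-MIB::linkUp",
    "IF-MIB::ifIndex.2 2",
    "IF-MIB::ifDescr.2 Serial0/0",
    "IF-MIB::ifType.2 ppp",
    'SNMPv2-SMI::enterprises.9.2.2.1.1.20.2 "PPP LCP Open"',
    "SNMP-COMMUNITY-MIB::snmpTrapAddress.0 192.168.10.20",
    'SNMP-COMMUNITY-MIB::snmpTrapCommunity.0 "public"',
    "SNMPv2-MIB::snmpTrapEnterprise.0 SNMPv2-SMI::enterprises.9.1.186",
]


def build_macros(lines):
    """Build a dict holding Px, Vx, OID-name and TRAPNAME entries."""
    macros = {}
    for number, line in enumerate(lines, start=1):
        if number <= 2:
            continue  # the first two lines only identify the host
        oid, _, value = line.partition(" ")
        value = value.strip('"')
        # Assumption: only one trailing index such as ".2" is removed.
        oid_name = re.sub(r"\.\d+$", "", oid)
        parameter = oid_name.split("::", 1)[-1]
        macros[f"P{number}"] = parameter
        macros[f"V{number}"] = value
        macros[oid_name] = value
        if parameter == "snmpTrapOID":
            macros["TRAPNAME"] = value
    return macros


macros = build_macros(TRAP_LINES)
for name in ("TRAPNAME", "P7", "V6", "SNMP-COMMUNITY-MIB::snmpTrapCommunity"):
    print(name, "=", macros[name])
```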

Matching

So now that we have seen how we can take the information in our traps and use them, we need to look at how we can match against them.

For this, we can use simple Perl matching mechanisms such as eq or =~ (to do pattern matching).

So, a comparison to check whether the SNMP trap community string is set to public could be expressed in either of these forms:

"${V10}" eq "public"
"${SNMP-COMMUNITY-MIB::snmpTrapCommunity}" eq "public"

Both statements are equivalent, but the first one is less readable, which may be a drawback if you plan to write a lot of rules and want to see quickly what each rule does.

We can also combine matches in a single line, for instance, if we wish to check if the community is set to public and the interface reported is of a specific type.

"${V10}" eq "public" && "${V7}" eq "ppp"

Using the AND (&&) and OR (||) operators, we can create all kinds of combinations allowing for a finely tuned rule set.
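For readers who are less familiar with Perl, the same comparisons can be written in Python, which may help when reasoning about a rule before entering it. This is only a side-by-side sketch: Opsview rules themselves use the Perl syntax, and the second and third expressions are invented examples that do not appear in any shipped rule set.

```python
import re

# Values from lines 6, 7 and 10 of the example trap.
values = {"V6": "Serial0/0", "V7": "ppp", "V10": "public"}

# Perl: "${V10}" eq "public" && "${V7}" eq "ppp"
community_and_type = values["V10"] == "public" and values["V7"] == "ppp"

# Perl: "${V7}" eq "ppp" || "${V7}" eq "ethernetCsmacd"   (illustrative rule)
ppp_or_ethernet = values["V7"] == "ppp" or values["V7"] == "ethernetCsmacd"

# Perl: "${V6}" =~ /^Serial/                               (illustrative rule)
serial_interface = re.search(r"^Serial", values["V6"]) is not None

print(community_and_type, ppp_or_ethernet, serial_interface)
```

Here eq corresponds to ==, =~ to a regular expression search, and && and || to and and or.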

Creating rule sets is best done by making good use of the exceptions as they contain the trap in exactly the form in which it will be processed (line, parameters, and so on). So, if you are planning on adding a new type of device and you need to develop new rules, make good use of the various screens.

Summary

This article covers a lot of exciting features that Opsview offers its users to help them get the most out of monitoring. We addressed some of the challenges that we face today in increasingly complex multi-homed environments and how we can resolve them using Opsview. We also learned to use the advanced modules that come with Opsview Pro, such as autodiscovery for rapid deployments. Using autodiscovery, we can quickly scan our network (or VMware hosts) for new and unmonitored devices; using SNMP traps, we can have our devices inform us in case of trouble.
