
In this article by Andrea Dalle Vacche and Stefano Kewan Lee, authors of Zabbix Network Monitoring Essentials, we will learn the different possibilities Zabbix offers to the enterprising network administrator.

There are certainly many advantages in using Zabbix’s own agents and protocol when it comes to monitoring Windows and Unix operating systems or the applications that run on them. However, when it comes to network monitoring, the vast majority of monitored objects are network appliances of various kinds, where it’s often impossible to install and run a dedicated agent of any type. This by no means implies that you’ll be unable to fully leverage Zabbix’s power to monitor your network. Whether it’s a simple ICMP echo request, an SNMP query, an SNMP trap, netflow logging, or a custom script, there are many ways to extract meaningful data from your network. This section will show you how to set up these different methods of gathering data, and give you a few examples of how to use them.


Simple checks

An interesting use case is using one or more net.tcp.service items to make sure that some services are not running on a given interface. Take for example, the case of a border router or firewall. Unless you have some very special and specific needs, you’ll typically want to make sure that no admin consoles are available on the external interfaces. You might have double-checked the appliance’s initial configuration, but a system update, a careless admin, or a security bug might change the aforesaid configuration and open your appliance’s admin interfaces to a far wider audience than intended. A security breach like this one could pass unobserved for a long time unless you configure a few simple TCP/IP checks on your appliance’s external interfaces and then set up some triggers that will report a problem if those checks report an open and responsive port.

Let’s take the example of the router with two production interfaces and a management interface shown in the section about host interfaces. If the router’s HTTPS admin console is available on TCP port 8000, you’ll want to configure a simple check item for every interface:

| Item name | Item key |
|---|---|
| management_https_console | net.tcp.service[https,192.168.1.254,8000] |
| zoneA_https_console | net.tcp.service[https,10.10.1.254,8000] |
| zoneB_https_console | net.tcp.service[https,172.16.7.254,8000] |

All these checks will return 1 if the service is available, and 0 if the service is not available. What changes is how you implement the triggers on these items. For the management item, you’ll have a problem if the service is not available, while for the other two, you’ll have a problem if the service is indeed available, as shown in the following table:

| Trigger name | Trigger expression |
|---|---|
| Management console down | {it-1759-r1:net.tcp.service[https,192.168.1.254,8000].last()}=0 |
| Console available from zone A | {it-1759-r1:net.tcp.service[https,10.10.1.254,8000].last()}=1 |
| Console available from zone B | {it-1759-r1:net.tcp.service[https,172.16.7.254,8000].last()}=1 |

This way, you’ll always be able to verify that your device’s open and closed ports match your expected setup, and you’ll be notified as soon as the configuration diverges from the standard you set.
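The check-and-trigger logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Zabbix implements net.tcp.service: the names tcp_service_up, is_problem, EXPECTATIONS, and evaluate are hypothetical, and the check only tests whether a TCP connection can be opened, without any HTTPS handshake.

```python
import socket


def tcp_service_up(ip: str, port: int, timeout: float = 3.0) -> int:
    """Return 1 if a TCP connection to ip:port can be opened, 0 otherwise."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return 1
    except OSError:
        return 0


def is_problem(last_value: int, problem_value: int) -> bool:
    """Mimic a trigger of the form {host:key.last()}=problem_value."""
    return last_value == problem_value


# Assumed policy: the console must answer on the management interface
# (problem when the check returns 0) and must stay closed on the other
# two interfaces (problem when the check returns 1).
EXPECTATIONS = {
    "192.168.1.254": 0,
    "10.10.1.254": 1,
    "172.16.7.254": 1,
}


def evaluate(results: dict) -> list:
    """Return the IPs whose latest check value matches their problem value."""
    return [ip for ip, value in results.items()
            if is_problem(value, EXPECTATIONS[ip])]
```

In a real run, the results dictionary would be filled by calling tcp_service_up(ip, 8000) for each address; evaluate then lists the interfaces that violate the expected setup.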

To summarize, simple checks are great for all cases where you don’t need complex monitoring data from your network, as they are fast and lightweight. For the same reason, they could be the preferred solution if you have to monitor the availability of hundreds or thousands of hosts, as they add relatively little overhead to your overall network traffic.

When you do need more structure and more detail in your monitoring data, it’s time to move to the bread and butter of all network monitoring solutions: SNMP.

Keeping SNMP simple

The Simple Network Management Protocol (SNMP) is an excellent, general-purpose protocol that has become widely used beyond its original purpose. When it comes to network monitoring, though, it’s also often the only protocol supported by many appliances, so integrating it into your monitoring scenarios is often a forced, albeit natural and sensible, choice. As a network administrator, you probably already know all there is to know about SNMP and how it works, so let’s focus on how it’s integrated into Zabbix and what you can do with it.

Mapping SNMP OIDs to Zabbix items

An SNMP value is composed of three different parts: the OID, the data type, and the value itself. When you use snmpwalk or snmpget to get values from an SNMP agent, the output looks like this:

SNMPv2-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::cisco3640
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (83414) 0:13:54.14
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING: R1
SNMPv2-MIB::sysLocation.0 = STRING: Upper floor room 13
SNMPv2-MIB::sysServices.0 = INTEGER: 78
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (0) 0:00:00.00
...
IF-MIB::ifPhysAddress.24 = STRING: c4:1:22:4:f2:f
IF-MIB::ifPhysAddress.26 = STRING:
IF-MIB::ifPhysAddress.27 = STRING: c4:1:1e:c8:0:0
IF-MIB::ifAdminStatus.1 = INTEGER: up(1)
IF-MIB::ifAdminStatus.2 = INTEGER: down(2)

And so on.

The first part, the one before the = sign, is the OID. This goes into the SNMP OID field of the Zabbix item creation page and is the unique identifier of the metric you are interested in. Some OIDs represent a single, unique metric for the device, so they are easy to identify and address. In the preceding excerpt, one such OID is DISMAN-EVENT-MIB::sysUpTimeInstance. To monitor it, you only have to fill out the item creation form with the OID itself, and then define an item name, a data type, and a retention policy.

In the case of an uptime value, Timeticks are expressed in hundredths of a second, so you’ll choose a numeric decimal data type; if you want the value in seconds, you can apply a custom multiplier of 0.01 to the item. We’ll see in the next section how to choose Zabbix item data types and how to store values based on SNMP data types. You’ll also want to store the value as is and optionally specify a unit of measure. This is because an uptime is already a relative value: it expresses the time elapsed since a device’s latest boot, so there would be no point in calculating a further delta when getting this measurement.

Finally, you’ll define a polling interval and choose a retention policy. In the following example, the polling interval is 5 minutes (300 seconds), the history retention period is 3 days, and the trend storage period is one year. These should be sensible values, as you don’t normally need to store the detailed history of a value that either resets to zero or, by definition, grows linearly over time.

The following screenshot encapsulates what has been discussed in this paragraph:

[Screenshot: Zabbix item configuration form for the SNMP uptime item]
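SNMP Timeticks count hundredths of a second, which is why the raw value 83414 in the earlier snmpwalk output is printed as 0:13:54.14. The following sketch shows the arithmetic; the function names are hypothetical, and the formatting is only meant to illustrate the conversion, not to reproduce every detail of the command-line tools.

```python
def timeticks_to_seconds(ticks: int) -> float:
    """Convert SNMP Timeticks (hundredths of a second) to seconds."""
    return ticks / 100


def format_uptime(ticks: int) -> str:
    """Render Timeticks as [D days, ]H:MM:SS.hh for human readers."""
    total_seconds, hundredths = divmod(ticks, 100)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    clock = f"{hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"
    return f"{days} days, {clock}" if days else clock


print(timeticks_to_seconds(83414))  # seconds elapsed
print(format_uptime(83414))         # same value as a clock string
```

Multiplying the raw value by 0.01 is the same conversion that a custom multiplier would apply on the Zabbix side.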

Remember that the item’s key value still has to be unique at the host/template level, as it will be referenced by all other Zabbix components, from calculated items to triggers, maps, screens, and so on. Don’t forget to enter the right credentials for SNMPv3 if you are using this version of the protocol.

Many of the more interesting OIDs, though, are a bit more complex: multiple OIDs can be related to one another by means of the same index. Let’s look at another snmpwalk output excerpt:

IF-MIB::ifNumber.0 = INTEGER: 26
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3

IF-MIB::ifDescr.1 = STRING: FastEthernet0/0
IF-MIB::ifDescr.2 = STRING: Serial0/0
IF-MIB::ifDescr.3 = STRING: FastEthernet0/1

IF-MIB::ifType.1 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.2 = INTEGER: propPointToPointSerial(22)
IF-MIB::ifType.3 = INTEGER: ethernetCsmacd(6)

IF-MIB::ifMtu.1 = INTEGER: 1500
IF-MIB::ifMtu.2 = INTEGER: 1500
IF-MIB::ifMtu.3 = INTEGER: 1500

IF-MIB::ifSpeed.1 = Gauge32: 10000000
IF-MIB::ifSpeed.2 = Gauge32: 1544000
IF-MIB::ifSpeed.3 = Gauge32: 10000000

IF-MIB::ifPhysAddress.1 = STRING: c4:1:1e:c8:0:0
IF-MIB::ifPhysAddress.2 = STRING:
IF-MIB::ifPhysAddress.3 = STRING: c4:1:1e:c8:0:1

IF-MIB::ifAdminStatus.1 = INTEGER: up(1)
IF-MIB::ifAdminStatus.2 = INTEGER: down(2)
IF-MIB::ifAdminStatus.3 = INTEGER: down(2)

IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: down(2)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)

IF-MIB::ifLastChange.1 = Timeticks: (1738) 0:00:17.38
IF-MIB::ifLastChange.2 = Timeticks: (1696) 0:00:16.96
IF-MIB::ifLastChange.3 = Timeticks: (1559) 0:00:15.59

IF-MIB::ifInOctets.1 = Counter32: 305255
IF-MIB::ifInOctets.2 = Counter32: 0
IF-MIB::ifInOctets.3 = Counter32: 0

IF-MIB::ifInDiscards.1 = Counter32: 0
IF-MIB::ifInDiscards.2 = Counter32: 0
IF-MIB::ifInDiscards.3 = Counter32: 0

IF-MIB::ifInErrors.1 = Counter32: 0
IF-MIB::ifInErrors.2 = Counter32: 0
IF-MIB::ifInErrors.3 = Counter32: 0

IF-MIB::ifOutOctets.1 = Counter32: 347968
IF-MIB::ifOutOctets.2 = Counter32: 0
IF-MIB::ifOutOctets.3 = Counter32: 0

As you can see, for every network interface, there are several OIDs, each one detailing a specific aspect of the interface: its name, its type, whether it’s up or down, the amount of traffic coming in or going out, and so on. The different OIDs are related through their last number, the actual index of the OID. Looking at the preceding excerpt, we know that the device has 26 interfaces, of which we are showing some values for just the first three. By correlating the index numbers, we also know that interface 1 is called FastEthernet0/0, its MAC address is c4:1:1e:c8:0:0, the interface is up and entered its current state about 17 seconds after the device started, and some traffic has already gone through it.
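To make the correlation by index concrete, here is a minimal Python sketch that parses a few snmpwalk lines and groups the values by their trailing index. It assumes the default output format shown above, a single integer index, and one-line values; LINE_RE, SAMPLE, and group_by_index are hypothetical names introduced for the illustration.

```python
import re
from collections import defaultdict

# MIB::object.index = TYPE: value   (the value may be empty)
LINE_RE = re.compile(
    r"^(?P<oid>[\w-]+::\w+)\.(?P<index>\d+) = (?P<type>\w+):\s*(?P<value>.*)$"
)

SAMPLE = """\
IF-MIB::ifDescr.1 = STRING: FastEthernet0/0
IF-MIB::ifDescr.2 = STRING: Serial0/0
IF-MIB::ifPhysAddress.1 = STRING: c4:1:1e:c8:0:0
IF-MIB::ifPhysAddress.2 = STRING:
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: down(2)
""".splitlines()


def group_by_index(lines):
    """Return {index: {object_name: value}} for the lines that parse."""
    table = defaultdict(dict)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or wrapped lines
        name = match["oid"].split("::", 1)[1]
        table[int(match["index"])][name] = match["value"]
    return dict(table)


interfaces = group_by_index(SAMPLE)
print(interfaces[1])
```

Each entry of the resulting dictionary collects everything known about one interface, which is the same correlation you would otherwise do by eye.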

Now, one way to monitor several of these metrics for the same interface is to manually correlate these values when creating the items, putting the complete OID in the SNMP OID field, and making sure that both the item key and its name reflect the right interface. This process is not only prone to errors during the setup phase, but it could also introduce some inconsistencies down the road. There is no guarantee, in fact, that the index will remain consistent across hardware or software upgrades, or even across configuration changes when it comes to more volatile objects, such as VLANs or routing tables, rather than network interfaces. Fortunately, Zabbix provides a feature called dynamic indexes that allows you to correlate different OIDs in the same SNMP OID field, so that you can define an index based on the index exposed by another OID.

This means that if you want to know the admin status of FastEthernet0/0, you don’t need to find the index associated with FastEthernet0/0 (in this case it would be 1) and then append that index to the IF-MIB::ifAdminStatus base OID, hoping that it won’t ever change in the future. You can instead use the following code:

IF-MIB::ifAdminStatus["index", "IF-MIB::ifDescr", "FastEthernet0/0"]

Upon using the preceding code in the SNMP OID field of your item, the item will dynamically find the index of the IF-MIB::ifDescr OID where the value is FastEthernet0/0 and append it to IF-MIB::ifAdminStatus in order to get the right status for the right interface.
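The lookup that a dynamic index performs can be modeled roughly as follows. This is a simplified sketch over an in-memory dictionary standing in for the device’s SNMP data, not Zabbix’s actual implementation; resolve_dynamic_index and the sample dictionaries are hypothetical.

```python
def resolve_dynamic_index(walk, base_oid, index_oid, wanted):
    """Find the index where index_oid equals wanted, then read base_oid there."""
    prefix = index_oid + "."
    for oid, value in walk.items():
        if oid.startswith(prefix) and value == wanted:
            index = oid[len(prefix):]
            return walk.get(f"{base_oid}.{index}")
    return None  # no interface with that description


BEFORE = {
    "IF-MIB::ifDescr.1": "FastEthernet0/0",
    "IF-MIB::ifDescr.2": "Serial0/0",
    "IF-MIB::ifAdminStatus.1": 1,
    "IF-MIB::ifAdminStatus.2": 2,
}

# Same device after a hypothetical upgrade that renumbered the interfaces.
AFTER = {
    "IF-MIB::ifDescr.5": "FastEthernet0/0",
    "IF-MIB::ifDescr.6": "Serial0/0",
    "IF-MIB::ifAdminStatus.5": 1,
    "IF-MIB::ifAdminStatus.6": 2,
}

for walk in (BEFORE, AFTER):
    print(resolve_dynamic_index(
        walk, "IF-MIB::ifAdminStatus", "IF-MIB::ifDescr", "FastEthernet0/0"))
```

Both calls return the status of FastEthernet0/0 even though its index changed, which is exactly the property that makes dynamic indexes more robust than hard-coded ones.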

If you organize your items this way, you’ll always be sure that related items actually show the right related values for the component you are interested in and not those of another one because things changed on the device’s side without your knowledge. Moreover, we’ll build on this technique to develop low-level discovery of a device.

You can use the same technique to get other interesting information out of a device. Consider, for example, the following excerpt:

ENTITY-MIB::entPhysicalVendorType.1 = OID: CISCO-ENTITY-VENDORTYPEOID-MIB::cevChassis3640
ENTITY-MIB::entPhysicalVendorType.2 = OID: CISCO-ENTITY-VENDORTYPEOID-MIB::cevContainerSlot
ENTITY-MIB::entPhysicalVendorType.3 = OID: CISCO-ENTITY-VENDORTYPEOID-MIB::cevCpu37452fe
ENTITY-MIB::entPhysicalClass.1 = INTEGER: chassis(3)
ENTITY-MIB::entPhysicalClass.2 = INTEGER: container(5)
ENTITY-MIB::entPhysicalClass.3 = INTEGER: module(9)
ENTITY-MIB::entPhysicalName.1 = STRING: 3745 chassis
ENTITY-MIB::entPhysicalName.2 = STRING: 3640 Chassis Slot 0
ENTITY-MIB::entPhysicalName.3 = STRING: c3745 Motherboard with Fast Ethernet on Slot 0
ENTITY-MIB::entPhysicalHardwareRev.1 = STRING: 2.0
ENTITY-MIB::entPhysicalHardwareRev.2 = STRING:
ENTITY-MIB::entPhysicalHardwareRev.3 = STRING: 2.0
ENTITY-MIB::entPhysicalSerialNum.1 = STRING: FTX0945W0MY
ENTITY-MIB::entPhysicalSerialNum.2 = STRING:
ENTITY-MIB::entPhysicalSerialNum.3 = STRING: XXXXXXXXXXX

It should be immediately clear to you that you can find the chassis’s serial number by creating an item with:

ENTITY-MIB::entPhysicalSerialNum["index", "ENTITY-MIB::entPhysicalName", "3745 chassis"]

Then you can specify, in the same item, that it should populate the Serial Number field of the host’s inventory. This is how you can have a more automatic, dynamic population of inventory fields.

The possibilities are endless as we’ve only just scratched the surface of what any given device can expose as SNMP metrics. Before you go and find your favorite OIDs to monitor though, let’s have a closer look at the preceding examples, and let’s discuss data types.

Getting data types right

We have already seen how an OID’s value has a specific data type that is usually clearly stated in the default snmpwalk output. In the preceding examples, you can see the data type just after the = sign, before the actual value. There are a number of SNMP data types, some still current and some deprecated. You can find the official list and documentation in RFC 2578 (http://tools.ietf.org/html/rfc2578), but let’s have a look at the most important ones from the perspective of a Zabbix user:

| SNMP type | Description | Suggested Zabbix item type and options |
|---|---|---|
| INTEGER | This can have negative values and is usually used for enumerations | Numeric unsigned, decimal; store value as is; show with value mappings |
| STRING | This is a regular character string and can contain new lines | Text; store value as is |
| OID | This is an SNMP object identifier | Character; store value as is |
| IpAddress | IPv4 only | Character; store value as is |
| Counter32 | This includes only non-negative and nondecreasing values | Numeric unsigned, decimal; store value as delta (speed per second) |
| Gauge32 | This includes only non-negative values, which can decrease | Numeric unsigned, decimal; store value as is |
| Counter64 | This includes non-negative and nondecreasing 64-bit values | Numeric unsigned, decimal; store value as delta (speed per second) |
| TimeTicks | This includes non-negative, nondecreasing values | Numeric unsigned, decimal; store value as is |

First of all, remember that the above suggestions are just that—suggestions. You should always evaluate how to store your data on a case-by-case basis, but you’ll probably find that in many cases those are indeed the most useful settings.
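If you script your item creation, the suggestions in the table above can be kept as a small lookup structure. The following sketch simply restates the table in Python; SUGGESTED_ITEM_SETTINGS and suggest_settings are hypothetical names, and the strings are descriptive labels rather than values accepted by any Zabbix API.

```python
# SNMP type -> (type of information, how to store the value)
SUGGESTED_ITEM_SETTINGS = {
    "INTEGER": ("Numeric unsigned, decimal", "as is"),
    "STRING": ("Text", "as is"),
    "OID": ("Character", "as is"),
    "IpAddress": ("Character", "as is"),
    "Counter32": ("Numeric unsigned, decimal", "delta (speed per second)"),
    "Gauge32": ("Numeric unsigned, decimal", "as is"),
    "Counter64": ("Numeric unsigned, decimal", "delta (speed per second)"),
    "TimeTicks": ("Numeric unsigned, decimal", "as is"),
}

# Case-insensitive index, since snmpwalk prints "Timeticks" for TimeTicks.
_LOOKUP = {name.lower(): settings
           for name, settings in SUGGESTED_ITEM_SETTINGS.items()}


def suggest_settings(snmp_type: str):
    """Return the suggested settings for a type label, or None if unknown."""
    return _LOOKUP.get(snmp_type.strip().rstrip(":").lower())


print(suggest_settings("Counter32"))
print(suggest_settings("Timeticks:"))
```

Treat the result as a starting point and override it whenever a specific OID calls for different storage.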

Moving on to the actual data types, remember that the command line SNMP tools by default parse the values and show some already interpreted information. This is especially true for Timeticks values and for INTEGER values when these are used as enumerations. In other words, you see the following from the command line:

VRRP-MIB::vrrpNotificationCntl.0 = INTEGER: disabled(2)

However, what is actually passed as a request is the bare OID:

1.3.6.1.2.1.68.1.2.0

The SNMP agent will respond with just the value, which, in this case, is the value 2.

This means that in the case of enumerations, Zabbix will just receive and store a number and not the string disabled(2) as seen from the command line. If you want to display monitoring values that are a bit clearer, you can apply value mappings to your numeric items. Value maps contain the mapping between numeric values and arbitrary string representations for a human-friendly representation. You can specify which one you need in the item configuration form, as follows:

[Screenshot: value mapping selection in the item configuration form]

Zabbix comes with a few predefined value mappings. You can create your own mappings by following the show value mappings link and, provided you have admin roles on Zabbix, you’ll be taken to a page where you can configure all value mappings that will be used by Zabbix. From there, click on Create value map in the upper-right corner of the page, and you’ll be able to create a new mapping. Not all INTEGER values are enumerations, but those that are used as such will be clearly recognizable from your command-line tools as they will be defined as INTEGER values but will show a string label along with the actual value, just as in the preceding example.
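Conceptually, a value map is just a lookup table from the stored number to a label. The sketch below uses the ifAdminStatus enumeration from IF-MIB (up, down, testing) as an example; apply_value_map is a hypothetical helper, and the "label (value)" rendering is only an illustration of how a mapped value might be displayed.

```python
# Enumeration labels for IF-MIB::ifAdminStatus.
IF_ADMIN_STATUS = {1: "up", 2: "down", 3: "testing"}


def apply_value_map(value: int, value_map: dict) -> str:
    """Return 'label (value)' when a mapping exists, else the bare value."""
    label = value_map.get(value)
    return f"{label} ({value})" if label is not None else str(value)


print(apply_value_map(2, IF_ADMIN_STATUS))
print(apply_value_map(7, IF_ADMIN_STATUS))
```

The stored history still contains only the number, so triggers keep comparing against 1 or 2 while the frontend shows the friendlier label.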

On the other hand, when they are not used as enumerations, they can represent different things depending on the context. As seen in the previous paragraph, they can represent the number of indexes available for a given OID. They can also represent application or protocol-specific values, such as default MTU, default TTL, route metrics, and so on.

The main difference between gauges, counters, and integers is that integers can assume negative values, while gauges and counters cannot. In addition to that, counters can only increase, or wrap around and start again from the bottom of their value range once they reach its upper limit. From the perspective of Zabbix, this marks the difference in how you’ll want to store their values.

Gauges are usually employed when a value can vary within a given range, such as the speed of an interface, the amount of free memory, or any limits and timeouts you might find for notifications, the number of instances, and so on. In all of these cases, the value can increase or decrease in time, so you’ll want to store them as they are because once put on a graph, they’ll draw a meaningful curve.

Counters, on the other hand, can only increase by definition. They are typically used to show how many packets were processed by an interface, how many were dropped, how many errors were encountered, and so on. If you store counter values as they are, you’ll find in your graphs some ever-ascending curves that won’t tell you very much for your monitoring or capacity planning purposes. This is why you’ll usually want to track a counter’s amount of change in time, more than its actual value. To do that, Zabbix offers two different ways to store deltas or differences between successive values.

The delta (simple change) storage method does exactly what it says: it simply computes the difference between the currently received value and the previously received one, and stores the result. It doesn’t take into consideration the elapsed time between the two measurements, nor the fact that the result can even have a negative value if the counter overflows. The fact is that most of the time, you’ll be very interested in evaluating how much time has passed between two different measurements and in treating correctly any negative values that can appear as a result.

The delta (speed per second) will divide the difference between the currently received value and the previously received one by the difference between the current timestamp and the previous one, as follows:

(value - prev_value)/(time - prev_time)

This will ensure that the scale of the change will always be constant, as opposed to the scale of the simple change delta, which will vary every time you modify the update interval of the item, giving you inconsistent results. Moreover, the speed-per-second delta will ignore any negative values and just wait for the next measurement, so you won’t find any false dips in your graph due to overflowing.
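The two storage methods can be compared with a short sketch. It follows the description above (plain difference versus difference divided by elapsed time, with negative results discarded) and is not taken from Zabbix’s source code; the function names and sample numbers are assumptions made for the illustration.

```python
from typing import Optional


def simple_change(value: int, prev_value: int) -> int:
    """Delta (simple change): plain difference, may be negative on overflow."""
    return value - prev_value


def speed_per_second(value: int, prev_value: int,
                     time: float, prev_time: float) -> Optional[float]:
    """Delta (speed per second): (value - prev_value) / (time - prev_time).

    Returns None when the counter went backwards (wrap or reset) or when
    no time has elapsed, so that the sample can simply be skipped.
    """
    if time <= prev_time or value < prev_value:
        return None
    return (value - prev_value) / (time - prev_time)


# A counter growing at 2 units per second, polled every 60 s and every 300 s.
print(simple_change(1120, 1000), speed_per_second(1120, 1000, 60, 0))
print(simple_change(1600, 1000), speed_per_second(1600, 1000, 300, 0))

# A Counter32 that wrapped between two polls.
print(simple_change(100, 4294967000), speed_per_second(100, 4294967000, 600, 300))
```

The simple change differs between the two polling intervals (120 versus 600) while the per-second rate stays at 2.0, and the wrapped counter yields a large negative simple change but no per-second value at all.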

Finally, while SNMP uses specific data types for IP addresses and SNMP OIDs, there are no such types in Zabbix, so you’ll need to map them to some kind of string item. The suggested type here is character as both values won’t be bigger than 255 characters and won’t contain any newlines.

String values, on the other hand, can be quite long as the SNMP specification allows for 65,535-character-long texts; however, text that long would be of little practical value. Even if they are usually much shorter, string values can often contain newlines and be longer than 255 characters.

Consider, for example, the following sysDescr OID for this device:

SNMPv2-MIB::sysDescr.0 = STRING: Cisco IOS Software, 3700 Software
(C3745-ADVENTERPRISEK9_SNA-M), Version 12.4(15)T14, RELEASE SOFTWARE
(fc2)^M
Technical Support: http://www.cisco.com/techsupport^M
Copyright (c) 1986-2010 by Cisco Systems, Inc.^M
Compiled Tue 17-Aug-10 12:56 by prod_rel_tea

As you can see, the string spans multiple lines, and it’s definitely longer than 255 characters. This is why the suggested type for string values is text as it allows text of arbitrary length and structure. On the other hand, if you’re sure that a specific OID value will always be much shorter and simpler, you can certainly use the character data type for your corresponding Zabbix item.
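The rule of thumb above (character for short single-line values, text otherwise) can be written as a tiny helper. This is only a heuristic applied to one sample value, and suggest_string_type is a hypothetical name; a value that looks short today may still grow later, so prefer text when in doubt.

```python
def suggest_string_type(sample: str, max_chars: int = 255) -> str:
    """Suggest 'Character' for short one-line samples, 'Text' otherwise."""
    if len(sample) > max_chars or "\n" in sample or "\r" in sample:
        return "Text"
    return "Character"


print(suggest_string_type("R1"))
print(suggest_string_type("Cisco IOS Software, 3700 Software\r\nTechnical Support"))
```

A sysName value such as R1 fits comfortably in a character item, while a multi-line sysDescr value like the one shown above calls for text.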

Now you are truly ready to get the most out of your devices’ SNMP agents: you can find the OIDs you want to monitor and map them to Zabbix items, down to their data types, how to store their values, how often to poll them, and any value mapping that might be necessary.

Summary

In this article, you have learned the different possibilities offered by Zabbix to the enterprising network administrator.

You should now be able to choose, design, and implement all the monitoring items you need, based on the methods illustrated in the preceding paragraphs.
