xymon and snmp

Enfield Cat's Blog: Arduino and other projects.


Introduction

One area where xymon looks like it promises something only to underdeliver is in the monitoring of SNMP-enabled devices. There are hints that other people have done it successfully, but it would appear that xymon has no native support for it. As a result, a number of projects, such as devmon, have been developed to let xymon monitor SNMP devices. Below I have a shell / awk script alternative which uses CSV files to define what is to be monitored.

There are several ways SNMP can work, and the method used in this script is to schedule "snmpwalk" every 5 minutes to collect data. That is, we pull data from the device rather than wait for it to push to us. The script also allows RRD graphs to be generated from the returned data, warning and critical thresholds to be set, and integer values to be looked up and shown as text.

Prerequisites

The node running the SNMP checks should have the SNMP utilities installed, specifically snmpwalk. It must be able to reach the SNMP agents on UDP port 161, and the agents' replies, which come back from port 161 to the querying host, must be allowed through as well. (UDP port 162 is used for traps, which this script does not rely on.) This last point is important if you have firewalls set up on your network. Once done, run a test similar to this to confirm you can access the node:

nelson@jhbmonp01:~$ snmpwalk -c public -v 2c 192.168.1.1 .1.3.6.1.2.1.2.2.1.2
iso.3.6.1.2.1.2.2.1.2.1 = STRING: "eth0"
iso.3.6.1.2.1.2.2.1.2.2 = STRING: "eth1"
iso.3.6.1.2.1.2.2.1.2.3 = STRING: "eth2"
iso.3.6.1.2.1.2.2.1.2.4 = STRING: "eth3"
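
Each line of that output has the shape "OID = TYPE: value", and the monitoring script further down depends on picking those three parts out. As a minimal sketch of the idea (parse_snmp_line is a hypothetical helper name, not part of the script on this page), one line can be split like this:

```shell
#!/bin/sh
# Split one line of snmpwalk output into OID, type and value.
# parse_snmp_line is a hypothetical helper used only for illustration.
parse_snmp_line() {
  printf '%s\n' "$1" | awk '
  {
    oid  = $1
    type = $3
    sub(/:$/, "", type)                 # drop the trailing colon from the type
    value = $0
    sub(/^[^=]*= [^ ]+ /, "", value)    # keep everything after the type token
    gsub(/^"|"$/, "", value)            # strip surrounding quotes from strings
    printf "%s|%s|%s\n", oid, type, value
  }'
}

parse_snmp_line 'iso.3.6.1.2.1.2.2.1.2.1 = STRING: "eth0"'
# prints: iso.3.6.1.2.1.2.2.1.2.1|STRING|eth0
```

The real script does the same job slightly differently (it splits on spaces and on the quote character), but the fields it ends up with are the same.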

Configuration files

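In this process we will also set up three comma-separated value (CSV) tables to define what outputs we want. An advantage of CSV files is that many spreadsheet programs such as LibreOffice or Excel can edit them. The CSV tables we will use, as referenced by the script below, are snmp_targets.csv (the devices to poll and their credentials), snmp_def.csv (the OIDs to collect, with their labels, graph names and thresholds) and snmp_lookup.csv (text to display in place of integer values). All three are read from the directory set in the script's basedir variable, and lines beginning with # are ignored.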
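
As a sketch of what the tables might contain, the block below writes small sample files into a scratch directory. The column order is taken from how the script further down reads each file; every row value (address, host name, labels, thresholds) is made up for illustration, and the OIDs are written in the "iso.3.6..." form that snmpwalk printed in the test above, on the assumption that the script's prefix match needs them in the same form. select_defs is a hypothetical helper that repeats the script's row selection.

```shell
#!/bin/sh
# Build hypothetical sample tables in a scratch directory.
dir=$(mktemp -d)

cat > "$dir/snmp_targets.csv" <<'EOF'
# ip,name,type,community,user,auth,failcolor,testoid
192.168.1.1,router01,network,public,-,-,yellow,-
EOF

cat > "$dir/snmp_def.csv" <<'EOF'
# type,format,oid,label,rrd,warning,critical,lookup
std,line,iso.3.6.1.2.1.1.5,Name,,,,
std,line,iso.3.6.1.2.1.1.3,Uptime,,,,
network,table,iso.3.6.1.2.1.2.2.1.2,Interface,,,,
network,table,iso.3.6.1.2.1.2.2.1.8,Status,,,,ifOperStatus
server,line,iso.3.6.1.2.1.25.1.6,Processes,procs,300,500,
EOF

cat > "$dir/snmp_lookup.csv" <<'EOF'
# lookup,integer,text
ifOperStatus,1,up
ifOperStatus,2,down
EOF

# Same row selection the script applies: "std" rows plus rows for the target type.
select_defs() {
  awk -F ',' -v targType="$1" '
    !/^#/ && ($1 == "std" || $1 == targType) { print $4 }
  ' "$dir/snmp_def.csv"
}

select_defs network
```

For a target of type "network" this lists the labels Name, Uptime, Interface and Status, so the "server" row is skipped. In a table definition the first column collected for a row also becomes that row's name in warnings and graph file names.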


The client monitoring process

The code below is placed in a "xymon_snmp.sh" script in the "/usr/lib/xymon/client/ext" directory for a standard package installation of the client. Remember to make the script executable before scheduling it: "chmod 755 /usr/lib/xymon/client/ext/xymon_snmp.sh". Note that the script is written so that simply running it at the command line will display the xymon data on the terminal screen.

Note: if creating data for graphing, you will need to set up your own graph definitions.

Check: /usr/lib/xymon/client/ext/xymon_snmp.sh

#!/bin/sh
###########################################################################
#
# Produce formatted output from snmp queries
#
###########################################################################
#
# Set up some common references:
#
if [ "$XYMON" = "" ] ; then XYMON="echo" ; fi
if [ "$XYMSRV" = "" ] ; then XYMSRV="send_2_xymon" ; fi
basedir="/etc/xymon"  # change according to the location of snmp_def.csv, snmp_lookup.csv and snmp_targets.csv
statusFile="/tmp/xymon_snmp_status.txt"
#
# get list of machines, user names and credentials from snmp_targets.csv
# $1 = target IP
# $2 = Target Name
# $3 = Target type (network, printer, server)
# $4 = Community string
# $5 = User or "-" if not required
# $6 = Auth string or "-" if not required
# $7 = Color on failure or "-" if not required
# $8 = oid to test or default is "1.3.6.1.2.1.1.1.0"
#
cd "${basedir}"
awk -F ',' '
!/^#/ { print $1 " " $2 " " $3 " " $4 " " $5 " " $6 " " $7 " " $8 }
' snmp_targets.csv | while read targIP targName targType targComm targUser targAuth targFailColor targTestOid junk ; do
  #
  # Run through queries in the order presented in snmp_def.csv
  #
  if [ "${targUser}" != "" ]      && [ "${targUser}" != "-" ]      ; then userStr="-u ${targUser}" ; else userStr="" ; fi
  if [ "${targAuth}" != "" ]      && [ "${targAuth}" != "-" ]      ; then authStr="-A ${targAuth}" ; else authStr="" ; fi
  if [ "${targFailColor}" = "" ] || [ "${targFailColor}" = "-" ] ; then targFailColor="red" ; fi
  if [ "${targTestOid}" = "" ]   || [ "${targTestOid}" = "-" ]   ; then targTestOid="1.3.6.1.2.1.1.1.0" ; fi
  #
  # connectivity check
  #
  #nc -z -u "${targIP}" 161
  snmpwalk -c "${targComm}" -v2c ${userStr} ${authStr} "${targIP}" "${targTestOid}" > /dev/null 2>&1
  if [ $? = 0 ] ; then
  #
  # clear old data
  #
  if [ -f /tmp/xymon_snmp_graph.txt ]  ; then rm /tmp/xymon_snmp_graph.txt ; fi
  if [ -f ${statusFile} ] ; then rm ${statusFile} ; fi
  #
  # Collect data and process it
  #
  awk -F ',' -v targIP="${targIP}" -v targComm="${targComm}" -v userStr="${userStr}" -v authStr="${authStr}" -v targType="${targType}" '
  !/^#/ {
  if ($1=="std" || $1==targType) {
    cmd = sprintf ("snmpwalk -c %s -v2c %s %s %s %s", targComm, userStr, authStr, targIP, $3)
    system (cmd)
  }
} ' snmp_def.csv | awk -F "," -v targName="${targName}" -v targType="${targType}" '
BEGIN {
  dfn       = 0
  tableRows = 0
  luSize    = 0
  formatStr = "%-16s %s\n"
  formatTab = "%s %12s"
}
!/^#/ {
  if (FILENAME == "snmp_def.csv") {
    if ($1=="std" || $1==targType) {
      format[dfn]   = $2
      oid[dfn]      = $3
      label[dfn]    = $4
      rrd[dfn]      = $5
      warning[dfn]  = $6
      critical[dfn] = $7
      lookup[dfn]   = $8
      # Create table headings if required
      if ($2 == "table") {
        table[0] = sprintf (formatTab, table[0], $4)
      }
      dfn++
    }
  }
  else if (FILENAME == "snmp_lookup.csv") {
    luSize++
    lookuppri[luSize] = $1
    lookupsec[luSize] = $2
    lookupval[luSize] = $3
  }
  else {
    referable = "F"
    tokenCnt = split ($0, token, " ")
    # Get string value from line
    if (token[3] == "STRING:") {
      split ($0, string, "\"")
      value = string[2]
      graphType = "none"
      }
    # Get Timeticks from line
    else if (token[3] == "Timeticks:") {
      if (tokenCnt == 4) value = token[4]
      else {
        value = token[5]
        for (a=6; a<=tokenCnt; a++) value = sprintf ("%s %s", value, token[a])
      }
      graphType = "none"
    }
    else if (token[3] == "Counter64:" || token[3] == "Counter32:" || token[3] == "Gauge32:" || token[3] == "INTEGER:") {
      value = token[4]
      if (token[3] == "Counter64:" || token[3] == "Counter32:") graphType = "DERIVE"
      else graphType = "GAUGE"
      if (token[3] == "INTEGER:") referable = "T"
    }
    #
    # Now we have the values, lets format the output
    #
    for (idx=0; idx<dfn; idx++) {
      compar = substr(token[1], 1, length(oid[idx]))
      # the first part of the OID must match, AND the next char in the found oid should be a "."
      # otherwise we will confuse something.20 with something.2
      if (compar == oid[idx] && (length(oid[idx]) == length(token[1]) ||substr(token[1],length(oid[idx])+1,1)==".")) {
        if (referable == "T" && lookup[idx] != "") {
          graphType = "none";
          for (lu=1; lu<=luSize; lu++) if (lookuppri[lu] == lookup[idx] && lookupsec[lu] == value) {
            value = lookupval[lu]
            lu = luSize+1
          }
        }
        if (format[idx] == "line") {
          printf (formatStr, label[idx], value)
          if (rrd[idx] != "" && graphType != "none") {
            graphData = sprintf ("%sDS:%s:%s:600:0:U %s\n", graphData, rrd[idx], graphType, value)
          }
          indexlabel=""
        }
        else if (format[idx] == "table") {
          tabPtr = split (token[1], tablar, ".")
          tabIdx = tablar[tabPtr]
          if (tabIdx>tableRows) tableRows = tabIdx
          if (indexor[tabIdx] == "") indexor[tabIdx] = value
          indexlabel = sprintf (" %s", indexor[tabIdx])
          table[tabIdx] = sprintf (formatTab, table[tabIdx], value)
          if (rrd[idx] != "" && graphType != "none") {
            graphLine[tabIdx] = sprintf ("%sDS:%s:%s:600:0:U %s\n", graphLine[tabIdx], rrd[idx], graphType, value)
          }
        }
        #
        # Now run comparison checks
        #
        thisError=""
        if (warning[idx] != "") {
          if ((value+0) >= (warning[idx]+0)) {
            thisError=sprintf ("&yellow %s%s %s >= %s\n", label[idx], indexlabel, value, warning[idx])
          }
        }
        if (critical[idx] != "") {
          if ((value+0) >= (critical[idx]+0)) {
            thisError=sprintf ("&red %s%s %s >= %s\n", label[idx], indexlabel, value, critical[idx])
          }
        }
        if (thisError != "") outError = sprintf ("%s%s", outError, thisError)
        #
        # if possible terminate the loop early
        #
        idx = dfn+1;
      }
    }
  }
}
END {
  # Print the table, if any table columns were defined
  if (table[0] != "") {
    printf ("\n")
    for (line=0; line<=tableRows; line++) {
      print (table[line])
    }
  }
  # Print the rrd data from single line format
  if (graphData != "") {
    printf ("[%s.rrd]\n", targType) > "/tmp/xymon_snmp_graph.txt"
    printf ("%s", graphData) >> "/tmp/xymon_snmp_graph.txt"
  }
  # Print the rrd data if in tabular format (skip rows with no graph data)
  for (line=1; line<=tableRows; line++) {
    if (graphLine[line] != "") {
       printf ("[%s.%s.rrd]\n", targType, indexor[line]) >> "/tmp/xymon_snmp_graph.txt"
       printf ("%s", graphLine[line]) >> "/tmp/xymon_snmp_graph.txt"
    }
  }
  # Report threshold breaches whether or not a table was printed
  if (outError != "") printf ("\n%s", outError)
} ' snmp_def.csv snmp_lookup.csv - > ${statusFile}
COLOR="green"     # overall color
if [ ! -s ${statusFile} ] ; then
  echo "&${targFailColor} Warning: No SNMP data returned from SNMP agent on ${targName} (${targIP})" > ${statusFile}
  COLOR="${targFailColor}"  # overall color
fi
else
  echo "&${targFailColor} Error: Cannot access SNMP port of ${targName} (${targIP})" > ${statusFile}
  COLOR="${targFailColor}"     # overall color
fi
if [ -f ${statusFile} ] ; then
  if [ `grep -c '^\&yellow' ${statusFile}` -gt 0 ] ; then COLOR="yellow" ; fi
  if [ `grep -c '^\&red' ${statusFile}` -gt 0 ] ; then COLOR="red" ; fi

${XYMON} ${XYMSRV} "status ${targName}.${targType} ${COLOR} `date` - ${targType} SNMP collection
`cat ${statusFile}`
"
rm ${statusFile}
fi
if [ -f /tmp/xymon_snmp_graph.txt ] ; then
${XYMON} ${XYMSRV} "data ${targName}.trends
`cat /tmp/xymon_snmp_graph.txt`
"
rm /tmp/xymon_snmp_graph.txt
fi
done

exit 0
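
The part of the script most likely to surprise is the OID comparison: a definition matches a returned line only if the returned OID starts with the defined OID and then either ends or continues with a ".", so that something.2 is not confused with something.20. A minimal standalone sketch of that rule, using a hypothetical helper called oid_matches:

```shell
#!/bin/sh
# oid_matches PREFIX OID : succeed if OID equals PREFIX or continues it with a "."
# (hypothetical helper illustrating the rule used in the script above)
oid_matches() {
  awk -v want="$1" -v got="$2" 'BEGIN {
    n  = length(want)
    ok = (substr(got, 1, n) == want) && (length(got) == n || substr(got, n + 1, 1) == ".")
    exit(ok ? 0 : 1)
  }'
}

oid_matches iso.3.6.1.2.1.2.2.1.2 iso.3.6.1.2.1.2.2.1.2.4  && echo "match"
oid_matches iso.3.6.1.2.1.2.2.1.2 iso.3.6.1.2.1.2.2.1.20.4 || echo "no match"
```

The comparison is a plain string match, so the OIDs in snmp_def.csv should be written the same way snmpwalk prints them on your system.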

Scheduling: /var/run/xymon/clientlaunch-include.cfg

Add these lines in the client machine's xymon schedule: /var/run/xymon/clientlaunch-include.cfg

The client process log file should be available in /var/log/xymon/xymon_snmp.log

[snmp]
        ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
        CMD $XYMONCLIENTHOME/ext/xymon_snmp.sh
        LOGFILE $XYMONCLIENTLOGS/xymon_snmp.log
        INTERVAL 5m

Enable the test by restarting the xymon client:

sudo systemctl restart xymon-client

Server side changes

In order to get the SNMP data included in a graph, some server side changes need to be made. Oddly, even though ifmib graphs are defined in the graphs.cfg file, they are not referred to in the xymonserver.cfg file! Add "ifmib" to the "TEST2RRD=" field and "ifmib1::1" to the "GRAPHS=" field. The latter change ensures that a separate graph is drawn for each interface in the trends display. Adding an additional variable after the GRAPHS= one is also recommended if using the ifmib examples shown on this page; this will add all 4 ifmib graphs to the ifmib test:

GRAPHS_ifmib="ifmib,ifmib1,ifmib2,ifmib3"
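
It is easy to miss one of these three settings, so a quick check can help. The sketch below is only a rough grep and assumes each variable sits on a single line starting at the left margin; check_ifmib_cfg is a hypothetical helper, and the file path in the usage comment is an assumption that may differ on your system.

```shell
#!/bin/sh
# check_ifmib_cfg FILE : report which of the ifmib settings are missing.
# Hypothetical helper; a plain grep, so it will not follow continuation lines.
check_ifmib_cfg() {
  cfg="$1"
  rc=0
  grep -q '^TEST2RRD=.*ifmib'   "$cfg" || { echo "TEST2RRD is missing ifmib" ; rc=1 ; }
  grep -q '^GRAPHS=.*ifmib1::1' "$cfg" || { echo "GRAPHS is missing ifmib1::1" ; rc=1 ; }
  grep -q '^GRAPHS_ifmib='      "$cfg" || { echo "GRAPHS_ifmib is not set" ; rc=1 ; }
  return $rc
}

# Usage (path is an assumption):
#   check_ifmib_cfg /etc/xymon/xymonserver.cfg && echo "ifmib settings present"
```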

Finally I changed /etc/xymon/graphs.cfg for my test environment to show kilobits per second and introduced some color variation for ifmib graphs. This is mostly changing the ifmib lines from the original:

AREA:bitsin@RRDIDX@#00FF00:@RRDMETA@  inbound
LINE1:bitsout@RRDIDX@#0000FF:@RRDMETA@ outbound
to:
AREA:bitsin@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@  inbound
LINE1:bitsout@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@ outbound

So the end result was

########### ifmib graphs (NOTE: Preliminary) #####################
[ifmib]
        FNPATTERN ^ifmib.(.+).rrd
        TITLE Traffic
        YAXIS Bits/second
        DEF:bytesin@RRDIDX@=@RRDFN@:InOctets:AVERAGE
        DEF:bytesout@RRDIDX@=@RRDFN@:OutOctets:AVERAGE
        CDEF:bitsin@RRDIDX@=bytesin@RRDIDX@,8,*,300,/
        CDEF:bitsout@RRDIDX@=bytesout@RRDIDX@,8,*,300,/
        CDEF:kbitsin@RRDIDX@=bytesin@RRDIDX@,8,*,300000,/
        CDEF:kbitsout@RRDIDX@=bytesout@RRDIDX@,8,*,300000,/
        -l 0
        AREA:bitsin@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@  inbound
        GPRINT:kbitsin@RRDIDX@:LAST: %6.1lfK (cur) \:
        GPRINT:kbitsin@RRDIDX@:MAX: %6.1lfK (max) \:
        GPRINT:kbitsin@RRDIDX@:MIN: %6.1lfK (min) \:
        GPRINT:kbitsin@RRDIDX@:AVERAGE: %6.1lfK (avg)\n
        LINE1:bitsout@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@ outbound
        GPRINT:kbitsout@RRDIDX@:LAST: %6.1lfK (cur) \:
        GPRINT:kbitsout@RRDIDX@:MAX: %6.1lfK (max) \:
        GPRINT:kbitsout@RRDIDX@:MIN: %6.1lfK (min) \:
        GPRINT:kbitsout@RRDIDX@:AVERAGE: %6.1lfK (avg)\n
        #AREA:bitsin@RRDIDX@#00FF00:@RRDMETA@  inbound
        #LINE1:bitsout@RRDIDX@#0000FF:@RRDMETA@ outbound

[ifmib1]
        FNPATTERN ^ifmib.(.+).rrd
        TITLE Traffic
        YAXIS Packets/second
        DEF:pktsin@RRDIDX@=@RRDFN@:InUcastPkts:AVERAGE
        DEF:pktsout@RRDIDX@=@RRDFN@:OutUcastPkts:AVERAGE
        CDEF:kpktsin@RRDIDX@=pktsin@RRDIDX@,300000,/
        CDEF:kpktsout@RRDIDX@=pktsout@RRDIDX@,300000,/
        AREA:pktsin@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@  inbound
        GPRINT:kpktsin@RRDIDX@:LAST: %6.1lfK (cur) \:
        GPRINT:kpktsin@RRDIDX@:MAX: %6.1lfK (max) \:
        GPRINT:kpktsin@RRDIDX@:MIN: %6.1lfK (min) \:
        GPRINT:kpktsin@RRDIDX@:AVERAGE: %6.1lfK (avg)\n
        LINE1:pktsout@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@ outbound
        GPRINT:kpktsout@RRDIDX@:LAST: %6.1lfK (cur) \:
        GPRINT:kpktsout@RRDIDX@:MAX: %6.1lfK (max) \:
        GPRINT:kpktsout@RRDIDX@:MIN: %6.1lfK (min) \:
        GPRINT:kpktsout@RRDIDX@:AVERAGE: %6.1lfK (avg)\n

[ifmib2]
        FNPATTERN ^ifmib.(.+).rrd
        TITLE Errors
        YAXIS Errors
        DEF:pktsin@RRDIDX@=@RRDFN@:ifInErrors:AVERAGE
        DEF:pktsout@RRDIDX@=@RRDFN@:ifOutErrors:AVERAGE
        DEF:pktsdisc@RRDIDX@=@RRDFN@:ifOutDiscards:AVERAGE
        LINE1:pktsin@RRDIDX@#@COLOR@:@RRDMETA@ IN  @RRDPARAM@    errors
        GPRINT:pktsin@RRDIDX@:LAST: %6.1lf (cur) \:
        GPRINT:pktsin@RRDIDX@:MAX: %6.1lf (max) \:
        GPRINT:pktsin@RRDIDX@:MIN: %6.1lf (min) \:
        GPRINT:pktsin@RRDIDX@:AVERAGE: %6.1lf (avg)\n
        LINE1:pktsout@RRDIDX@#@COLOR@:@RRDMETA@ OUT @RRDPARAM@    errors
        GPRINT:pktsout@RRDIDX@:LAST: %6.1lf (cur) \:
        GPRINT:pktsout@RRDIDX@:MAX: %6.1lf (max) \:
        GPRINT:pktsout@RRDIDX@:MIN: %6.1lf (min) \:
        GPRINT:pktsout@RRDIDX@:AVERAGE: %6.1lf (avg)\n
        LINE1:pktsdisc@RRDIDX@#@COLOR@:@RRDMETA@ OUT @RRDPARAM@ discarded
        GPRINT:pktsdisc@RRDIDX@:LAST: %6.1lf (cur) \:
        GPRINT:pktsdisc@RRDIDX@:MAX: %6.1lf (max) \:
        GPRINT:pktsdisc@RRDIDX@:MIN: %6.1lf (min) \:
        GPRINT:pktsdisc@RRDIDX@:AVERAGE: %6.1lf (avg)\n

[ifmib3]
        FNPATTERN ^ifmib.(.+).rrd
        TITLE Output queue
        YAXIS Packets
        DEF:qlen@RRDIDX@=@RRDFN@:ifOutQLen:AVERAGE
        LINE1:qlen@RRDIDX@#@COLOR@:@RRDMETA@ @RRDPARAM@ Queue length
        GPRINT:qlen@RRDIDX@:LAST: %6.1lf (cur) \:
        GPRINT:qlen@RRDIDX@:MAX: %6.1lf (max) \:
        GPRINT:qlen@RRDIDX@:MIN: %6.1lf (min) \:
        GPRINT:qlen@RRDIDX@:AVERAGE: %6.1lf (avg)\n
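
The CDEF lines in these definitions are written in RRDtool's reverse Polish notation, so "bytesin,8,*,300,/" means (bytesin * 8) / 300. As a worked example of the arithmetic only, here are the two traffic expressions evaluated for a made-up stored value of 3750000; cdef_bits and cdef_kbits are illustrative names, not xymon or rrdtool commands.

```shell
#!/bin/sh
# Evaluate the two traffic CDEFs for one hypothetical stored value.
cdef_bits()  { awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 8 / 300 }' ; }
cdef_kbits() { awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 8 / 300000 }' ; }

cdef_bits  3750000    # 3750000,8,*,300,/     prints 100000.0
cdef_kbits 3750000    # 3750000,8,*,300000,/  prints 100.0
```

The graph itself is drawn from the bits value, while the GPRINT figures in the legend use the value divided by a further 1000, which is why they carry the "K" suffix.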


