This is a stats collector that lets xymon monitor a Varnish cache. Varnish acts as a cache in front of backend servers, and can include rules governing what gets cached, how cookies are cleared or modified, and so on.
The code below is placed in a "varnish.sh" script in the "/usr/lib/xymon/client/ext" directory for a standard package installation of the client. Remember to make the script executable before scheduling it: "chmod 755 /usr/lib/xymon/client/ext/varnish.sh". Note that the script is written so that simply running it at the command line displays the xymon data on the terminal.
#!/bin/sh
###########################################################################
#
# Collect varnish stats, but only if varnish is running.
#
###########################################################################
#
# Set up some base variables
#
COLUMN="varnish"
export COLUMN
# When run by hand (no xymon environment), just echo the messages
if [ -z "$XYMSRV" ] ; then
    XYMSRV=""
    XYMON="echo"
fi
if [ -z "$MACHINE" ] ; then
    MACHINE=`/bin/uname -n`
fi
COLOR="green" # overall color
suppmsg=""    # extra warning text, set only when the color changes
#
# Collect the data
#
if [ -x /usr/bin/varnishstat ] ; then
    # Count running varnishd processes, ignoring the grep itself
    chk=`/bin/ps -ef | /bin/grep varnishd | /bin/grep -vc grep`
    if [ "$chk" -gt 0 ] ; then
        rawdata=`/usr/bin/varnishstat -1 2>/dev/null`
        # Status text: the value, then the description (fields 4 onwards)
        payload=`echo "${rawdata}" | grep -E "client_req |cache_|n_lru_nuked|sess_queued|sess_dropped" | awk ' { printf ("%6s ", $2); for (a=4;a<=NF;a++) printf(" %s", $a); printf ("\n") } '`
        # Sum the problem counters; adding 0 prints 0 instead of an empty string
        errval=`echo "${rawdata}" | grep -E "n_lru_nuked|sess_queued|sess_dropped" | awk ' { cnt += $2 } END { print cnt + 0 }'`
        if [ "${errval}" -gt 0 ] ; then
            COLOR="yellow"
            suppmsg="&yellow check for nuked objects and queued or dropped sessions, cache capacity may be too small?"
        fi
        # The message body stays flush left so no leading spaces are sent
        ${XYMON} ${XYMSRV} "status ${MACHINE}.${COLUMN} ${COLOR} `date` - Varnish Cache
${payload}
${suppmsg}
"
        # Build the RRD data sections, one per graph
        payload=`echo "${rawdata}" | awk '
            /backend_conn / { backend = sprintf ("%sDS:conn:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_unhealthy / { backend = sprintf ("%sDS:unhealthy:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_busy / { backend = sprintf ("%sDS:busy:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_fail / { backend = sprintf ("%sDS:fail:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_reuse / { backend = sprintf ("%sDS:reuse:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_retry / { backend = sprintf ("%sDS:retry:DERIVE:600:0:U %s\n", backend, $2) }
            /backend_recycle / { backend = sprintf ("%sDS:recycle:DERIVE:600:0:U %s\n", backend, $2) }
            /client_req / { requests = sprintf ("%sDS:req:DERIVE:600:0:U %s\n", requests, $2) }
            /backend_req / { requests = sprintf ("%sDS:backend_req:DERIVE:600:0:U %s\n", requests, $2) }
            /cache_hit / { cache = sprintf ("%sDS:hit:DERIVE:600:0:U %s\n", cache, $2) }
            /cache_hitpass / { cache = sprintf ("%sDS:hitpass:DERIVE:600:0:U %s\n", cache, $2) }
            /cache_miss / { cache = sprintf ("%sDS:miss:DERIVE:600:0:U %s\n", cache, $2) }
            END {
                printf ("[varnish_backend_connections.rrd]\n")
                printf ("%s\n", backend)
                printf ("[varnish_requests.rrd]\n")
                printf ("%s\n", requests)
                printf ("[varnish_cache.rrd]\n")
                printf ("%s\n", cache)
            }'`
        ${XYMON} ${XYMSRV} "data ${MACHINE}.trends
${payload}
"
    fi
fi
exit 0
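The yellow threshold in the script can be tried in isolation, without a running Varnish. The sketch below pushes a few sample lines in the "name value rate description" layout through the same kind of awk sum. The sample names and values are made up for illustration and are not real varnishstat output.

```shell
# Hypothetical sample in the "name value rate description" layout.
sample='MAIN.client_req 1200 4.00 Good client requests received
MAIN.n_lru_nuked 2 0.00 Number of LRU nuked objects
MAIN.sess_queued 1 0.00 Sessions queued for thread
MAIN.sess_dropped 0 0.00 Sessions dropped for thread'

# Sum the three problem counters; "+ 0" forces a number even when nothing matches.
errval=$(printf '%s\n' "$sample" \
    | grep -E 'n_lru_nuked|sess_queued|sess_dropped' \
    | awk '{ cnt += $2 } END { print cnt + 0 }')

# Any non-zero total turns the status yellow.
if [ "$errval" -gt 0 ]; then color="yellow"; else color="green"; fi
echo "errval=$errval color=$color"
```

With this sample the total is 2 + 1 + 0, so the sketch reports errval=3 and yellow.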
Add the lines below to the client machine's xymon schedule, /var/run/xymon/clientlaunch-include.cfg. With this entry, the client process log should be available in /var/log/xymon/varnish.log.
[varnish]
ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
CMD $XYMONCLIENTHOME/ext/varnish.sh
LOGFILE $XYMONCLIENTLOGS/varnish.log
INTERVAL 5m
The xymon user most probably does not have access to the varnish cache data, but this can be rectified by making xymon a member of the varnish group and then restarting the client.
sudo usermod -a -G varnish xymon
sudo systemctl restart xymon-client
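To confirm the group change took effect, the membership can be checked from the shell. This is a minimal sketch; `in_group` is a hypothetical helper name, and the user and group names are the ones used above.

```shell
# in_group USER GROUP: succeeds when USER is a member of GROUP.
in_group() {
    # id -nG prints the group names on one line; split them and match exactly.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

if in_group xymon varnish; then
    echo "xymon is in the varnish group"
else
    echo "xymon is NOT in the varnish group yet"
fi
```

A running client only picks up the new group after it is restarted, which is why the restart follows the usermod command.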
In order to get the varnish data included in a graph, some server-side changes need to be made. These include adjusting the server configuration to show the graph on the "varnish" check and on the "trends" check. The graph definitions also need to be created.
After adding graph data to xymon for the first time, allow up to 20 minutes before expecting the first data to be graphed. The system needs to recognise the changes to the graph definitions, which it will generally do automatically. It then also needs time to collect the initial start and end points of a graph's data.
There are two variables in /etc/xymon/xymonserver.cfg which are of interest. Both contain comma-separated lists of values within quotes. If you change this file, be sure your new test is inside the quotes and is separated from the other fields by a comma. Do not use spaces in these variables.
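The rule above (quoted, comma separated, no spaces) is easy to get wrong when editing by hand. Here is a minimal sketch of an append helper; `append_csv` is a hypothetical name and the list contents are illustrative only. In a stock Xymon server the two variables are usually TEST2RRD and GRAPHS, but that is an assumption here, since the text does not name them.

```shell
# append_csv LIST ITEM: print LIST with ITEM appended, comma separated and
# without spaces. If ITEM is already in LIST, LIST is printed unchanged.
append_csv() {
    list=$1
    item=$2
    case ",${list}," in
        *",${item},"*) printf '%s\n' "${list}" ;;
        *)
            if [ -z "${list}" ]; then
                printf '%s\n' "${item}"
            else
                printf '%s\n' "${list},${item}"
            fi
            ;;
    esac
}

# Illustrative value only, not a real server default.
graphs="la,disk,memory"
graphs=$(append_csv "$graphs" "varnish_requests")
graphs=$(append_csv "$graphs" "varnish_requests")   # second call changes nothing
echo "\"$graphs\""
```

The same helper would apply to either variable; the resulting string is what goes between the quotes in the config file.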
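The graph definitions themselves go in a file in the /etc/xymon/graphs.d/ directory, here /etc/xymon/graphs.d/varnish.cfg:
[varnish_requests]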
FNPATTERN ^varnish_requests.rrd
TITLE Varnish requests
YAXIS avg requests/sec
DEF:req@RRDIDX@=@RRDFN@:req:AVERAGE
DEF:backend_req@RRDIDX@=@RRDFN@:backend_req:AVERAGE
AREA:req@RRDIDX@#@COLOR@:Client requests
GPRINT:req@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:req@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:req@RRDIDX@:MAX: \: %5.1lf (max)\n
AREA:backend_req@RRDIDX@#@COLOR@:Backend requests
GPRINT:backend_req@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:backend_req@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:backend_req@RRDIDX@:MAX: \: %5.1lf (max)\n
[varnish_cache]
FNPATTERN ^varnish_cache.rrd
TITLE Varnish cache
YAXIS avg requests/sec
DEF:hit@RRDIDX@=@RRDFN@:hit:AVERAGE
DEF:hitpass@RRDIDX@=@RRDFN@:hitpass:AVERAGE
DEF:miss@RRDIDX@=@RRDFN@:miss:AVERAGE
LINE1:hit@RRDIDX@#1E940F:Cache hit
GPRINT:hit@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:hit@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:hit@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:hitpass@RRDIDX@#718C0E:Cache hitpass
GPRINT:hitpass@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:hitpass@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:hitpass@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:miss@RRDIDX@#B31B00:Cache miss
GPRINT:miss@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:miss@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:miss@RRDIDX@:MAX: \: %5.1lf (max) \n
[varnish_backend_connections]
FNPATTERN ^varnish_backend_connections.rrd
TITLE Varnish backend connections
YAXIS avg connections/sec
DEF:conn@RRDIDX@=@RRDFN@:conn:AVERAGE
DEF:unhealthy@RRDIDX@=@RRDFN@:unhealthy:AVERAGE
DEF:busy@RRDIDX@=@RRDFN@:busy:AVERAGE
DEF:fail@RRDIDX@=@RRDFN@:fail:AVERAGE
DEF:reuse@RRDIDX@=@RRDFN@:reuse:AVERAGE
DEF:retry@RRDIDX@=@RRDFN@:retry:AVERAGE
DEF:recycle@RRDIDX@=@RRDFN@:recycle:AVERAGE
LINE1:conn@RRDIDX@#605C59:Conn success \t\t
GPRINT:conn@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:conn@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:conn@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:unhealthy@RRDIDX@#D2AE84:Conn not attempted \t
GPRINT:unhealthy@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:unhealthy@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:unhealthy@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:busy@RRDIDX@#C9C5C0:Conn too many \t\t
GPRINT:busy@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:busy@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:busy@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:fail@RRDIDX@#9F3E81:Conn failures \t\t
GPRINT:fail@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:fail@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:fail@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:reuse@RRDIDX@#C6BE91:Conn reuses \t\t
GPRINT:reuse@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:reuse@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:reuse@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:retry@RRDIDX@#FD7F00:Conn retry \t\t
GPRINT:retry@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:retry@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:retry@RRDIDX@:MAX: \: %5.1lf (max) \n
LINE1:recycle@RRDIDX@#6E4E40:Conn recycles \t\t
GPRINT:recycle@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:recycle@RRDIDX@:AVERAGE: \: %5.1lf (avg)
GPRINT:recycle@RRDIDX@:MAX: \: %5.1lf (max) \n
If /etc/xymon/graphs.cfg does not already include everything in the /etc/xymon/graphs.d/ directory, add an entry for this file:
include /etc/xymon/graphs.d/varnish.cfg
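The graphs above are labelled per second even though varnishstat reports ever-growing counters. That is because the script declares each data source as DERIVE, so RRD stores the rate of change between samples, not the raw value. In "DS:req:DERIVE:600:0:U" the 600 is the heartbeat in seconds, 0 is the minimum (a negative rate, such as after a counter reset, is stored as unknown) and U leaves the maximum unbounded. The arithmetic can be sketched as follows; `rate` is a hypothetical helper and the sample counter values are made up.

```shell
# rate PREV CUR SECONDS: per-second rate between two counter samples.
rate() {
    awk -v prev="$1" -v cur="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", (cur - prev) / secs }'
}

# Hypothetical client_req samples taken 300 seconds (one 5m interval) apart.
rate 1500 2100 300
```

Here 600 extra requests over 300 seconds gives 2.0 requests per second, which is the kind of value the "req" line plots.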