Friday 27 April 2012

Simple Monitoring of VPN links

I use this script to check the status of non-critical VPN links, but it works for anything where a simple ping test is sufficient and you can't be bothered setting up SNMP.

Of course, SNMP + Nagios is a better choice for mission-critical monitoring than simply pinging something as I do here.

Here is the script:
#!/bin/bash

TIMEOUT=5  # Number of failed attempts before sending email
STATUSDIR=/tmp/vpn # Status files are written here
ADDRESS=$1  # The target address to be monitored
EMAIL=$2  # Optional email address to send alerts to

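if [ ! -d "$STATUSDIR" ] ; then
 mkdir -p "$STATUSDIR"
fi

# First run for this address: start the failure counter at zero
if [ ! -f "$STATUSDIR/$ADDRESS" ] ; then
 echo "0" > "$STATUSDIR/$ADDRESS"
fi

# Failure count recorded by the previous run
EXPECTEDCOUNT=$(cat "$STATUSDIR/$ADDRESS")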

if ping -c 1 -w 5  "$ADDRESS" &>/dev/null ; then
 DOWNCOUNT=0
else
 DOWNCOUNT=$(($EXPECTEDCOUNT+1)) 
fi

echo "Expected count is :$EXPECTEDCOUNT"
echo "Down count is     :$DOWNCOUNT"

# Something has changed
if [ "$EXPECTEDCOUNT" -ne "$DOWNCOUNT" ] ; then
 if [ "$DOWNCOUNT" -eq 0 ] ; then
  STATUS="UP"
 else
  STATUS="DOWN"
 fi

 MSG="vpn-link: $ADDRESS is $STATUS (count=$DOWNCOUNT)"
 logger "$MSG"

 # Send email when the failure count first reaches TIMEOUT (link DOWN),
 # or when the link recovers after a DOWN alert was due (link UP).
 # A recovery before TIMEOUT failures is logged but not emailed.
 if [ "$DOWNCOUNT" -eq "$TIMEOUT" ] || \
    { [ "$DOWNCOUNT" -eq 0 ] && [ "$EXPECTEDCOUNT" -ge "$TIMEOUT" ] ; } ; then
  if [ -n "$EMAIL" ] ; then
   echo "$ADDRESS is $STATUS: Sending email"
   mail -s "$MSG" "$EMAIL" < /dev/null > /dev/null
  fi
 fi
 echo "$DOWNCOUNT" > "$STATUSDIR/$ADDRESS"
fi

To use it, place the script in a convenient location such as /usr/sbin, make it executable, and add an entry to your system crontab like this:

*  *    * * *    brett    /usr/sbin/vpn-mon 192.168.1.2 brett@example.com  >> /dev/null 2>&1


This pings 192.168.1.2 as user "brett" every minute and sends email alerts to brett@example.com.
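
Once the cron job is running, the script keeps one status file per monitored address under /tmp/vpn, holding the current count of consecutive failed pings. The following is a rough sketch of a way to summarise those files at a glance. It is not part of the script above, and the function name `show_status` is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper (not part of the monitoring script): summarise the
# status files it writes. Each file is named after the monitored address
# and holds the number of consecutive failed pings (0 = last ping succeeded).
show_status() {
 local dir=${1:-/tmp/vpn} file count

 for file in "$dir"/* ; do
  [ -f "$file" ] || continue   # nothing to report if the directory is empty
  count=$(cat "$file")
  if [ "$count" -eq 0 ] ; then
   echo "$(basename "$file"): UP"
  else
   echo "$(basename "$file"): DOWN ($count failed pings)"
  fi
 done
}

# Default to the STATUSDIR used by the script; pass another path to override.
show_status "${1:-/tmp/vpn}"
```

A link that has failed fewer times than the TIMEOUT setting will show as DOWN here even though no email has been sent yet.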

By default the script sends an email after the test has failed 5 consecutive times, and another when the link comes back up. To change the threshold, edit the TIMEOUT variable at the top of the script.
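
The alerting rule is easier to reason about when pulled out into a small function. The sketch below is only an illustration of the intended rule, not code from the script, and the name `should_alert` and its argument order are assumptions. It assumes a DOWN email goes out when the failure count first reaches the threshold, and an UP email only if the previous count had already reached it.

```shell
#!/bin/bash

# Hypothetical helper (not in the original script): decide whether a state
# change should trigger an email.
#   $1 = previous failure count (read from the status file)
#   $2 = new failure count (0 means the ping succeeded)
#   $3 = threshold (TIMEOUT in the script)
# Prints "DOWN", "UP" or "NONE".
should_alert() {
 local prev=$1 now=$2 threshold=$3

 if [ "$now" -eq "$threshold" ] ; then
  echo "DOWN"
 elif [ "$now" -eq 0 ] && [ "$prev" -ge "$threshold" ] ; then
  echo "UP"
 else
  echo "NONE"
 fi
}

should_alert 4 5 5   # fifth consecutive failure
should_alert 7 0 5   # recovery after the threshold was reached
should_alert 3 0 5   # recovery before the threshold was reached
```

With a threshold of 5, the three calls print DOWN, UP and NONE respectively, so a link that drops a few pings and then recovers never generates mail.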
