I have certain infrastructure monitored by the parent company’s Nagios environment, so I’m not well versed in its setup or configuration. Recently, however, I’ve been receiving notifications for hosts where the host name does not match its actual defined value.
For example, I’ll receive an email stating:
Host: Office #2 (server #1)
where I would normally expect it to display “Office #1”.
This led me down a path of learning a bit about Nagios.
First, I browsed through the host monitoring pages to where the email notifications are displayed. There I was able to find the command used to populate the notification body:
/usr/bin/printf "%b" "** Nagios **\n\nAlert type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$ ($HOSTALIAS$)\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" |
The $HOSTNAME$ macro there is where the incorrect data was coming from. A Google search told me this is set in the host definition config file, which in my environment is located on the Nagios server here:
/opt/nagios/etc/hosts-standard.cfg
Finding my server definition in that file showed that it was entered correctly.
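For anyone who wants to check a host definition without reading the file by eye, here is a minimal Python sketch that pulls the host_name and alias out of each "define host" block. It assumes the standard Nagios object definition layout (one directive per line, with ";" starting an inline comment). The function name parse_host_definitions, the sample text, the template name, and the address are all made up for illustration and are not taken from my actual config.

```python
import re
from typing import Dict, List


def parse_host_definitions(text: str) -> List[Dict[str, str]]:
    """Return one dict of directive -> value per 'define host { ... }' block."""
    hosts = []
    # 'define hostgroup {' is not matched because 'host' must be followed by '{'.
    for block in re.findall(r"define\s+host\s*\{(.*?)\}", text, flags=re.DOTALL):
        directives = {}
        for raw in block.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # directive name, then the rest as value
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        hosts.append(directives)
    return hosts


# Hypothetical sample; in practice, read the real file contents instead, e.g.
# open("/opt/nagios/etc/hosts-standard.cfg").read()
sample_config = """
define host {
    use         generic-host      ; hypothetical template name
    host_name   Office #1
    alias       server #1
    address     192.0.2.10
}
"""

configured_hosts = parse_host_definitions(sample_config)
for host in configured_hosts:
    print(host.get("host_name"), "/", host.get("alias"))
```

If the host_name printed here matches what you expect, as it did in my case, the definition file is probably not the source of the wrong value.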
After trying a few more Google search terms, I got lucky and came across this link.
It appears that Nagios uses a “retention.dat” file that effectively caches old values, and this file is referenced during notifications.
In my environment, this file was located here:
/var/local/nagios-3.2.3/retention.dat
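To see whether the retention file still holds a host name that no longer exists in the configuration, something like the following Python sketch could help. It assumes the retention file is made up of blocks such as "host {" followed by key=value lines and a closing "}", which is my understanding of the Nagios 3.x format, so check it against your own file first. The function name retained_host_names and the sample data are invented for illustration.

```python
from typing import List


def retained_host_names(text: str) -> List[str]:
    """Collect host_name values from 'host { ... }' blocks in retention data."""
    names = []
    in_host_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "host {":
            in_host_block = True
        elif line == "}":
            in_host_block = False
        elif in_host_block and line.startswith("host_name="):
            names.append(line.split("=", 1)[1])
    return names


# Hypothetical sample; in practice, read the real retention file instead.
sample_retention = """
info {
created=0
}
host {
host_name=Office #2
plugin_output=PING OK
}
service {
host_name=Office #2
service_description=PING
}
"""

# Host names taken from the host definition file (assumed values).
configured = {"Office #1"}

# Names present in the retention data but absent from the configuration.
stale = sorted(set(retained_host_names(sample_retention)) - configured)
print(stale)
```

Any name that shows up in the stale list is a candidate for the cached value that ends up in the notification. The sketch only reads the file; it makes no changes to it.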
I’ve asked my Nagios administrator to update this file, and I’ll update this post if it proves to be successful.