This post is based on the official Zabbix wiki entry. However, I'm going to take a different approach here: the Zabbix agent never has to run smartctl through sudo. Instead, a root cron job dumps the SMART data to a file, and the agent only reads that file.

1) Create a script that cron will run regularly to poll SMART data from our disk. Let's call it smart_poller.sh

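#!/bin/bash

source /etc/profile

SMARTDIR="/usr/local/tmp"
FILE="smart_data.txt"

# Exactly one argument is expected: the device to query, e.g. /dev/sda.
if [ $# -ne 1 ]; then
    echo "Usage: $0 <device>" >&2
    exit 1
fi

# Create the output directory on first run.
mkdir -p "${SMARTDIR}"

# Use the device's base name (sda for /dev/sda) as the file name prefix.
DISK=$(basename "$1")

# Dump all of the SMART values; error output from smartctl is discarded.
smartctl -A "$1" > "${SMARTDIR}/${DISK}_${FILE}" 2>/dev/null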
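
Because cron and the Zabbix agent run independently, the agent may read the file at the moment the shell redirection has truncated it and smartctl has not yet written the new values. One possible way around this, shown here only as a sketch, is to write to a temporary file and rename it into place. The helper name write_atomically and the demo directory are my own assumptions, not part of the script above.

```shell
#!/bin/bash

# write_atomically is a hypothetical helper, not part of the original script.
# It copies stdin into a temporary file next to the target and then renames it
# into place, so a reader sees either the old file or the complete new one.
write_atomically() {
    local target="$1" tmp
    tmp="$(mktemp "${target}.XXXXXX")" || return 1
    if cat > "$tmp"; then
        # mktemp creates the file with mode 0600; make it readable for the
        # unprivileged user the Zabbix agent runs as.
        chmod 644 "$tmp"
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demo in a throwaway directory instead of /usr/local/tmp.
demo_dir="$(mktemp -d)"
printf 'first\n'  | write_atomically "${demo_dir}/sda_smart_data.txt"
printf 'second\n' | write_atomically "${demo_dir}/sda_smart_data.txt"
cat "${demo_dir}/sda_smart_data.txt"
```

In the poller, the last line could then pipe the smartctl output into this helper instead of redirecting straight into the data file.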

2) Move the script into the /usr/local/bin dir and make it executable:

mv smart_poller.sh /usr/local/bin/
chmod +x /usr/local/bin/smart_poller.sh

3) As root, add a crontab entry for the poller (for example with crontab -e). We will poll the data every 5 minutes.

*/5 * * * * /usr/local/bin/smart_poller.sh /dev/sda

4) Now let's tell Zabbix to read the data stored in ${SMARTDIR}. Edit the zabbix_agentd.conf file and append the following lines, each UserParameter on a single line:

UserParameter=hdd.sda.temperature,awk '/Temperature_Celsius/ {print $10}' /usr/local/tmp/sda_smart_data.txt
UserParameter=hdd.sda.reallocated_sector_count,awk '/Reallocated_Sector_Ct/ {print $10}' /usr/local/tmp/sda_smart_data.txt
UserParameter=hdd.sda.reported_uncorrect,awk '/Reported_Uncorrect/ {print $10}' /usr/local/tmp/sda_smart_data.txt
UserParameter=hdd.sda.offline_uncorrectable,awk '/Offline_Uncorrectable/ {print $10}' /usr/local/tmp/sda_smart_data.txt
UserParameter=hdd.sda.current_pending_sector,awk '/Current_Pending_Sector/ {print $10}' /usr/local/tmp/sda_smart_data.txt
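
If you monitor more than one disk, typing these entries by hand gets tedious. The sketch below prints the same five entries for any disk name. The function name print_smart_userparams and its optional second argument are assumptions of mine; the key names and the file layout are taken from the entries above.

```shell
#!/bin/bash

# print_smart_userparams is a hypothetical helper. It prints one UserParameter
# line per monitored SMART attribute for the given disk name (e.g. sdb).
# The optional second argument is the data directory written by the poller.
print_smart_userparams() {
    local disk="$1" dir="${2:-/usr/local/tmp}"
    local pair key attr
    for pair in \
        temperature:Temperature_Celsius \
        reallocated_sector_count:Reallocated_Sector_Ct \
        reported_uncorrect:Reported_Uncorrect \
        offline_uncorrectable:Offline_Uncorrectable \
        current_pending_sector:Current_Pending_Sector
    do
        key="${pair%%:*}"    # part before the colon: Zabbix key suffix
        attr="${pair#*:}"    # part after the colon: SMART attribute name
        printf "UserParameter=hdd.%s.%s,awk '/%s/ {print \$10}' %s/%s_smart_data.txt\n" \
            "$disk" "$key" "$attr" "$dir" "$disk"
    done
}

print_smart_userparams sdb
```

The output can be reviewed and then appended to the agent configuration; remember that the second disk also needs its own crontab entry.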

5) OK, now it is test time. Restart the Zabbix agent so that it picks up the new parameters, then run from the Zabbix server:

$ zabbix_get -s ip_or_hostname -k hdd.sda.temperature
41
$
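
If zabbix_get returns an empty value, it helps to check the awk extraction on its own. The sketch below runs it against two made-up rows laid out like smartctl -A output; the values are illustrative only, not real measurements. The helper smart_raw_value is hypothetical, and unlike the UserParameter entries it compares the attribute name exactly against the second column instead of using a regular expression.

```shell
#!/bin/bash

# Illustrative sample only: a header and two made-up rows in the column
# layout of `smartctl -A` (the raw value is the 10th column).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   041   053   000    Old_age   Always       -       41
EOF

# smart_raw_value is a hypothetical helper: it prints the raw value (field 10)
# of the row whose attribute name (field 2) equals the first argument.
smart_raw_value() {
    awk -v attr="$1" '$2 == attr {print $10}' "$2"
}

smart_raw_value Temperature_Celsius "$sample"
smart_raw_value Reallocated_Sector_Ct "$sample"
```

On the monitored host you would point the same function at the real data file, e.g. /usr/local/tmp/sda_smart_data.txt.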

Yes - success! You can now configure the Zabbix server to make use of the exported SMART data. To do so, first create an item for the monitored host, like the one below:

Name: HDD SDA Temperature
Type: Zabbix agent
Key: hdd.sda.temperature
Type of info: Numeric
Data type: Decimal
Units: °C
Update interval: 300

Create a separate item for each monitored parameter.
Next, create triggers so that you get notified when something unexpected, and possibly deadly for your data, happens.
Example trigger:

Name: HDD SDA Temperature TRIGGER
Expression: {Srv:hdd.sda.temperature.last(0,300)}>45
Severity: Average

Comments

Marco 4 years, 2 months ago

Thank you

Very useful and work like a charm!

mescanef 4 years, 2 months ago

Hi, Marco, thanx for reading! :-)

regards,
mescanef
