[sheepdog-users] sheepdog cluster health monitoring

Tue Feb 11 19:01:24 CET 2014

2014-02-11 18:26 GMT+01:00 Maxim Terletskiy <terletskiy at emu.ru>:

> Is there any way to check cluster health status? Can I see how much
> objects currently is under goal of replication?

dog node recovery

shows the nodes that are currently receiving/rebuilding data.

E.g.

dog node recovery
Nodes In Recovery:
  Id   Host:Port         V-Nodes       Zone       Progress
   0   192.168.10.4:7000     107   67807424        1.0%
   1   192.168.10.5:7000     207   84584640        0.1%
   2   192.168.10.6:7000      97  101361856        0.6%
   3   192.168.10.7:7000     101  118139072        3.9%
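
If you only want the per-node progress out of that output, a small filter along these lines might do. This is just a sketch: recovery_summary is a made-up helper name, and it assumes the column layout shown above (numeric Id in the first column, Progress in the last).

```shell
#!/bin/bash
# Hypothetical helper: reads `dog node recovery` output on stdin and prints
# one "host:port progress" pair per recovering node.
recovery_summary() {
    # Data rows start with a numeric Id; the two header lines do not.
    awk '$1 ~ /^[0-9]+$/ { print $2, $NF }'
}

# Usage:
#   dog node recovery | recovery_summary
```

An empty result means no node is listed as recovering.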

> Maybe someone have scripts for nagios/zabbix?

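For Nagios, I think something like this should work:

#!/bin/bash
# Assumes dog prints only the two header lines when nothing is recovering.
rows=$(dog node recovery | wc -l)
if [ "$rows" -ne 2 ]
then
    echo "WARNING: cluster is recovering data"
    exit 1    # Nagios WARNING
fi
echo "OK: no recovery in progress"
exit 0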

This way you'll get a "yellow" warning while the cluster is rebuilding.
You might have to deal with execution permissions, though.
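
A slightly more defensive variant could count the data rows instead of the total number of lines, and report UNKNOWN when dog itself fails. Treat it as a sketch: DOG_CMD is a hypothetical variable I'm introducing so you can set it to e.g. "sudo dog" if the nagios user needs extra permissions, and I'm assuming dog exits with a non-zero status when it cannot talk to the local sheep.

```shell
#!/bin/bash
# Sketch of a Nagios-style check. DOG_CMD is a hypothetical override so the
# command (e.g. "sudo dog") can be changed without editing the function.
DOG_CMD=${DOG_CMD:-dog}

check_recovery() {
    local output nodes
    # Assumption: dog returns non-zero if it cannot reach sheep.
    if ! output=$($DOG_CMD node recovery 2>&1); then
        echo "UNKNOWN: '$DOG_CMD node recovery' failed"
        return 3
    fi
    # Count data rows: lines whose first field is a numeric node Id.
    nodes=$(printf '%s\n' "$output" | awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }')
    if [ "$nodes" -gt 0 ]; then
        echo "WARNING: $nodes node(s) in recovery"
        return 1
    fi
    echo "OK: no nodes in recovery"
    return 0
}

# To use it as a plugin, end the script with:
#   check_recovery; exit $?
```

The return values follow the usual Nagios plugin convention (0 OK, 1 WARNING, 3 UNKNOWN).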

Another check you may want to run is whether the sheep daemon is running.
This can be done in two different ways:

1) as an NRPE plugin (installed on the sheep nodes)

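nano check_sheep
#!/bin/bash
# -x matches the exact name, so this script (check_sheep) does not match itself
pgrep -x sheep > /dev/null || { echo "CRITICAL: sheep is not running"; exit 2; }
echo "OK: sheep is running"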

2) checking the service
Create the command check_sheep:
/usr/lib/nagios/plugins/check_tcp -H $HOSTADDRESS$ -p 7000

I would also recommend monitoring ZooKeeper:

command check_zookeeper
/usr/lib/nagios/plugins/check_tcp -H $HOSTADDRESS$ -p 2181
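
A TCP check only tells you the port is open. If you want something a bit deeper, ZooKeeper answers the four-letter command "ruok" with "imok" when it is running without errors. A rough sketch follows, where zk_ruok_status is a hypothetical helper name, nc is assumed to be installed, and on newer ZooKeeper versions the command may need to be whitelisted first:

```shell
#!/bin/bash
# Sketch: map a ZooKeeper "ruok" reply to a Nagios status.
# zk_ruok_status is a hypothetical helper name.
zk_ruok_status() {
    local reply=$1
    if [ "$reply" = "imok" ]; then
        echo "OK: ZooKeeper answered imok"
        return 0
    fi
    echo "CRITICAL: unexpected ZooKeeper reply '${reply}'"
    return 2
}

# Usage (host and port are examples only):
#   zk_ruok_status "$(echo ruok | nc -w 2 192.168.10.4 2181)"
```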