[sheepdog-users] Unexpected freeze of sheep on one node

Maxim Terletskiy terletskiy at emu.ru
Thu Nov 20 07:30:44 CET 2014


I've had similar problems.

In such cases iostat is a good test:
iostat -dx 5 /dev/sd[a-z]

If %util or await on one or more disks is abnormally high while there is no real disk activity on that server, then that is the problem disk. One such disk (a WD, by the way) was mocking me for almost a week before I unplugged it.
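
The check described above can be roughly automated. This is only a minimal sketch: the function name flag_busy_disks and the 90% threshold are my own assumptions, not something from this thread, and it looks at %util only, so you still have to judge yourself whether the server has real disk load at that moment. Column positions differ between sysstat versions, so the %util column is found by name in the header line.

```shell
#!/bin/sh
# Sketch: print every device whose %util is above a limit in `iostat -dx`
# output read from stdin. The limit (default 90) is an assumed value.
flag_busy_disks() {
    awk -v limit="${1:-90}" '
        # Header line: first field is "Device" or "Device:".
        # Locate the %util column by name instead of hard-coding it.
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # Data line: print device name and %util when above the limit.
        col && NF >= col && $col + 0 > limit { print $1, $col }
    '
}
```

Possible usage: iostat -dx 5 /dev/sd[a-z] | flag_busy_disks 90. Keep in mind that the first report iostat prints shows averages since boot, so only the later reports reflect the current state.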

> I notice such behavior in old disks: they work but they are much
> slower. No sudden death though.
> They are not very very old and they are all Western Digital
>
> root@sheepdog004:~# smartctl -A /dev/sda | grep -i hours
>    9 Power_On_Hours          0x0032   099   098   000    Old_age   Always       -       1247
> root@sheepdog004:~# smartctl -A /dev/sdb | grep -i hours
>    9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18299
> root@sheepdog004:~# smartctl -A /dev/sdc | grep -i hours
>    9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18283
> root@sheepdog004:~# smartctl -A /dev/sdd | grep -i hours
>    9 Power_On_Hours          0x0032   079   079   000    Old_age   Always       -       15689
>
>> You could set up nc (netcat) on port 7000 and try to send some data to it from another node, to see whether something is wrong in a router/switch specific to that port.
> I'll do it as soon as the smart test is done
>
>> But aside from that, flushing the node and adding it as a fresh node seems the most logical next step to me.
> Thank you very much for the support and for sharing your experience.



