[sheepdog] zookeeper quitting unexpectedly causes recovering from journal file failed
Liu Yuan
namei.unix at gmail.com
Thu May 23 07:37:06 CEST 2013
On 05/23/2013 01:29 PM, Hongyi Wang wrote:
> Hi,
>
> This is a follow-up to our last test. One sheep node in our cluster
> timed out connecting to zookeeper, so we tried to restart sheep on that node.
> However, sheep could not be started successfully.
> I am not sure whether a zk connection timeout could somehow cause recovery
> from the journal file to fail. Is this a bug in journal replay?
I guess it is a bug in journal replay for some corner case. Passing 'skip'
to -j, or simply removing the files in the journal dir, will let the sheep
start again.
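If you would rather keep the journal files around for debugging than delete them, a minimal Python sketch along these lines moves them out of the journal dir instead. The function name and both directory arguments are hypothetical; pass whatever journal dir your sheep was started with, and run it only while sheep is stopped on that node.

```python
import shutil
from pathlib import Path


def set_aside_journal(journal_dir, backup_dir):
    """Move every regular file out of journal_dir into backup_dir.

    Both paths are assumptions supplied by the caller; nothing here
    knows where sheep actually keeps its journal.
    Returns the sorted list of file names that were moved.
    """
    src = Path(journal_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)

    moved = []
    for entry in sorted(src.iterdir()):
        # Leave subdirectories alone; only plain files are moved.
        if entry.is_file():
            shutil.move(str(entry), str(dst / entry.name))
            moved.append(entry.name)
    return moved
```

With the journal dir emptied this way, sheep has nothing to replay on the next start, and the moved files remain available for tracking down the replay bug.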
Thanks,
Yuan