[sheepdog-users] Several difficulties with sheepdog (from 0.4.0-0+tek2b-10 deb package)

Bastian Scholz ScholzB at T-Online.de
Thu Jul 26 16:21:03 CEST 2012

Hi David,

At the moment I shut down the complete cluster before
updating it (collie cluster shutdown).
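
As a rough sketch of that order of operations (the node names, the
use of ssh and the package name "sheepdog" are only assumptions for
illustration, and the script just prints the commands instead of
running them):

```shell
#!/bin/sh
# Sketch only: prints a shutdown-first upgrade plan, it does not execute it.
# Node names, ssh access and the package name "sheepdog" are assumptions.

plan_upgrade() {
    # 1. Stop the whole cluster cleanly, once, from any one node.
    printf '%s\n' "collie cluster shutdown"
    # 2. Upgrade the package on each node while the cluster is stopped.
    for node in "$@"; do
        printf '%s\n' "ssh $node apt-get install sheepdog"
    done
    # 3. Start the service on each node again (may be redundant if the
    #    package upgrade already restarted it).
    for node in "$@"; do
        printf '%s\n' "ssh $node service sheepdog start"
    done
}

plan_upgrade node1 node2 node3
```

The point is only that the shutdown happens once, before any node is
touched, instead of every node restarting its daemon independently
during the upgrade.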

But I have also had some complete data losses with the current
Debian package. At the moment I am trying to understand why
this happens...



On 2012-07-26 15:54, David Douard wrote:
> Hi,
> I'm trying the latest deb package made by Jens, and I encounter
> problems: I cannot make the cluster accept IO.
> My main problem is that I find it very easy to lose my cluster; almost
> every time I try to shut down the cluster, it ends up corrupted (with
> messages like "Failed to read object 805a6c0500000000 No object
> found").
> I lost the data when I upgraded the deb packages, for example: in that
> context I use a cssh session, so all nodes are upgraded at the same
> time, and the upgrade triggers a restart of the sheepdog service.
> Is this behavior somewhat expected, since I do not follow some kind of
> "good practices"? What are these good practices? Is sheepdog
> "compatible" with sysadmin automation tools like puppet or salt (which
> propagate changes to several nodes at a time)? How can I configure
> something like automatic shutdown on power outage (I'm using apcupsd)?
> How do I restart my cluster after a shutdown? Can I just fire a
> "service sheepdog start" in a cssh session?
> I guess these questions are also somewhat related to the discussion
> about the possibility for sheepdog to detect that a node is down for a
> short while, and not really failed, etc.
> So, how do you guys manage your sheepdog clusters so you don't lose
> your data?

