[sheepdog-users] Sheepdog 0.9 missing live migration feature

Wed May 13 12:04:30 CEST 2015

On Tue, May 12, 2015 at 01:09:49PM +0200, Walid Moghrabi wrote:
> Hi,
> 
> Just tried 0.9.2_rc0 and it is working as expected!
> Live migration between nodes is working again, and live migration between storage is working too!
> So far, I didn't encounter any problem with this release.
> 
> >I'd suggest MD + Object Cache + Dual NICs. Since you make use of Object cache,
> >there is no need to enable '-n'. Basically, you might take the following as an example:
> 
> > #-w
> > #256G is just a placeholder, you can adjust it on your own. If you find
> > #performance is not good enough, you can try turning off 'directio', then object
> > #cache code will take advantage of memory as the cache tier. But this might
> > #require you to tune some kernel memory flush settings for smooth performance.
> 
> > #/meta should be put on a RAID since it is a single point of failure. MD will take
> > #care of your disk1,disk2,disk3. The "--directio" at the end means don't use
> > #memory for backend store. '-n' would be helpful if you find overall performance
> > #sometimes drops. '-n' in this case will affect the performance of object
> > #cache when it is doing flush-back of the dirty data.
> 
> > #-c
> > #for cluster driver, I'd suggest zookeeper
> 
> > sheep -w size=256G,dir=/path/to/ssd,directio -i 'nic ip for IO' -y 'your main nic ip' \
> >       /meta,/disk1,/disk2,/disk3 -c xxx --directio
> 
> Ok, I'll try with these settings, I just have a few questions :
> You say that /meta should be located on a RAID device because it is a SPOF ... does that mean that if /meta crashes for some reason, the whole node is crashed?
> If so, can I rely on Sheepdog's redundancy? I mean, if I lose one node in my 9 node cluster, that shouldn't be a problem, right? So I don't really see any problem with leaving this as a SPOF (my point is that I was thinking of leaving this on the dedicated SSD, which is not in a RAID configuration).

Yes, you can rely on sheepdog's redundancy for sure.

> If I understand correctly, directio performance depends on the underlying physical storage, so on a decent SSD this could give good results, right?

It depends, so I'd suggest you test with and without the option.

> I don't really understand the "-n" thing ... can you explain to me what it does and in which cases it is recommended to enable it?

It drops 'O_SYNC' when opening the file, so your data might not survive a power
outage in your data center. Most of the time, though, that won't corrupt your data
because of the redundancy provided by sheepdog. I'd recommend enabling it if you
see a clear improvement in performance.
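
To get a feel for what O_SYNC costs on your disks, a rough comparison like the
one below may help. It is only a sketch and not what sheep does internally: it
assumes GNU dd (whose 'oflag=sync' opens the output file with O_SYNC), and the
temporary directory, block size and count are arbitrary placeholders.

```shell
#!/bin/sh
# Rough comparison of O_SYNC writes vs. buffered writes (GNU dd assumed).
# Paths and sizes are placeholders; use a directory on the disk you care about.
tmp=$(mktemp -d)

# oflag=sync opens the output with O_SYNC: every write waits for the device.
dd if=/dev/zero of="$tmp/sync.img" bs=4k count=256 oflag=sync 2>"$tmp/sync.log"

# Without the flag, writes go to the page cache and are flushed later.
dd if=/dev/zero of="$tmp/buffered.img" bs=4k count=256 2>"$tmp/buffered.log"

# The last line of each log is dd's own throughput summary.
tail -n 1 "$tmp/sync.log" "$tmp/buffered.log"
```

A large gap between the two numbers suggests '-n' could make a visible
difference on that storage; a small gap suggests it is not worth the risk.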

> Last, I'm still wondering which settings would be best for formatting the cluster with Erasure Code. I have 9 nodes and I'd like to find a good balance between performance, capacity and safety.
> As I understand it, in the x:y tuple, I can't have fewer than x nodes alive for the cluster to remain functional, and I can lose y nodes at the same time with no data loss and no downtime ... right?
> The documentation is not very clear concerning x ... there (https://github.com/sheepdog/sheepdog/wiki/Erasure-Code-Support) it is written that x must be in the 2,4,8,16 range (so, a power of 2) but there (http://www.sheepdog-project.org/doc/redundancy_level.html), it is written that it can be a multiple of 2 (2,4,6,8,10,12,...) ... Which one is right?

Power of 2 is right, because x must be a divisor of 4M.
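
The divisor rule is easy to check with a few lines of shell. This is just an
arithmetic sketch: it takes 4M to mean 4 MiB (4 * 1024 * 1024 bytes), which is
an assumption, and the candidate list is only an example.

```shell
#!/bin/sh
# Which candidate x values divide 4M evenly?
# Assumption: 4M means 4 MiB = 4 * 1024 * 1024 bytes.
object_size=$((4 * 1024 * 1024))
valid=""
for x in 2 4 6 8 10 12 16; do
    # x qualifies only if the object splits into x equal-sized strips.
    if [ $((object_size % x)) -eq 0 ]; then
        valid="$valid $x"
    fi
done
echo "x values that divide 4M evenly:$valid"
```

Since 4M is itself a power of 2, its only divisors are powers of 2, which is
why 6, 10 and 12 drop out.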

> In my case, I was hesitating between these settings:
> - 4:2 because I can run with very few nodes, but I can lose only 2 nodes, and the ratio is 0.5, which seems a good balance (only 1.5x the vdi size as real storage usage and recovery divided by 2)
> - 6:3 which would give me the same ratio but would give me the ability to lose 3 nodes (but I'll need to have 2/3 of my cluster still alive, so in that case it just fits my current configuration)

6:3 isn't a valid tuple because 6 is not a power of 2, so you can only choose
4:2, which I think is reasonable for your setup.
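
For reference, the numbers behind 4:2 on your 9 nodes work out as below. This
is a back-of-the-envelope sketch only: the variable names are made up for
illustration, and the 'dog' command in the comment is the form I recall from
the Erasure-Code-Support wiki page, so please check it against your version
before running it.

```shell
#!/bin/sh
# Back-of-the-envelope numbers for an x:y erasure code policy.
x=4        # data strips
y=2        # parity strips, i.e. how many strips can be lost
nodes=9    # cluster size from this thread

strips=$((x + y))                      # strips written per object
overhead_pct=$(( (x + y) * 100 / x ))  # raw usage as a percentage of vdi size

echo "strips per object: $strips (cluster has $nodes nodes)"
echo "raw storage usage: ${overhead_pct}% of the vdi size"
echo "tolerated strip losses: $y"

# Formatting with this policy is expected to look like (not executed here):
#   dog cluster format -c 4:2
```

So 4:2 gives the 1.5x usage you calculated, and 6 strips fit comfortably in
a 9 node cluster.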

Thanks,
Yuan