[sheepdog] [PATCH v2 05/10] work: try to create worker threads in worker_thread_request_done

Wed May 15 07:37:35 CEST 2013

That doesn't sound much different: the SPOF is now just in the collie
client rather than in one of the sheep.
The chance of a single sheep failing is roughly the same as the chance
of the collie client failing.

I think having each node connect directly to the backup storage would
be a big win though, rather than creating a star topology around the
collie client, where collie has to download everything first.
Whether the backup is coordinated by collie or by sheep doesn't seem
that important to me at this point, as long as it can tolerate nodes
failing while the backup is running and ensures that all objects
eventually make their way to the backup storage.
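
To make the requirement concrete, here is a rough sketch of the behaviour
I mean. It is only a toy model, not sheepdog code: run_backup, the
replicas map and the push callback are all names made up for
illustration. Each object is pushed to backup storage directly by a node
that holds a copy, and if that node fails the coordinator falls back to
the next replica holder.

```python
# Toy model only: run_backup, replicas and push are hypothetical names,
# not part of sheepdog or collie.
def run_backup(replicas, push):
    """Back up every object via some node that holds a copy of it.

    replicas: dict mapping object id -> list of nodes holding a replica.
    push(node, oid): asks `node` to upload `oid` directly to backup
        storage; raises ConnectionError if the node has failed.
    Returns (backed_up, stranded) as two sets of object ids.
    """
    dead = set()        # nodes seen failing during this backup run
    backed_up = set()
    stranded = set()    # objects with no surviving replica holder
    for oid, holders in replicas.items():
        for node in holders:
            if node in dead:
                continue
            try:
                push(node, oid)
            except ConnectionError:
                dead.add(node)   # skip this node for later objects too
                continue
            backed_up.add(oid)
            break
        else:
            # Every holder was dead or failed while pushing.
            stranded.add(oid)
    return backed_up, stranded
```

The coordinator here only hands out work and tracks what is done; the
object data never passes through it, which is the point of avoiding the
star topology.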

On Tue, May 14, 2013 at 10:28 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> On 05/14/2013 06:01 PM, Joseph Glanville wrote:
>> Aye. I was thinking something more separate though.
>>
>> The implementation I had in mind was something more like this:
>> - Only backs up readonly objects, if you want a VDI to be protected by
>> a backup create a snapshot of it before starting cluster backup
>> - Cluster backup is initiated and managed by a single node (master),
>> probably the node the command is issued on.
>> - Master first creates a copy of the VDI tree.
>> - Master instructs the sheep nodes to begin hashing any currently
>> unhashed readonly objects
>
> No, master will be the SPOF. We'll implement a client-server style
> mechanism that collie will connect to any sheep he is interested to grab
> all the read-only objects out of the cluster and store them in the farm.
>
> The performance can be optimized by:
>  1 multi-threading the transfer between collie and sheep
>  2 support multiple connections to multiple sheep.
>
> Collie will take care of node failures between transfer in a
> multi-connection scenario.
>
> Thanks,
> Yuan
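
For comparison, the client-driven transfer described in the quoted mail
(several threads, several sheep connections, collie handling node
failures) could be modelled roughly as below. Again this is only a
sketch under my own assumptions: pull_objects, fetch_with_failover and
the fetch callback are hypothetical names, not the collie
implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names throughout; this is not the collie implementation.
def fetch_with_failover(oid, sheep, fetch):
    """Try each sheep in turn until one returns the object's data."""
    for node in sheep:
        try:
            return fetch(node, oid)
        except ConnectionError:
            continue  # this sheep is unreachable, try the next one
    raise RuntimeError(f"no reachable sheep could serve {oid}")

def pull_objects(oids, sheep, fetch, workers=4):
    """Fetch all objects in parallel through a pool of worker threads.

    fetch(node, oid) returns the object data or raises ConnectionError.
    Returns a dict mapping object id -> data.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            oid: pool.submit(fetch_with_failover, oid, sheep, fetch)
            for oid in oids
        }
        return {oid: fut.result() for oid, fut in futures.items()}
```

Note that all object data still flows through the one client process
here, which is the star topology I would like to avoid.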