[sheepdog] [PATCH 1/2] sheep: get vdi bitmap from all the nodes

Kai Zhang kyle at zelin.io
Thu Jul 11 06:27:09 CEST 2013


On Jul 11, 2013, at 12:23 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:

> At Thu, 11 Jul 2013 11:07:34 +0800,
> Kai Zhang wrote:
>> 
>> 
>> On Jul 10, 2013, at 4:50 PM, morita.kazutaka at gmail.com wrote:
>> 
>>> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>> 
>>> Currently, we assume that the existing nodes have the complete vdi
>>> bitmap and reading from one of them is enough.  However, this
>>> assumption is wrong if the joining nodes have a vdi object which is
>>> not in the running cluster.  For example,
>>> 
>>> 1. Sheepdog is running with one node A
>>> 
>>> 2. Two nodes B and C join Sheepdog at the same time, and node B
>>>   has a vdi object which is not in node A.
>>> 
>>> 3. If C calls get_vdis_from() against A before A does it against B, C
>>>   cannot have the vdi object in its vdi bitmap.
>>> 
>>> The safe and simple approach to fix this problem is:
>>> 
>>> - The newly joined node calls get_vdis_from() against all the existing
>>>  nodes.
>>> - The existing node calls get_vdis_from() only against the newly
>>>  joined node.
>>> 
>>> We can optimize it, but I think there is no simple way to do it.  I
>>> left it as a TODO in the source code.
>>> 
>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>> ---
>>> sheep/group.c |   31 ++++++++++++++++++-------------
>>> 1 file changed, 18 insertions(+), 13 deletions(-)
>>> 
>>> diff --git a/sheep/group.c b/sheep/group.c
>>> index 370c625..2f52d0d 100644
>>> --- a/sheep/group.c
>>> +++ b/sheep/group.c
>>> @@ -545,12 +545,13 @@ static void do_get_vdis(struct work *work)
>>> 	int i, ret;
>>> 
>>> 	if (!node_is_local(&w->joined)) {
>>> -		switch (sys->status) {
>>> -		case SD_STATUS_OK:
>>> -		case SD_STATUS_HALT:
>>> -			get_vdis_from(&w->joined);
>>> -			return;
>>> -		}
>>> +		sd_dprintf("try to get vdi bitmap from %s",
>>> +			   node_to_str(&w->joined));
>>> +		ret = get_vdis_from(&w->joined);
>>> +		if (ret != SD_RES_SUCCESS)
>>> +			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
>>> +				  "%s", node_to_str(&w->joined));
>>> +		return;
>>> 	}
>> 
>> Could you please explain why we need to remove this switch-case?
>> Or why we need this switch-case before?
> 
> The old code calls get_vdis() after the sheepdog cluster starts up.  If
> the cluster status is not OK or HALT, the existing node doesn't have a
> complete vdi bitmap.
> 
> However, this patch calls get_vdis() even before sheepdog starts, so
> the existing node has a complete bitmap and reading only from the
> joining node is always okay.
> 
> With this patch, the only rules we must follow are the following two:
> 
>>> - The newly joined node calls get_vdis_from() against all the existing
>>>  nodes.
>>> - The existing node calls get_vdis_from() only against the newly
>>>  joined node.
> 

Thanks for your explanation. The rule looks simple and clean.
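
To check my understanding, here is a minimal sketch of the two rules as a
toy model. It is not the sheepdog code: struct toy_node, the 64-bit bitmap,
and the toy_* helpers are all hypothetical names made up for illustration,
and "reading a bitmap" is modelled as a plain bitwise OR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a node: one bit per vdi id (0..63). */
struct toy_node {
	uint64_t vdi_bitmap;
};

/* Toy model of reading a peer's vdi bitmap: merge it into our own. */
static void toy_get_vdis_from(struct toy_node *self,
			      const struct toy_node *peer)
{
	assert(self != NULL && peer != NULL);
	self->vdi_bitmap |= peer->vdi_bitmap;
}

/* Rule 1: the newly joined node reads from all the existing nodes. */
static void toy_joiner_sync(struct toy_node *joiner,
			    const struct toy_node *existing,
			    size_t nr_existing)
{
	size_t i;

	for (i = 0; i < nr_existing; i++)
		toy_get_vdis_from(joiner, &existing[i]);
}

/* Rule 2: an existing node reads only from the newly joined node. */
static void toy_existing_sync(struct toy_node *self,
			      const struct toy_node *joiner)
{
	toy_get_vdis_from(self, joiner);
}
```

For the scenario in the commit message, suppose A holds bitmap 0x2, B holds
0x4, and C is empty. Even if C runs toy_joiner_sync() over {A, B} before A
has read from B, C still ends up with 0x6, because it reads from B directly
instead of relying on A being up to date.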

Thanks,
Kyle



