[sheepdog] [PATCH 1/2] sheep: get vdi bitmap from all the nodes
Kai Zhang
kyle at zelin.io
Thu Jul 11 06:27:09 CEST 2013
On Jul 11, 2013, at 12:23 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:
> At Thu, 11 Jul 2013 11:07:34 +0800,
> Kai Zhang wrote:
>>
>>
>> On Jul 10, 2013, at 4:50 PM, morita.kazutaka at gmail.com wrote:
>>
>>> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>>
>>> Currently, we assume that the existing nodes have the complete vdi
>>> bitmap and reading from one of them is enough. However, this
>>> assumption is wrong if the joining nodes have a vdi object which is
>>> not in the running cluster. For example,
>>>
>>> 1. Sheepdog is running with one node A
>>>
>>> 2. Two nodes, B and C, join Sheepdog at the same time, and node B
>>> has a vdi object which is not on node A.
>>>
>>> 3. If C calls get_vdis_from() against A before A calls it against B,
>>> C will not have the vdi object in its vdi bitmap.
>>>
>>> The safe and simple approach to fix this problem is:
>>>
>>> - The newly joined node calls get_vdis_from() against all the existing
>>> nodes.
>>> - The existing node calls get_vdis_from() only against the newly
>>> joined node.
>>>
>>> We can optimize it, but I think there is no simple way to do it. I
>>> left it as a TODO in the source code.
>>>
>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>> ---
>>> sheep/group.c | 31 ++++++++++++++++++-------------
>>> 1 file changed, 18 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/sheep/group.c b/sheep/group.c
>>> index 370c625..2f52d0d 100644
>>> --- a/sheep/group.c
>>> +++ b/sheep/group.c
>>> @@ -545,12 +545,13 @@ static void do_get_vdis(struct work *work)
>>> 	int i, ret;
>>>
>>> 	if (!node_is_local(&w->joined)) {
>>> -		switch (sys->status) {
>>> -		case SD_STATUS_OK:
>>> -		case SD_STATUS_HALT:
>>> -			get_vdis_from(&w->joined);
>>> -			return;
>>> -		}
>>> +		sd_dprintf("try to get vdi bitmap from %s",
>>> +			   node_to_str(&w->joined));
>>> +		ret = get_vdis_from(&w->joined);
>>> +		if (ret != SD_RES_SUCCESS)
>>> +			sd_printf(SDOG_ALERT, "failed to get vdi bitmap from "
>>> +				  "%s", node_to_str(&w->joined));
>>> +		return;
>>> 	}
>>
>> Could you please explain why we need to remove this switch-case?
>> Or why we needed this switch-case before?
>
> The old code calls get_vdis() after the sheepdog cluster starts up. If
> the cluster status is not OK or HALT, the existing node doesn't have a
> complete vdi bitmap.
>
> However, this patch calls get_vdis() even before sheepdog starts, so
> the existing node has a complete bitmap and reading only from the
> joining node is always okay.
>
> With this patch, the only rules we must follow are the following two:
>
>>> - The newly joined node calls get_vdis_from() against all the existing
>>> nodes.
>>> - The existing node calls get_vdis_from() only against the newly
>>> joined node.
>
Thanks for your explanation. The rule looks simple and clean.
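
As a rough illustration of the rule, here is a small self-contained sketch.
It is not sheepdog code: struct node, merge_bitmap() and on_join() are
made-up names, merge_bitmap() only stands in for get_vdis_from(), and a
64-bit mask stands in for the real vdi bitmap.

```c
#include <stdint.h>

/* Hypothetical stand-in for a sheep node: an id and a tiny vdi bitmap. */
struct node {
	int id;
	uint64_t vdi_bitmap;
};

/* Hypothetical stand-in for get_vdis_from(): merge src's bitmap into dst. */
static void merge_bitmap(struct node *dst, const struct node *src)
{
	dst->vdi_bitmap |= src->vdi_bitmap;
}

/*
 * Handle one join event as seen by 'self'.
 * 'members' is the membership list including the joined node.
 */
static void on_join(struct node *self, const struct node *joined,
		    const struct node *members, int nr_members)
{
	int i;

	if (self->id != joined->id) {
		/* Existing node: read only from the newly joined node. */
		merge_bitmap(self, joined);
		return;
	}

	/* Newly joined node: read from every other node in the cluster. */
	for (i = 0; i < nr_members; i++)
		if (members[i].id != self->id)
			merge_bitmap(self, &members[i]);
}
```

In the A/B/C example from the commit log, with A holding bit 0, B holding
bit 1 and C empty, calling on_join() with C as both self and the joined node
gives C both bits even when A has not read from B yet.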
Thanks,
Kyle