[Sheepdog] Configuring simple cluster on CentOS 5.5 x86_64

Yuriy Kohut ykohut at onapp.com
Thu Oct 21 16:12:41 CEST 2010


Hi,

Is there a way to translate Sheepdog VDIs into block devices on the boxes that belong to the cluster, without using iSCSI?
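
For example, would something along these lines be expected to work? (Just a sketch, assuming the local QEMU build ships qemu-nbd with the sheepdog block driver; "test0" is the VDI from the mails below.)

# modprobe nbd
# qemu-nbd --connect=/dev/nbd0 sheepdog:test0
# fdisk -l /dev/nbd0
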
---
Yura

On Oct 20, 2010, at 2:23 PM, Yuriy Kohut wrote:

> One quick question.
> Is there any way to save/store targets and logical units created with 'tgtadm'?
> They are all "lost" after a machine reboot.
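> 
> For example, is re-creating them from an init script at boot the expected workaround? Something roughly like this (just a sketch, reusing the target/VDI names from the mails below):
> 
> # tgtadm --op new --mode target --tid 1 -T some.vps:disk0
> # tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
> # tgtadm --op bind --mode target --tid 1 -I ALL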
> 
> ---
> Yura
> 
> On Oct 19, 2010, at 6:40 PM, Yuriy Kohut wrote:
> 
>> Got it working.
>> 
>> The next step is to try all that on a real 3-node hardware cluster.
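>> 
>> (The plan, as I understand the docs, is roughly: start corosync and sheep on each of the three boxes, then run "collie cluster format --copies=3" once from any node.)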
>> 
>> 
>> Thank you for help.
>> ---
>> Yura
>> 
>> On Oct 19, 2010, at 3:02 PM, MORITA Kazutaka wrote:
>> 
>>> Hi,
>>> 
>>> Your sheep.log says
>>> 
>>> Oct 19 05:59:06 send_message(169) failed to send message, 2
>>> 
>>> This means that the sheep daemon failed to communicate with corosync.
>>> Unfortunately, I've never seen such an error...
>>> 
>>> Try the following things:
>>> - restart the corosync daemon
>>> - disable iptables and restart corosync
>>> - disable SELinux and restart corosync
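>>> 
>>> For example, on CentOS 5.5 that would be roughly the following (service names are from memory and may differ):
>>> 
>>> # service iptables stop
>>> # setenforce 0
>>> # service corosync restart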
>>> 
>>> Did sheepdog work fine when you tested it on Debian?
>>> 
>>> Thanks,
>>> 
>>> Kazutaka
>>> 
>>> On 2010/10/19 19:56, Yuriy Kohut wrote:
>>>> Attached.
>>>> 
>>>> Please feel free to kick me if anything else is required.
>>>> 
>>>> ---
>>>> Yura
>>>> 
>>>> On Oct 19, 2010, at 1:45 PM, MORITA Kazutaka wrote:
>>>> 
>>>>> Could you send me the sheep.log from the store directory?
>>>>> It would be helpful for debugging.
>>>>> 
>>>>> Kazutaka
>>>>> 
>>>>> On 2010/10/19 19:16, Yuriy Kohut wrote:
>>>>>> The patch doesn't help.
>>>>>> 
>>>>>> Probably I'm doing something wrong, but the following operation never finishes:
>>>>>> # tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>> 
>>>>>> 
>>>>>> Attached please find the strace log of the operation/command, archived as:
>>>>>> strace.log.tar.gz
>>>>>> 
>>>>>> Please advise.
>>>>>> 
>>>>>> Thank you
>>>>>> ---
>>>>>> Yura
>>>>>> 
>>>>>> On Oct 19, 2010, at 11:52 AM, Yuriy Kohut wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Sure. I'll let you know results.
>>>>>>> 
>>>>>>> Thank you.
>>>>>>> ---
>>>>>>> Yura
>>>>>>> 
>>>>>>> On Oct 19, 2010, at 11:46 AM, MORITA Kazutaka wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> At Fri, 15 Oct 2010 17:33:18 +0300,
>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>> One more new issue with TGTd.
>>>>>>>>> 
>>>>>>>>> Initially we have one sheepdog vdi (on which we would like to create an iSCSI unit) and no tgt targets/units:
>>>>>>>>> [root at centos ~]# tgtadm --op show --mode target
>>>>>>>>> [root at centos ~]# collie vdi list
>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>> test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 17:55   fd34af
>>>>>>>>> [root at centos ~]#
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Creating new target:
>>>>>>>>> [root at centos ~]# tgtadm --op new --mode target --tid 1 -T some.vps:disk0
>>>>>>>>> [root at centos ~]# tgtadm --op show --mode target
>>>>>>>>> Target 1: some.vps:disk0
>>>>>>>>> System information:
>>>>>>>>>   Driver: iscsi
>>>>>>>>>   State: ready
>>>>>>>>> I_T nexus information:
>>>>>>>>> LUN information:
>>>>>>>>>   LUN: 0
>>>>>>>>>       Type: controller
>>>>>>>>>       SCSI ID: IET     00010000
>>>>>>>>>       SCSI SN: beaf10
>>>>>>>>>       Size: 0 MB
>>>>>>>>>       Online: Yes
>>>>>>>>>       Removable media: No
>>>>>>>>>       Readonly: No
>>>>>>>>>       Backing store type: null
>>>>>>>>>       Backing store path: None
>>>>>>>>>       Backing store flags:
>>>>>>>>> Account information:
>>>>>>>>> ACL information:
>>>>>>>>> [root at centos ~]#
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Trying to create a new logicalunit on the existing tgt target and sheepdog vdi:
>>>>>>>>> [root at centos ~]# tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> But the process never ends.
>>>>>>>>> Please advise ...
>>>>>>>> Thanks for your report.
>>>>>>>> 
>>>>>>>> Can you try the following patch I sent minutes ago?
>>>>>>>> http://lists.wpkg.org/pipermail/sheepdog/2010-October/000741.html
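>>>>>>>> 
>>>>>>>> Roughly: save that mail to a file, apply it in the corresponding source tree with "git am <file>" (or "patch -p1 < <file>"), rebuild and reinstall, then retry the tgtadm command.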
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Kazutaka
>>>>>>>> 
>>>>>>>>> ---
>>>>>>>>> Yura
>>>>>>>>> 
>>>>>>>>> On Oct 15, 2010, at 4:55 PM, Yuriy Kohut wrote:
>>>>>>>>> 
>>>>>>>>>> Cool, that works.
>>>>>>>>>> 
>>>>>>>>>> Thanks
>>>>>>>>>> ---
>>>>>>>>>> Yura
>>>>>>>>>> 
>>>>>>>>>> On Oct 15, 2010, at 3:52 PM, MORITA Kazutaka wrote:
>>>>>>>>>> 
>>>>>>>>>>> At Fri, 15 Oct 2010 13:38:16 +0300,
>>>>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm using the following 'Getting Started' guide to configure a simple cluster:
>>>>>>>>>>>> http://www.osrg.net/sheepdog/usage.html
>>>>>>>>>>>> 
>>>>>>>>>>>> I have configured the cluster on 1 node/box, so the first questions are:
>>>>>>>>>>>> Can I configure a cluster on a single node (1 box) under CentOS 5.5 x86_64?
>>>>>>>>>>>> Or are at least 3 nodes/boxes required?
>>>>>>>>>>>> 
>>>>>>>>>>>> I have faced the following issue on my single-node cluster. I rebooted the box after my first image creation. The setup was done as follows:
>>>>>>>>>>>> - corosync is up and running
>>>>>>>>>>>> udp        0      0 192.168.128.195:5404        0.0.0.0:*                               3541/corosync
>>>>>>>>>>>> udp        0      0 192.168.128.195:5405        0.0.0.0:*                               3541/corosync
>>>>>>>>>>>> udp        0      0 226.94.1.1:5405             0.0.0.0:*                               3541/corosync
>>>>>>>>>>>> 
>>>>>>>>>>>> - sheep is up and running
>>>>>>>>>>>> tcp        0      0 0.0.0.0:7000                0.0.0.0:*                   LISTEN      3561/sheep
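>>>>>>>>>>>> (started roughly as "sheep /var/lib/sheepdog"; the store directory path here is just an example)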
>>>>>>>>>>>> 
>>>>>>>>>>>> - cluster is formatted with 1 copy only
>>>>>>>>>>>> # collie cluster format --copies=1
>>>>>>>>>>>> 
>>>>>>>>>>>> - the image with preallocated data is created
>>>>>>>>>>>> # qemu-img create sheepdog:test0 -o preallocation=data 4G
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> So after these simple steps I got:
>>>>>>>>>>>> # collie vdi list
>>>>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>> test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 12:42   fd34af
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Then I rebooted the box, and no images are available after the box came back. The vdi list just shows nothing:
>>>>>>>>>>>> # collie vdi list
>>>>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>> 
>>>>>>>>>>>> and 'collie vdi list' never ends ...
>>>>>>>>>>>> corosync and sheep are still running.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Could somebody assist me with that?
>>>>>>>>>>> Sorry, the following patch fixes the problem.
>>>>>>>>>>> 
>>>>>>>>>>> =
>>>>>>>>>>> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>>>> Subject: [PATCH] sheep: call start_recovery when cluster restarts with one node
>>>>>>>>>>> 
>>>>>>>>>>> Sheepdog recovers objects before starting a storage service, and the
>>>>>>>>>>> routine is called when nodes join.  However, if sheepdog consists
>>>>>>>>>>> of only one node, no node sends a join message, so
>>>>>>>>>>> start_recovery is not called.  This patch fixes the problem.
>>>>>>>>>>> 
>>>>>>>>>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>>>> ---
>>>>>>>>>>> sheep/group.c |    3 +++
>>>>>>>>>>> 1 files changed, 3 insertions(+), 0 deletions(-)
>>>>>>>>>>> 
>>>>>>>>>>> diff --git a/sheep/group.c b/sheep/group.c
>>>>>>>>>>> index ba8cdfb..86cbdb8 100644
>>>>>>>>>>> --- a/sheep/group.c
>>>>>>>>>>> +++ b/sheep/group.c
>>>>>>>>>>> @@ -1226,6 +1226,9 @@ static void __sd_confchg_done(struct cpg_event *cevent)
>>>>>>>>>>> 
>>>>>>>>>>>             update_cluster_info(&msg);
>>>>>>>>>>> 
>>>>>>>>>>> +            if (sys->status == SD_STATUS_OK) /* sheepdog starts with one node */
>>>>>>>>>>> +                    start_recovery(sys->epoch, NULL, 0);
>>>>>>>>>>> +
>>>>>>>>>>>             return;
>>>>>>>>>>>     }
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> 1.5.6.5
>>>>>>>>>>> 
>>>> 
>> 
> 
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog



