[Sheepdog] Configuring simple cluster on CentOS 5.5 x86_64

Steven Dake sdake at redhat.com
Thu Oct 21 16:33:24 CEST 2010


Someone on the list was working on a FUSE sheepdog daemon that exports
the VDIs via the filesystem.
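Another possibility, assuming your qemu build includes the sheepdog block driver and the nbd kernel module is available, is to attach a VDI as a local block device with qemu-nbd (a sketch; the device name /dev/nbd0 is an assumption, pick a free one):

```shell
# Load the nbd kernel module
modprobe nbd max_part=8

# Attach the sheepdog VDI "test0" to /dev/nbd0 as a local block device
qemu-nbd --connect=/dev/nbd0 sheepdog:test0

# ... use /dev/nbd0 like any other block device, then detach:
qemu-nbd --disconnect /dev/nbd0
```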

Regards
-steve

On 10/21/2010 07:12 AM, Yuriy Kohut wrote:
> Hi,
>
> Is there a way to translate Sheepdog VDIs into block devices on the boxes which belong to the cluster, without using iSCSI ?
> ---
> Yura
>
> On Oct 20, 2010, at 2:23 PM, Yuriy Kohut wrote:
>
>> One quick question.
>> Is there any way to save/store created targets and logical units with 'tgtadm' ?
>> They are all "lost" after a machine reboot.
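One common approach (a sketch, assuming the tgt-admin helper shipped with scsi-target-utils is available) is to dump the live configuration and replay it after boot:

```shell
# Dump the currently configured targets and LUNs into tgt's config file
tgt-admin --dump > /etc/tgt/targets.conf

# After a reboot (with tgtd running), re-create everything from that file
tgt-admin --execute
```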
>>
>> ---
>> Yura
>>
>> On Oct 19, 2010, at 6:40 PM, Yuriy Kohut wrote:
>>
>>> Got it working.
>>>
>>> The next step is to try all that on real hardware 3 node cluster.
>>>
>>>
>>> Thank you for help.
>>> ---
>>> Yura
>>>
>>> On Oct 19, 2010, at 3:02 PM, MORITA Kazutaka wrote:
>>>
>>>> Hi,
>>>>
>>>> Your sheep.log says
>>>>
>>>> Oct 19 05:59:06 send_message(169) failed to send message, 2
>>>>
>>>> This means that the sheep daemon failed to communicate with corosync.
>>>> Unfortunately, I've never seen such an error...
>>>>
>>>> Try the following:
>>>> - restart the corosync daemon
>>>> - disable iptables and restart corosync
>>>> - disable SELinux and restart corosync
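On CentOS 5 the three suggestions above translate roughly to the following commands (run as root; a sketch for testing, not a permanent configuration):

```shell
# Restart the corosync daemon
service corosync restart

# Temporarily stop the firewall, then restart corosync
service iptables stop
service corosync restart

# Put SELinux in permissive mode for this boot, then restart corosync
setenforce 0
service corosync restart
```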
>>>>
>>>> Did sheepdog work fine when you tested it on Debian?
>>>>
>>>> Thanks,
>>>>
>>>> Kazutaka
>>>>
>>>> On 2010/10/19 19:56, Yuriy Kohut wrote:
>>>>> Attached.
>>>>>
>>>>>
>>>>> Please feel free to kick me if anything else required.
>>>>>
>>>>> ---
>>>>> Yura
>>>>>
>>>>> On Oct 19, 2010, at 1:45 PM, MORITA Kazutaka wrote:
>>>>>
>>>>>> Could you send me a sheep.log in the store directory?
>>>>>> It would be helpful for debugging.
>>>>>>
>>>>>> Kazutaka
>>>>>>
>>>>>> On 2010/10/19 19:16, Yuriy Kohut wrote:
>>>>>>> The patch doesn't help.
>>>>>>>
>>>>>>> Probably I'm doing something wrong, but the following operation never finishes:
>>>>>>> # tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>>>
>>>>>>>
>>>>>>> Attached please find the strace log of the command, archived:
>>>>>>> strace.log.tar.gz
>>>>>>>
>>>>>>>
>>>>>>> Please advise.
>>>>>>>
>>>>>>> Thank you
>>>>>>> ---
>>>>>>> Yura
>>>>>>>
>>>>>>> On Oct 19, 2010, at 11:52 AM, Yuriy Kohut wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Sure. I'll let you know results.
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>> ---
>>>>>>>> Yura
>>>>>>>>
>>>>>>>> On Oct 19, 2010, at 11:46 AM, MORITA Kazutaka wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> At Fri, 15 Oct 2010 17:33:18 +0300,
>>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>>> One more new issue with TGTd.
>>>>>>>>>>
>>>>>>>>>> Initially we have one sheepdog vdi (on which we would like to create an iSCSI unit) and no tgt targets/units:
>>>>>>>>>> [root at centos ~]# tgtadm --op show --mode target
>>>>>>>>>> [root at centos ~]# collie vdi list
>>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>> test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 17:55   fd34af
>>>>>>>>>> [root at centos ~]#
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Creating new target:
>>>>>>>>>> [root at centos ~]# tgtadm --op new --mode target --tid 1 -T some.vps:disk0
>>>>>>>>>> [root at centos ~]# tgtadm --op show --mode target
>>>>>>>>>> Target 1: some.vps:disk0
>>>>>>>>>> System information:
>>>>>>>>>>    Driver: iscsi
>>>>>>>>>>    State: ready
>>>>>>>>>> I_T nexus information:
>>>>>>>>>> LUN information:
>>>>>>>>>>    LUN: 0
>>>>>>>>>>        Type: controller
>>>>>>>>>>        SCSI ID: IET     00010000
>>>>>>>>>>        SCSI SN: beaf10
>>>>>>>>>>        Size: 0 MB
>>>>>>>>>>        Online: Yes
>>>>>>>>>>        Removable media: No
>>>>>>>>>>        Readonly: No
>>>>>>>>>>        Backing store type: null
>>>>>>>>>>        Backing store path: None
>>>>>>>>>>        Backing store flags:
>>>>>>>>>> Account information:
>>>>>>>>>> ACL information:
>>>>>>>>>> [root at centos ~]#
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Try to create a new logical unit on the existing tgt target and sheepdog vdi:
>>>>>>>>>> [root at centos ~]# tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But the process never ends.
>>>>>>>>>> Please advise ...
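For reference, once the LUN creation succeeds, an initiator can reach the VDI with open-iscsi (a sketch; the portal address 192.168.128.195 is taken from the netstat output earlier in the thread):

```shell
# Discover targets exported by the tgt host
iscsiadm -m discovery -t sendtargets -p 192.168.128.195

# Log in; a new /dev/sdX block device appears on the initiator
iscsiadm -m node -T some.vps:disk0 -p 192.168.128.195 --login
```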
>>>>>>>>> Thanks for your report.
>>>>>>>>>
>>>>>>>>> Can you try the following patch I sent minutes ago?
>>>>>>>>> http://lists.wpkg.org/pipermail/sheepdog/2010-October/000741.html
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Kazutaka
>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Yura
>>>>>>>>>>
>>>>>>>>>> On Oct 15, 2010, at 4:55 PM, Yuriy Kohut wrote:
>>>>>>>>>>
>>>>>>>>>>> Cool, that works.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> ---
>>>>>>>>>>> Yura
>>>>>>>>>>>
>>>>>>>>>>> On Oct 15, 2010, at 3:52 PM, MORITA Kazutaka wrote:
>>>>>>>>>>>
>>>>>>>>>>>> At Fri, 15 Oct 2010 13:38:16 +0300,
>>>>>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm using the following 'Getting Started' guide to configure simple cluster:
>>>>>>>>>>>>> http://www.osrg.net/sheepdog/usage.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have configured the cluster on 1 node/box, so the first questions are:
>>>>>>>>>>>>> Can I configure a cluster on a single node (1 box) under CentOS 5.5 x86_64 ?
>>>>>>>>>>>>> Or is a minimum of 3 nodes/boxes required ... ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have run into the following issue on my single-node cluster. I rebooted the box after creating my first image. The setup was as follows:
>>>>>>>>>>>>> - corosync is up and running
>>>>>>>>>>>>> udp        0      0 192.168.128.195:5404        0.0.0.0:*                               3541/corosync
>>>>>>>>>>>>> udp        0      0 192.168.128.195:5405        0.0.0.0:*                               3541/corosync
>>>>>>>>>>>>> udp        0      0 226.94.1.1:5405             0.0.0.0:*                               3541/corosync
>>>>>>>>>>>>>
>>>>>>>>>>>>> - sheep is up and running
>>>>>>>>>>>>> tcp        0      0 0.0.0.0:7000                0.0.0.0:*                   LISTEN      3561/sheep
>>>>>>>>>>>>>
>>>>>>>>>>>>> - the cluster is formatted with 1 copy only
>>>>>>>>>>>>> # collie cluster format --copies=1
>>>>>>>>>>>>>
>>>>>>>>>>>>> - an image with preallocated data is created
>>>>>>>>>>>>> # qemu-img create sheepdog:test0 -o preallocation=data 4G
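The single-node bring-up above can be condensed into one sketch (the store path /var/lib/sheepdog is an assumption; corosync must already be running):

```shell
# Start the sheep daemon with a local object store directory
sheep /var/lib/sheepdog

# Format the cluster, keeping a single copy of each object
collie cluster format --copies=1

# Create a 4 GB VDI with preallocated data
qemu-img create -o preallocation=data sheepdog:test0 4G
```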
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So after these simple steps I got:
>>>>>>>>>>>>> # collie vdi list
>>>>>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>>> test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 12:42   fd34af
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I rebooted the box, and no images were available after it came back. The vdi list shows nothing:
>>>>>>>>>>>>> # collie vdi list
>>>>>>>>>>>>> name        id    size    used  shared    creation time   vdi id
>>>>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>> and 'collie vdi list' never ends ...
>>>>>>>>>>>>> corosync and sheep are still running.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could somebody assist me with that.
>>>>>>>>>>>> Sorry, the following patch fixes the problem.
>>>>>>>>>>>>
>>>>>>>>>>>> =
>>>>>>>>>>>> From: MORITA Kazutaka<morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>>>>> Subject: [PATCH] sheep: call start_recovery when cluster restarts with one node
>>>>>>>>>>>>
>>>>>>>>>>>> Sheepdog recovers objects before starting a storage service, and the
>>>>>>>>>>>> routine is called when nodes join.  However, if sheepdog consists
>>>>>>>>>>>> of only one node, no node sends a join message, so
>>>>>>>>>>>> start_recovery is never called.  This patch fixes the problem.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: MORITA Kazutaka<morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>>>>> ---
>>>>>>>>>>>> sheep/group.c |    3 +++
>>>>>>>>>>>> 1 files changed, 3 insertions(+), 0 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/sheep/group.c b/sheep/group.c
>>>>>>>>>>>> index ba8cdfb..86cbdb8 100644
>>>>>>>>>>>> --- a/sheep/group.c
>>>>>>>>>>>> +++ b/sheep/group.c
>>>>>>>>>>>> @@ -1226,6 +1226,9 @@ static void __sd_confchg_done(struct cpg_event *cevent)
>>>>>>>>>>>>
>>>>>>>>>>>>              update_cluster_info(&msg);
>>>>>>>>>>>>
>>>>>>>>>>>> +            if (sys->status == SD_STATUS_OK) /* sheepdog starts with one node */
>>>>>>>>>>>> +                    start_recovery(sys->epoch, NULL, 0);
>>>>>>>>>>>> +
>>>>>>>>>>>>              return;
>>>>>>>>>>>>      }
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> 1.5.6.5
>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> sheepdog mailing list
>>>>>>>>>>> sheepdog at lists.wpkg.org
>>>>>>>>>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>>>>>
>>>
>>
>



