[Sheepdog] Configuring simple cluster on CentOS 5.5 x86_64
Yuriy Kohut
ykohut at onapp.com
Tue Oct 19 17:40:47 CEST 2010
Got it working.
The next step is to try all of that on a real 3-node hardware cluster.
Thank you for the help.
---
Yura
On Oct 19, 2010, at 3:02 PM, MORITA Kazutaka wrote:
> Hi,
>
> Your sheep.log says
>
> Oct 19 05:59:06 send_message(169) failed to send message, 2
>
> This means that the sheep daemon failed to communicate with corosync.
> Unfortunately, I've never seen such an error...
>
> Try the following things (rough example commands below):
> - restart the corosync daemon
> - disable iptables and restart corosync
> - disable SELinux and restart corosync
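>
> On CentOS, that would be something like the following (a sketch; it assumes corosync was installed with an init script, and disabling the firewall and SELinux is only for testing):
>
> # service corosync restart
> # service iptables stop     # temporarily disable the firewall
> # setenforce 0              # switch SELinux to permissive mode
> # service corosync restart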
>
> Did sheepdog work fine when you tested it on Debian?
>
> Thanks,
>
> Kazutaka
>
> On 2010/10/19 19:56, Yuriy Kohut wrote:
>> Attached.
>>
>> Please feel free to ping me if anything else is required.
>>
>> ---
>> Yura
>>
>> On Oct 19, 2010, at 1:45 PM, MORITA Kazutaka wrote:
>>
>>> Could you send me the sheep.log from the store directory?
>>> It would be helpful for debugging.
>>>
>>> Kazutaka
>>>
>>> On 2010/10/19 19:16, Yuriy Kohut wrote:
>>>> The patch doesn't help.
>>>>
>>>> Perhaps I'm doing something wrong, but the following operation never finishes:
>>>> # tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>
>>>>
>>>> Attached please find the strace log of that command, archived:
>>>> strace.log.tar.gz
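>>>>
>>>> For reference, a log like this can be produced with:
>>>> # strace -f -o strace.log tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>> # tar czf strace.log.tar.gz strace.log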
>>>>
>>>> Please advise.
>>>>
>>>> Thank you
>>>> ---
>>>> Yura
>>>>
>>>> On Oct 19, 2010, at 11:52 AM, Yuriy Kohut wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Sure. I'll let you know results.
>>>>>
>>>>> Thank you.
>>>>> ---
>>>>> Yura
>>>>>
>>>>> On Oct 19, 2010, at 11:46 AM, MORITA Kazutaka wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> At Fri, 15 Oct 2010 17:33:18 +0300,
>>>>>> Yuriy Kohut wrote:
>>>>>>> One more issue with tgtd.
>>>>>>>
>>>>>>> Initially we have one sheepdog vdi (on which we would like to create an iSCSI unit) and no tgt targets/units:
>>>>>>> [root@centos ~]# tgtadm --op show --mode target
>>>>>>> [root@centos ~]# collie vdi list
>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>> ------------------------------------------------------------------
>>>>>>>   test0        1  4.0 GB  4.0 GB  0.0 MB  2010-10-15 17:55  fd34af
>>>>>>> [root@centos ~]#
>>>>>>>
>>>>>>>
>>>>>>> Creating new target:
>>>>>>> [root@centos ~]# tgtadm --op new --mode target --tid 1 -T some.vps:disk0
>>>>>>> [root@centos ~]# tgtadm --op show --mode target
>>>>>>> Target 1: some.vps:disk0
>>>>>>>     System information:
>>>>>>>         Driver: iscsi
>>>>>>>         State: ready
>>>>>>>     I_T nexus information:
>>>>>>>     LUN information:
>>>>>>>         LUN: 0
>>>>>>>             Type: controller
>>>>>>>             SCSI ID: IET 00010000
>>>>>>>             SCSI SN: beaf10
>>>>>>>             Size: 0 MB
>>>>>>>             Online: Yes
>>>>>>>             Removable media: No
>>>>>>>             Readonly: No
>>>>>>>             Backing store type: null
>>>>>>>             Backing store path: None
>>>>>>>             Backing store flags:
>>>>>>>     Account information:
>>>>>>>     ACL information:
>>>>>>> [root@centos ~]#
>>>>>>>
>>>>>>>
>>>>>>> Trying to create a new logical unit on the existing tgt target, backed by the sheepdog vdi:
>>>>>>> [root@centos ~]# tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>>>
>>>>>>>
>>>>>>> But the process never ends.
>>>>>>> Please advise ...
>>>>>> Thanks for your report.
>>>>>>
>>>>>> Can you try the following patch I sent a few minutes ago?
>>>>>> http://lists.wpkg.org/pipermail/sheepdog/2010-October/000741.html
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Kazutaka
>>>>>>
>>>>>>> ---
>>>>>>> Yura
>>>>>>>
>>>>>>> On Oct 15, 2010, at 4:55 PM, Yuriy Kohut wrote:
>>>>>>>
>>>>>>>> Cool, that works.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> ---
>>>>>>>> Yura
>>>>>>>>
>>>>>>>> On Oct 15, 2010, at 3:52 PM, MORITA Kazutaka wrote:
>>>>>>>>
>>>>>>>>> At Fri, 15 Oct 2010 13:38:16 +0300,
>>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm using the following 'Getting Started' guide to configure a simple cluster:
>>>>>>>>>> http://www.osrg.net/sheepdog/usage.html
>>>>>>>>>>
>>>>>>>>>> I have configured the cluster on 1 node/box, so the first questions are:
>>>>>>>>>> Can I configure a cluster on a single node (1 box) under CentOS 5.5 x86_64?
>>>>>>>>>> Or are at least 3 nodes/boxes required?
>>>>>>>>>>
>>>>>>>>>> I have run into the following issue on my single-node cluster: I rebooted the box after creating my first image. The setup was as follows:
>>>>>>>>>> - corosync is up and running
>>>>>>>>>> udp 0 0 192.168.128.195:5404 0.0.0.0:* 3541/corosync
>>>>>>>>>> udp 0 0 192.168.128.195:5405 0.0.0.0:* 3541/corosync
>>>>>>>>>> udp 0 0 226.94.1.1:5405 0.0.0.0:* 3541/corosync
>>>>>>>>>>
>>>>>>>>>> - sheep is up and running
>>>>>>>>>> tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 3561/sheep
>>>>>>>>>>
>>>>>>>>>> - cluster is formatted with 1 copy only
>>>>>>>>>> # collie cluster format --copies=1
>>>>>>>>>>
>>>>>>>>>> - the image with preallocated data is created
>>>>>>>>>> # qemu-img create sheepdog:test0 -o preallocation=data 4G
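>>>>>>>>>>
>>>>>>>>>> Put together, the bring-up sequence was roughly the following (the store directory /var/lib/sheepdog is just an example path; substitute whatever path sheep is started with):
>>>>>>>>>>
>>>>>>>>>> # service corosync start
>>>>>>>>>> # sheep /var/lib/sheepdog
>>>>>>>>>> # collie cluster format --copies=1
>>>>>>>>>> # qemu-img create sheepdog:test0 -o preallocation=data 4G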
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So after these simple steps I got:
>>>>>>>>>> # collie vdi list
>>>>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>   test0        1  4.0 GB  4.0 GB  0.0 MB  2010-10-15 12:42  fd34af
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Then I rebooted the box, and no images were available after the box came back. The vdi list shows nothing:
>>>>>>>>>> # collie vdi list
>>>>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> and 'collie vdi list' never ends ...
>>>>>>>>>> corosync and sheep are still running.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Could somebody assist me with that?
>>>>>>>>> Sorry, the following patch fixes the problem.
>>>>>>>>>
>>>>>>>>> =
>>>>>>>>> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>> Subject: [PATCH] sheep: call start_recovery when cluster restarts with one node
>>>>>>>>>
>>>>>>>>> Sheepdog recovers objects before starting the storage service, and the
>>>>>>>>> routine is called when nodes join. However, if sheepdog consists
>>>>>>>>> of only one node, no node sends a join message, so
>>>>>>>>> start_recovery is never called. This patch fixes the problem.
>>>>>>>>>
>>>>>>>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>>>>>>>>> ---
>>>>>>>>> sheep/group.c | 3 +++
>>>>>>>>> 1 files changed, 3 insertions(+), 0 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/sheep/group.c b/sheep/group.c
>>>>>>>>> index ba8cdfb..86cbdb8 100644
>>>>>>>>> --- a/sheep/group.c
>>>>>>>>> +++ b/sheep/group.c
>>>>>>>>> @@ -1226,6 +1226,9 @@ static void __sd_confchg_done(struct cpg_event *cevent)
>>>>>>>>>
>>>>>>>>> update_cluster_info(&msg);
>>>>>>>>>
>>>>>>>>> + if (sys->status == SD_STATUS_OK) /* sheepdog starts with one node */
>>>>>>>>> + start_recovery(sys->epoch, NULL, 0);
>>>>>>>>> +
>>>>>>>>> return;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> 1.5.6.5
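>>>>>>>>>
>>>>>>>>> To apply it, save this mail to a file and run something like the following in the sheepdog source tree (a sketch; the file name is arbitrary, and the install step assumes the usual make-based build):
>>>>>>>>>
>>>>>>>>> # git am sheep-single-node-recovery.patch
>>>>>>>>> # make && make install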
>>>>>>>>>