Got it working. The next step is to try all of that on a real 3-node hardware cluster. Thank you for your help.
---
Yura

On Oct 19, 2010, at 3:02 PM, MORITA Kazutaka wrote:

> Hi,
>
> Your sheep.log says
>
>     Oct 19 05:59:06 send_message(169) failed to send message, 2
>
> This means that the sheep daemon failed to communicate with corosync.
> Unfortunately, I've never seen such an error...
>
> Try the following:
> - restart the corosync daemon
> - disable iptables and restart corosync
> - disable SELinux and restart corosync
>
> Did sheepdog work fine when you tested it on Debian?
>
> Thanks,
>
> Kazutaka
>
> On 2010/10/19 19:56, Yuriy Kohut wrote:
>> Attached.
>>
>> ------------------------------------------------------------------------
>>
>> Please feel free to kick me if anything else is required.
>>
>> ---
>> Yura
>>
>> On Oct 19, 2010, at 1:45 PM, MORITA Kazutaka wrote:
>>
>>> Could you send me the sheep.log from the store directory?
>>> It would be helpful for debugging.
>>>
>>> Kazutaka
>>>
>>> On 2010/10/19 19:16, Yuriy Kohut wrote:
>>>> The patch doesn't help.
>>>>
>>>> Perhaps I'm doing something wrong, but the following operation never finishes:
>>>> # tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>
>>>> Attached please find the strace log of the operation/command, archived:
>>>> strace.log.tar.gz
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> Please advise.
>>>>
>>>> Thank you
>>>> ---
>>>> Yura
>>>>
>>>> On Oct 19, 2010, at 11:52 AM, Yuriy Kohut wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Sure. I'll let you know the results.
>>>>>
>>>>> Thank you.
>>>>> ---
>>>>> Yura
>>>>>
>>>>> On Oct 19, 2010, at 11:46 AM, MORITA Kazutaka wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> At Fri, 15 Oct 2010 17:33:18 +0300,
>>>>>> Yuriy Kohut wrote:
>>>>>>> One more new issue with TGTd.
>>>>>>>
>>>>>>> Initially we have one sheepdog vdi (on which we would like to create an iSCSI unit) and no tgt targets/units:
>>>>>>> [root@centos ~]# tgtadm --op show --mode target
>>>>>>> [root@centos ~]# collie vdi list
>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>> ------------------------------------------------------------------
>>>>>>>   test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 17:55   fd34af
>>>>>>> [root@centos ~]#
>>>>>>>
>>>>>>> Creating a new target:
>>>>>>> [root@centos ~]# tgtadm --op new --mode target --tid 1 -T some.vps:disk0
>>>>>>> [root@centos ~]# tgtadm --op show --mode target
>>>>>>> Target 1: some.vps:disk0
>>>>>>>     System information:
>>>>>>>         Driver: iscsi
>>>>>>>         State: ready
>>>>>>>     I_T nexus information:
>>>>>>>     LUN information:
>>>>>>>         LUN: 0
>>>>>>>             Type: controller
>>>>>>>             SCSI ID: IET 00010000
>>>>>>>             SCSI SN: beaf10
>>>>>>>             Size: 0 MB
>>>>>>>             Online: Yes
>>>>>>>             Removable media: No
>>>>>>>             Readonly: No
>>>>>>>             Backing store type: null
>>>>>>>             Backing store path: None
>>>>>>>             Backing store flags:
>>>>>>>     Account information:
>>>>>>>     ACL information:
>>>>>>> [root@centos ~]#
>>>>>>>
>>>>>>> Trying to create a new logical unit on the existing tgt target and sheepdog vdi:
>>>>>>> [root@centos ~]# tgtadm --op new --mode logicalunit --tid 1 --lun 1 -b test0 --bstype sheepdog
>>>>>>>
>>>>>>> But the process never ends.
>>>>>>> Please advise ...
>>>>>>
>>>>>> Thanks for your report.
>>>>>>
>>>>>> Can you try the following patch I sent minutes ago?
>>>>>> http://lists.wpkg.org/pipermail/sheepdog/2010-October/000741.html
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Kazutaka
>>>>>>
>>>>>>> ---
>>>>>>> Yura
>>>>>>>
>>>>>>> On Oct 15, 2010, at 4:55 PM, Yuriy Kohut wrote:
>>>>>>>
>>>>>>>> Cool, that works.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> ---
>>>>>>>> Yura
>>>>>>>>
>>>>>>>> On Oct 15, 2010, at 3:52 PM, MORITA Kazutaka wrote:
>>>>>>>>
>>>>>>>>> At Fri, 15 Oct 2010 13:38:16 +0300,
>>>>>>>>> Yuriy Kohut wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm using the following 'Getting Started' guide to configure a simple cluster:
>>>>>>>>>> http://www.osrg.net/sheepdog/usage.html
>>>>>>>>>>
>>>>>>>>>> I have configured the cluster on 1 node/box, so the first questions are:
>>>>>>>>>> Can I configure a cluster on a single node (1 box) under CentOS 5.5 x86_64?
>>>>>>>>>> Or are at least 3 nodes/boxes required?
>>>>>>>>>>
>>>>>>>>>> I have run into the following issue on my single-node cluster. I rebooted the box after my first image creation. The setup was done as follows:
>>>>>>>>>> - corosync is up and running
>>>>>>>>>>   udp 0 0 192.168.128.195:5404 0.0.0.0:* 3541/corosync
>>>>>>>>>>   udp 0 0 192.168.128.195:5405 0.0.0.0:* 3541/corosync
>>>>>>>>>>   udp 0 0 226.94.1.1:5405 0.0.0.0:* 3541/corosync
>>>>>>>>>>
>>>>>>>>>> - sheep is up and running
>>>>>>>>>>   tcp 0 0 0.0.0.0:7000 0.0.0.0:* LISTEN 3561/sheep
>>>>>>>>>>
>>>>>>>>>> - the cluster is formatted with 1 copy only
>>>>>>>>>>   # collie cluster format --copies=1
>>>>>>>>>>
>>>>>>>>>> - the image with preallocated data is created
>>>>>>>>>>   # qemu-img create sheepdog:test0 -o preallocation=data 4G
>>>>>>>>>>
>>>>>>>>>> So after these simple steps I got:
>>>>>>>>>> # collie vdi list
>>>>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>   test0        1  4.0 GB  4.0 GB  0.0 MB 2010-10-15 12:42   fd34af
>>>>>>>>>>
>>>>>>>>>> Then I rebooted the box, and no images were available for me after it came back.
>>>>>>>>>> The vdi list just shows nothing:
>>>>>>>>>> # collie vdi list
>>>>>>>>>>   name        id    size    used  shared    creation time   vdi id
>>>>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> and 'collie vdi list' never ends ...
>>>>>>>>>> corosync and sheep are still running.
>>>>>>>>>>
>>>>>>>>>> Could somebody assist me with this?
>>>>>>>>>
>>>>>>>>> Sorry, the following patch fixes the problem.
>>>>>>>>>
>>>>>>>>> =
>>>>>>>>> From: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
>>>>>>>>> Subject: [PATCH] sheep: call start_recovery when cluster restarts with one node
>>>>>>>>>
>>>>>>>>> Sheepdog recovers objects before starting the storage service, and the
>>>>>>>>> recovery routine is called when nodes join. However, if sheepdog consists
>>>>>>>>> of only one node, no join message is ever sent, so
>>>>>>>>> start_recovery is never called. This patch fixes the problem.
>>>>>>>>>
>>>>>>>>> Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
>>>>>>>>> ---
>>>>>>>>>  sheep/group.c |    3 +++
>>>>>>>>>  1 files changed, 3 insertions(+), 0 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/sheep/group.c b/sheep/group.c
>>>>>>>>> index ba8cdfb..86cbdb8 100644
>>>>>>>>> --- a/sheep/group.c
>>>>>>>>> +++ b/sheep/group.c
>>>>>>>>> @@ -1226,6 +1226,9 @@ static void __sd_confchg_done(struct cpg_event *cevent)
>>>>>>>>>
>>>>>>>>>  	update_cluster_info(&msg);
>>>>>>>>>
>>>>>>>>> +	if (sys->status == SD_STATUS_OK) /* sheepdog starts with one node */
>>>>>>>>> +		start_recovery(sys->epoch, NULL, 0);
>>>>>>>>> +
>>>>>>>>>  	return;
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> 1.5.6.5
>>>>>>>>>
>>>>>>>> --
>>>>>>>> sheepdog mailing list
>>>>>>>> sheepdog@lists.wpkg.org
>>>>>>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>>>>>>>
>>>>>>> --
>>>>>>> sheepdog mailing list
>>>>>>> sheepdog@lists.wpkg.org
>>>>>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>>>>>
>>>>> --
>>>>> sheepdog mailing list
>>>>> sheepdog@lists.wpkg.org
>>>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
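
[Editor's note] The thread above repeatedly verifies two things before touching tgtadm or collie: that corosync has bound its totem ports (5404/5405, the defaults shown in the netstat output) and that sheep is listening on TCP 7000. A minimal sketch of that pre-flight check, assuming `netstat -anp`-style output; `check_daemons` is a hypothetical helper name, and the patterns may need adjusting for your platform's netstat column layout:

```shell
# check_daemons: read netstat-style output on stdin and report whether
# corosync (UDP 5404/5405) and sheep (TCP 7000, LISTEN) appear bound.
# Ports follow the defaults shown in the thread; adjust as needed.
check_daemons() {
  local input
  input=$(cat)

  # corosync totem ports (UDP 5404/5405), process name in last column
  if printf '%s\n' "$input" | grep -Eq ':(540[45]) .*corosync'; then
    echo "corosync: totem ports bound"
  else
    echo "corosync: totem ports NOT bound"
  fi

  # sheep daemon listening on TCP 7000
  if printf '%s\n' "$input" | grep -Eq ':7000 .*LISTEN.*sheep'; then
    echo "sheep: listening on 7000"
  else
    echo "sheep: NOT listening on 7000"
  fi
}
```

Typical usage would be `netstat -anp | check_daemons`; if either check fails, the corosync restart / iptables / SELinux suggestions earlier in the thread are the place to start.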