[Sheepdog] Cluster appears down but nodes report different epochs
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Thu Nov 10 08:11:42 CET 2011
At Wed, 9 Nov 2011 09:34:29 -0500,
Shawn Moore wrote:
>
> On Tue, Nov 8, 2011 at 9:44 PM, MORITA Kazutaka
> <morita.kazutaka at lab.ntt.co.jp> wrote:
> > At Tue, 8 Nov 2011 10:56:51 -0500,
> > Shawn Moore wrote:
> >>
> >> > Probably, we need to add support for using different NICs for data
> >> > I/Os and monitoring.
> >>
> >> We currently have four 1Gb NICs bonded together using mode 4
> >> (LACP/802.3ad). On another note, I am still doing more testing, but
> >> it almost looks like the TOTAL cluster speed might be limited to 1Gb
> >> instead of up to 4Gb (I know I can't realistically reach 4Gb, but it
> >> should be higher than 1Gb). Does anyone have any insight into
> >> this? We are using enterprise-grade switches with two cards (18Gb/s
> >> fabric).
> >
> > How did you measure the total cluster speed? Could your disks be a
> > bottleneck?
>
> What I have been doing is using pssh (parallel ssh) to kick off a
> script on all nodes at the same time to create a pre-allocated vdi of
> the same size. I then record the time the operation takes to
> complete on each node and do some math. Below is the script and
> an example of the data obtained. I did go ahead and re-run my tests
> with tmpfs instead of true ext4 disks. This did yield quite a
> performance boost, but it seems it should have been even faster given
> that RAM was backing the disks.
>
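> # Time the creation of a pre-allocated vdi of ${1} GB named after this host.
> T_START="$(date +%s)"
> collie vdi create "test_$(uname -n)" "${1}G" -P
> T_STOP="$(date +%s)"
> # Print the elapsed wall-clock time in seconds.
> echo "$((T_STOP - T_START))"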
>
>
> This is a run using the disks. This created nine 3GB vdis. With copies
> set to 3, that is 27GB worth of vdi data and 81GB worth of total cluster
> data. Total cluster speed of 1.46Gb/s.
> 152 416 sec
> 153 462 sec
> 154 394 sec
> 155 435 sec
> 156 451 sec
> 157 483 sec
> 159 459 sec
> 160 406 sec
> 161 486 sec
> VDI SIZE 3
> COPIES 3
> TOT sec 3992
> AVG sec 443.5555556
> 27 GB
> 27648 MB
> 62.33266533 MB/s
> 81 GB
> 82944 MB
> 186.997996 MB/s
> 1495.983968 Mb/s
> 1.460921844 Gb/s
>
>
> This is a run using tmpfs. This created nine 3GB vdis. With copies set
> to 3, that is 27GB worth of vdi data and 81GB worth of total cluster
> data. Total cluster speed of 5.73Gb/s.
Hmm, actually the performance looks worse than what I expected. I'll
dig into this issue after implementing the accord cluster driver.
Thanks,
Kazutaka
> 152 80 sec
> 153 119 sec
> 154 127 sec
> 155 118 sec
> 156 116 sec
> 157 142 sec
> 159 80 sec
> 160 125 sec
> 161 110 sec
> VDI SIZE 3
> COPIES 3
> TOT sec 1017
> AVG sec 113
> 27 GB
> 27648 MB
> 244.6725664 MB/s
> 81 GB
> 82944 MB
> 734.0176991 MB/s
> 5872.141593 Mb/s
> 5.734513274 Gb/s
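
The arithmetic behind the two result tables can be reproduced with a small
shell sketch. The helper name cluster_speed and its argument order are
hypothetical, not part of Sheepdog or collie. It assumes, as the figures
above do, 1 GB = 1024 MB and 1 Gb = 1024 Mb, and that the total data is
divided by the average per-node creation time.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sheepdog): recompute the throughput
# figures from the per-node vdi creation times.
# usage: cluster_speed VDI_SIZE_GB COPIES SEC [SEC ...]
cluster_speed() {
    size_gb=$1
    copies=$2
    shift 2
    # One elapsed time per line; awk does the floating-point math.
    printf '%s\n' "$@" | awk -v size="$size_gb" -v copies="$copies" '
        { total += $1; n++ }
        END {
            avg = total / n              # mean creation time per node (sec)
            vdi_mb = n * size * 1024     # logical vdi data in MB
            raw_mb = vdi_mb * copies     # data written cluster-wide in MB
            printf "AVG sec      %.2f\n", avg
            printf "VDI data     %.2f MB/s\n", vdi_mb / avg
            printf "Cluster data %.2f MB/s\n", raw_mb / avg
            printf "Cluster data %.2f Gb/s\n", raw_mb / avg * 8 / 1024
        }'
}

# Disk-backed run, then tmpfs-backed run (times taken from the tables above).
cluster_speed 3 3 416 462 394 435 451 483 459 406 486
cluster_speed 3 3 80 119 127 118 116 142 80 125 110
```

Dividing by the average time treats every node as busy for the same
duration. Dividing by the slowest node's time instead would give a more
conservative estimate of the aggregate rate.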
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog