<DIV>Thanks, Li Yuan and zhangcanqun. I finally debugged sheep running in gateway mode. I found that one of the datanode servers had joined the cluster with its private IP (I don't know how that happened; it was after I set the bonding mode and rebooted the server). Then, as you know, the vdi state code went wrong. I rebooted the problematic datanode server again. This time the datanode server joined with the right IP, and everything is back to normal.</DIV>
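<DIV>For anyone who hits the same thing, here is a minimal sketch of how the joined address could be checked against the subnet the cluster is expected to use. The helper name ipv4_in_subnet and the /24 prefix are my own assumptions for illustration, not anything from sheepdog; the 172.16.0.x addresses are the ones that appear in the logs quoted below.</DIV>

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if addr lies inside net/prefix_len, 0 if it does not,
 * and -1 if either string is not a valid dotted-quad IPv4 address.
 * Hypothetical helper for illustration; not part of sheepdog.
 */
static int ipv4_in_subnet(const char *addr, const char *net, int prefix_len)
{
	struct in_addr a, n;
	uint32_t mask;

	assert(prefix_len >= 0 && prefix_len <= 32);
	if (inet_pton(AF_INET, addr, &a) != 1 ||
	    inet_pton(AF_INET, net, &n) != 1)
		return -1;

	/* A shift by 32 is undefined in C, so handle /0 separately. */
	mask = prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);

	/* Compare the network parts in host byte order. */
	return (ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask);
}
```

<DIV>The address to pass in is the one sheep prints in its send_join_request log line ("IPv4 ip:... port:7000 going to join the cluster"). A result of 0 for a datanode would suggest it joined on an interface other than the intended one.</DIV>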
<DIV>
<DIV> <BR></DIV>
<DIV><BR></DIV>
<DIV style="PADDING-BOTTOM: 2px; PADDING-LEFT: 0px; PADDING-RIGHT: 0px; FONT-FAMILY: Arial Narrow; FONT-SIZE: 12px; PADDING-TOP: 2px">------------------ Original ------------------</DIV>
<DIV style="PADDING-BOTTOM: 8px; PADDING-LEFT: 8px; PADDING-RIGHT: 8px; BACKGROUND: #efefef; FONT-SIZE: 12px; PADDING-TOP: 8px">
<DIV><B>From: </B> "namei.unix";<namei.unix@gmail.com>;</DIV>
<DIV><B>Date: </B> Mon, Apr 13, 2015 01:50 PM</DIV>
<DIV><B>To: </B> "passedwind"<bailovereal@qq.com>; <WBR></DIV>
<DIV><B>Cc: </B> "zhangcanqun_sd"<zhangcanqun_sd@163.com>; "sheepdog"<sheepdog@lists.wpkg.org>; <WBR></DIV>
<DIV><B>Subject: </B> Re: [sheepdog] sheepdog cluster data area node with bonding mode 4 with (Bonding Mode: IEEE 802.3ad Dynamic link aggregation) can't join gateway node</DIV></DIV>
<DIV><BR></DIV>On Sun, Apr 12, 2015 at 07:38:09PM +0800, passedwind wrote:<BR>> Today I changed the network on my sheepdog cluster's data area nodes to bonding mode 4 (Bonding Mode: IEEE 802.3ad Dynamic link aggregation). The gateway node uses a normal network interface. The network link from the gateway node to the data area nodes has no problem.<BR>> <BR>> Has anybody run into a similar problem? Help!<BR>> <BR>> ----- Adding the gateway node fails; the error log from sheep.log follows --------------------------------------<BR>> Apr 12 12:08:07 INFO [main] main(958) sheepdog daemon (version 0.9.1) started<BR>> Apr 12 12:08:07 INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 4<BR>> Apr 12 12:08:07 EMERG [main] free_vdi_state_snapshot(1805) PANIC: invalid free request for vdi state snapshot, epoch: 4<BR><BR>It looks like something is wrong with your vdi state code, Hitoshi, any idea?<BR><BR>Thanks,<BR>Yuan<BR><BR>> Apr 12 12:08:07 EMERG [main] crash_handler(268) sheep exits unexpectedly (Aborted).<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x406237]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x1033f) [0x7faee25ad33f]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7faee16ddcc8]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x147) [0x7faee16e10d7]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x414a94]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x417812]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x40ce0b]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x4356f2]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x42e89b]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) sheep() [0x405950]<BR>> Apr 12 12:08:07 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf4) [0x7faee16c8ec4]<BR>> Apr 12 12:08:07 EMERG [main] 
sd_backtrace(847) sheep() [0x4060ba]<BR>> <BR>> ----- Error when adding a cluster vdi after the gateway node connects ----------------------------------<BR>> Apr 12 13:32:21 INFO [main] md_add_disk(343) /var/lib/sheepdog/obj, vdisk nr 53, total disk 1<BR>> Apr 12 13:32:22 NOTICE [main] get_local_addr(522) found IPv4 address<BR>> Apr 12 13:32:22 INFO [main] send_join_request(1032) IPv4 ip:172.16.0.76 port:7000 going to join the cluster<BR>> Apr 12 13:32:22 NOTICE [main] nfs_init(608) nfs server service is not compiled<BR>> Apr 12 13:32:22 INFO [main] check_host_env(500) Allowed open files 65535, suggested 6144000<BR>> Apr 12 13:32:22 INFO [main] main(958) sheepdog daemon (version 0.9.1) started<BR>> Apr 12 13:32:22 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 172.16.0.73:7000, op name: VDI_STATE_SNAPSHOT_CTL<BR>> Apr 12 13:32:22 ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 172.16.0.73:7000, op name: VDI_STATE_SNAPSHOT_CTL<BR>> Apr 12 13:32:22 INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 10<BR>> Apr 12 13:32:22 EMERG [main] free_vdi_state_snapshot(1805) PANIC: invalid free request for vdi state snapshot, epoch: 10<BR>> Apr 12 13:32:22 EMERG [main] crash_handler(268) sheep exits unexpectedly (Aborted).<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x406237]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x1033f) [0x7f4648b5933f]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7f4647c89cc8]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x147) [0x7f4647c8d0d7]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x414a94]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x417812]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x40ce0b]<BR>> Apr 12 13:32:22 EMERG [main] 
sd_backtrace(847) sheep() [0x4356f2]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x42e89b]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x405950]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf4) [0x7f4647c74ec4]<BR>> Apr 12 13:32:22 EMERG [main] sd_backtrace(847) sheep() [0x4060ba]<BR>> ------------------------------------------------------------------------------------<BR>> Where can I change this request buffer size?<BR>> <BR>> worker_fn int sheep_exec_req(const struct node_id *nid, struct sd_req *hdr,<BR>> void *buf)<BR>> {<BR>> struct sd_rsp *rsp = (struct sd_rsp *)hdr;<BR>> struct sockfd *sfd;<BR>> int ret;<BR>> sfd = sockfd_cache_get(nid);<BR>> if (!sfd)<BR>> return SD_RES_NETWORK_ERROR;<BR>> ret = exec_req(sfd->fd, hdr, buf, sheep_need_retry, hdr->epoch,<BR>> MAX_RETRY_COUNT);<BR>> if (ret) {<BR>> sd_debug("remote node might have gone away");<BR>> sockfd_cache_del(nid, sfd);<BR>> return SD_RES_NETWORK_ERROR;<BR>> }<BR>> ret = rsp->result;<BR>> if (ret != SD_RES_SUCCESS)<BR>> sd_err("failed %s, remote address: %s, op name: %s",<BR>> sd_strerror(ret),<BR>> <BR>> static int local_vdi_state_snapshot_ctl(const struct sd_req *req,<BR>> struct sd_rsp *rsp, void *data,<BR>> const struct sd_node *sender)<BR>> {<BR>> bool get = !!req->vdi_state_snapshot.get;<BR>> int epoch = req->vdi_state_snapshot.tgt_epoch;<BR>> int ret, length = 0;<BR>> sd_info("%s vdi state snapshot at epoch %d",<BR>> get ? 
"getting" : "freeing", epoch);<BR>> if (get) {<BR>> ret = get_vdi_state_snapshot(epoch, data, req->data_length,<BR>> &length);<BR>> if (ret == SD_RES_SUCCESS)<BR>> rsp->data_length = length;<BR>> else {<BR>> sd_err("failed to get vdi state snapshot: %s",<BR>> sd_strerror(ret));<BR>> return ret;<BR>> }<BR>> } else<BR>> free_vdi_state_snapshot(epoch);<BR>> return SD_RES_SUCCESS;<BR>> }<BR>> -----------------------------<BR>> Error in the datanode's sheep.log<BR>> -----------------------------------------------------------------------------------<BR>> Apr 12 14:42:59 NOTICE [main] nfs_init(608) nfs server service is not compiled<BR>> Apr 12 14:42:59 INFO [main] check_host_env(500) Allowed open files 1048576, suggested 6144000<BR>> Apr 12 14:42:59 INFO [main] main(958) sheepdog daemon (version 0.9.1) started<BR>> Apr 12 14:44:54 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0<BR>> Apr 12 14:44:54 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 512, required length: 2840<BR>> Apr 12 14:44:54 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small<BR>> Apr 12 14:44:54 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0<BR>> Apr 12 14:44:54 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 1024, required length: 2840<BR>> Apr 12 14:44:54 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small<BR>> Apr 12 14:44:54 INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0<BR>> Apr 12 14:44:54 INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 2048, required length: 2840<BR>> Apr 12 14:44:54 ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small<BR>-- <BR>sheepdog mailing list<BR>sheepdog@lists.wpkg.org<BR>https://lists.wpkg.org/mailman/listinfo/sheepdog
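<DIV>The datanode log above shows the "maximum allowed length" going 512, 1024, 2048 against a required length of 2840, which looks like the requester doubling its buffer after each "buffer is too small" reply. The following is only a sketch of that doubling pattern as I read it from the log; grow_until_fits is a made-up name and this is not the actual sheepdog retry code.</DIV>

```c
#include <assert.h>
#include <stddef.h>

/*
 * Double a buffer size, starting from initial, until it can hold
 * required bytes. Returns the final size and, if attempts is not
 * NULL, stores the number of requests made (including the one that
 * finally fits). Hypothetical helper; the doubling policy is only
 * inferred from the log, not taken from the sheepdog source.
 */
static size_t grow_until_fits(size_t initial, size_t required, int *attempts)
{
	size_t size = initial;
	int n = 1;

	assert(initial > 0);
	while (size < required) {
		size *= 2;	/* previous request was too small, retry bigger */
		n++;
	}
	if (attempts)
		*attempts = n;
	return size;
}
```

<DIV>Under that assumption, starting at 512 with 2840 bytes required, the fourth request (4096 bytes) would be the first one large enough, so the three failures in the log would be expected noise on the way to a successful request rather than a fixed limit that has to be changed.</DIV>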
<DIV></DIV></DIV>