[sheepdog] sheepdog cluster data area node with bonding mode 4 with (Bonding Mode: IEEE 802.3ad Dynamic link aggregation) can't join gateway node

Liu Yuan namei.unix at gmail.com
Mon Apr 13 07:50:24 CEST 2015


On Sun, Apr 12, 2015 at 07:38:09PM +0800, passedwind wrote:
> Today I changed the network on my sheepdog cluster data area nodes to bonding mode 4 (Bonding Mode: IEEE 802.3ad Dynamic link aggregation). The gateway node uses a normal network interface. The network link from the gateway node to the data area nodes has no problems.
>   
>  Has anybody run into a similar problem? Help!
>   
>  ----- Adding the gateway node fails; the error is logged in sheep.log -----
>  Apr 12 12:08:07   INFO [main] main(958) sheepdog daemon (version 0.9.1) started
> Apr 12 12:08:07   INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 4
> Apr 12 12:08:07  EMERG [main] free_vdi_state_snapshot(1805) PANIC: invalid free request for vdi state snapshot, epoch: 4

It looks like something is wrong with the vdi state code. Hitoshi, any idea?

Thanks,
Yuan

> Apr 12 12:08:07  EMERG [main] crash_handler(268) sheep exits unexpectedly (Aborted).
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x406237]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x1033f) [0x7faee25ad33f]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7faee16ddcc8]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x147) [0x7faee16e10d7]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x414a94]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x417812]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x40ce0b]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x4356f2]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x42e89b]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x405950]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf4) [0x7faee16c8ec4]
> Apr 12 12:08:07  EMERG [main] sd_backtrace(847) sheep() [0x4060ba]
> 
>  ----- After adding a vdi to the cluster, the gateway node gets this error when connecting -----
>  Apr 12 13:32:21   INFO [main] md_add_disk(343) /var/lib/sheepdog/obj, vdisk nr 53, total disk 1
> Apr 12 13:32:22 NOTICE [main] get_local_addr(522) found IPv4 address
> Apr 12 13:32:22   INFO [main] send_join_request(1032) IPv4 ip:172.16.0.76 port:7000 going to join the cluster
> Apr 12 13:32:22 NOTICE [main] nfs_init(608) nfs server service is not compiled
> Apr 12 13:32:22   INFO [main] check_host_env(500) Allowed open files 65535, suggested 6144000
> Apr 12 13:32:22   INFO [main] main(958) sheepdog daemon (version 0.9.1) started
> Apr 12 13:32:22  ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 172.16.0.73:7000, op name: VDI_STATE_SNAPSHOT_CTL
> Apr 12 13:32:22  ERROR [block] sheep_exec_req(1170) failed The buffer is too small, remote address: 172.16.0.73:7000, op name: VDI_STATE_SNAPSHOT_CTL
> Apr 12 13:32:22   INFO [main] local_vdi_state_snapshot_ctl(1388) freeing vdi state snapshot at epoch 10
> Apr 12 13:32:22  EMERG [main] free_vdi_state_snapshot(1805) PANIC: invalid free request for vdi state snapshot, epoch: 10
> Apr 12 13:32:22  EMERG [main] crash_handler(268) sheep exits unexpectedly (Aborted).
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x406237]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x1033f) [0x7f4648b5933f]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7f4647c89cc8]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x147) [0x7f4647c8d0d7]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x414a94]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x417812]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x40ce0b]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x4356f2]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x42e89b]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x405950]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf4) [0x7f4647c74ec4]
> Apr 12 13:32:22  EMERG [main] sd_backtrace(847) sheep() [0x4060ba]
>  ------------------------------------------------------------------------------------
>  Where can I change this request buffer size?
>   
> worker_fn int sheep_exec_req(const struct node_id *nid, struct sd_req *hdr,
>                              void *buf)
> {
>         struct sd_rsp *rsp = (struct sd_rsp *)hdr;
>         struct sockfd *sfd;
>         int ret;
>
>         sfd = sockfd_cache_get(nid);
>         if (!sfd)
>                 return SD_RES_NETWORK_ERROR;
>
>         ret = exec_req(sfd->fd, hdr, buf, sheep_need_retry, hdr->epoch,
>                        MAX_RETRY_COUNT);
>         if (ret) {
>                 sd_debug("remote node might have gone away");
>                 sockfd_cache_del(nid, sfd);
>                 return SD_RES_NETWORK_ERROR;
>         }
>         ret = rsp->result;
>         if (ret != SD_RES_SUCCESS)
>                 sd_err("failed %s, remote address: %s, op name: %s",
>                                 sd_strerror(ret),
> 
> static int local_vdi_state_snapshot_ctl(const struct sd_req *req,
>                                         struct sd_rsp *rsp, void *data,
>                                         const struct sd_node *sender)
> {
>         bool get = !!req->vdi_state_snapshot.get;
>         int epoch = req->vdi_state_snapshot.tgt_epoch;
>         int ret, length = 0;
>
>         sd_info("%s vdi state snapshot at epoch %d",
>                 get ? "getting" : "freeing", epoch);
>
>         if (get) {
>                 ret = get_vdi_state_snapshot(epoch, data, req->data_length,
>                                              &length);
>                 if (ret == SD_RES_SUCCESS)
>                         rsp->data_length = length;
>                 else {
>                         sd_err("failed to get vdi state snapshot: %s",
>                                sd_strerror(ret));
>                         return ret;
>                 }
>         } else
>                 free_vdi_state_snapshot(epoch);
>
>         return SD_RES_SUCCESS;
> }
>  -----------------------------
>  Error in the data node's sheep.log
>  -----------------------------------------------------------------------------------
>  Apr 12 14:42:59 NOTICE [main] nfs_init(608) nfs server service is not compiled
> Apr 12 14:42:59   INFO [main] check_host_env(500) Allowed open files 1048576, suggested 6144000
> Apr 12 14:42:59   INFO [main] main(958) sheepdog daemon (version 0.9.1) started
> Apr 12 14:44:54   INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0
> Apr 12 14:44:54   INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 512, required length: 2840
> Apr 12 14:44:54  ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
> Apr 12 14:44:54   INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0
> Apr 12 14:44:54   INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 1024, required length: 2840
> Apr 12 14:44:54  ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
> Apr 12 14:44:54   INFO [main] local_vdi_state_snapshot_ctl(1388) getting vdi state snapshot at epoch 0
> Apr 12 14:44:54   INFO [main] get_vdi_state_snapshot(1783) maximum allowed length: 2048, required length: 2840
> Apr 12 14:44:54  ERROR [main] local_vdi_state_snapshot_ctl(1397) failed to get vdi state snapshot: The buffer is too small
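
The data node log above shows three requests with the allowed length doubling (512, 1024, 2048) while the required length stays at 2840, so none of them fits. The excerpt does not show whether the caller stops after 2048 or why. As a rough illustration only, and not sheepdog's actual code, the sketch below models a retry loop that doubles the buffer up to a cap. All names in it (fake_get_snapshot, fetch_with_growing_buffer, the RES_* codes) are hypothetical; only the numbers come from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result codes; they only stand in for the success and
 * "The buffer is too small" results seen in the log. */
enum { RES_SUCCESS = 0, RES_BUFFER_SMALL = 1, RES_NO_MEM = 2 };

#define REQUIRED_LEN 2840 /* "required length" reported by the data node */

/* Hypothetical stand-in for the remote node: refuses any buffer that
 * cannot hold the whole snapshot, otherwise fills it. */
static int fake_get_snapshot(void *buf, size_t buf_len, size_t *out_len)
{
	if (buf_len < REQUIRED_LEN)
		return RES_BUFFER_SMALL;
	memset(buf, 0, REQUIRED_LEN);
	*out_len = REQUIRED_LEN;
	return RES_SUCCESS;
}

/* Retry with a doubling buffer, starting at start_len and never
 * exceeding max_len. On success, *final_len is the size that worked. */
static int fetch_with_growing_buffer(size_t start_len, size_t max_len,
				     size_t *final_len)
{
	assert(start_len > 0);

	for (size_t len = start_len; len <= max_len; len *= 2) {
		size_t got = 0;
		void *buf = malloc(len);
		int ret;

		if (!buf)
			return RES_NO_MEM;
		ret = fake_get_snapshot(buf, len, &got);
		free(buf);

		if (ret == RES_SUCCESS)
			*final_len = len;
		if (ret != RES_BUFFER_SMALL)
			return ret;
	}
	/* Every size up to max_len was too small. */
	return RES_BUFFER_SMALL;
}
```

In this model, a cap of 2048 reproduces the three failed attempts in the log, while a cap of 4096 allows one more doubling, which is enough to hold 2840 bytes.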


