[Sheepdog] some questions about sheepdog's code?

Tue Mar 8 19:16:31 CET 2011

At Mon, 7 Mar 2011 17:49:19 +0800,
jidalyg_8711 wrote:
> Hi, Kazutaka
> 
> I'm reading the Sheepdog source code and have some questions.
> 1.  sheep/group.c
>    
>     if (m->state == DM_INIT && is_master()) {
>         switch (m->op) {
>         case SD_MSG_JOIN:
>             break;
>         case SD_MSG_VDI_OP:
>             vdi_op((struct vdi_op_message *)m);
>             break;
>         default:
>             eprintf("unknown message %d\n", m->op);
>             break;
>         }
>     }
> 
> Why must vdi operations be done by the master? When do other nodes perform vdi operations? How is synchronization handled?

To keep the Sheepdog implementation simple, we assume that Sheepdog
objects are not written concurrently by multiple clients.

We serialize vdi operations using Corosync's atomic multicast.  Every
vdi operation is forwarded to all nodes in the same order, and only the
master node (which is elected automatically) processes it.
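
The idea can be sketched as follows.  This is only an illustration, not
Sheepdog's actual code: the types, the function names, and the loop that
stands in for the ordered multicast are all hypothetical.  It shows that
every node receives every message in the same order, while only the node
that considers itself master executes the vdi operations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are Sheepdog's real types. */
enum msg_op { MSG_JOIN, MSG_VDI_OP };

struct msg {
    enum msg_op op;
    int vdi_arg;            /* illustrative payload */
};

struct node {
    int id;
    int master_id;          /* every node agrees on who the master is */
    int ops_processed;      /* vdi operations executed by this node */
    int last_arg;           /* payload of the last executed operation */
};

static bool node_is_master(const struct node *n)
{
    return n->id == n->master_id;
}

/* Called on every node, in the same order, for each delivered message. */
static void deliver(struct node *n, const struct msg *m)
{
    if (!node_is_master(n))
        return;             /* non-masters see the message but do not act */

    switch (m->op) {
    case MSG_JOIN:
        break;
    case MSG_VDI_OP:
        n->ops_processed++;
        n->last_arg = m->vdi_arg;
        break;
    }
}

/* Stand-in for a totally ordered multicast: each message reaches all
 * nodes before the next message is delivered. */
static void multicast(struct node *nodes, size_t nr_nodes,
                      const struct msg *msgs, size_t nr_msgs)
{
    for (size_t i = 0; i < nr_msgs; i++)
        for (size_t j = 0; j < nr_nodes; j++)
            deliver(&nodes[j], &msgs[i]);
}
```

Because a single node applies the operations in delivery order, two
clients issuing vdi operations at the same time cannot interleave them.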

> 
> 2. After deleting a vdi, the vdi object is not deleted, but the data objects are. Why?

It is because of how Sheepdog looks up vdis.

Sheepdog calculates a vdi identifier number from a hash value of the
vdi name.  If the calculated number is already in use, the next
(incremented) number is used instead.  If we deleted vdi objects and
removed their identifier numbers from the bitmap, we could no longer
look up the vdis that were assigned an incremented number.
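
A small sketch may make the problem concrete.  It is not Sheepdog's
code: the table layout, the toy hash (first byte of the name modulo the
table size), and all function names are assumptions made for the
example.  It models lookup as probing from the hashed number until an
unused number is found, and compares keeping the bit set on delete with
clearing it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_IDS 8u   /* tiny id space so collisions are easy to produce */

/* Hypothetical table; zero-initialize it before use. */
struct id_table {
    bool in_use[NR_IDS];    /* stand-in for the in-use bitmap */
    bool deleted[NR_IDS];   /* vdi deleted, but its object is kept */
    char name[NR_IDS][16];  /* name recorded for each id */
};

/* Toy hash, chosen only so the example is easy to follow. */
static uint32_t toy_hash(const char *name)
{
    return (unsigned char)name[0] % NR_IDS;
}

/* Take the hashed id, or the next free one if it is already used. */
static int vdi_create(struct id_table *t, const char *name)
{
    uint32_t start = toy_hash(name);

    for (uint32_t i = 0; i < NR_IDS; i++) {
        uint32_t id = (start + i) % NR_IDS;
        if (t->in_use[id])
            continue;
        t->in_use[id] = true;
        t->deleted[id] = false;
        strncpy(t->name[id], name, sizeof(t->name[id]) - 1);
        t->name[id][sizeof(t->name[id]) - 1] = '\0';
        return (int)id;
    }
    return -1;              /* id space exhausted */
}

/* Probe from the hashed id; an unused id ends the search. */
static int vdi_lookup(const struct id_table *t, const char *name)
{
    uint32_t start = toy_hash(name);

    for (uint32_t i = 0; i < NR_IDS; i++) {
        uint32_t id = (start + i) % NR_IDS;
        if (!t->in_use[id])
            return -1;      /* gap in the chain: stop looking */
        if (!t->deleted[id] && strcmp(t->name[id], name) == 0)
            return (int)id;
    }
    return -1;
}

/* Delete but keep the bit set, so later ids stay reachable. */
static void vdi_delete_keep_bit(struct id_table *t, int id)
{
    t->deleted[id] = true;
}

/* Delete and clear the bit: this breaks the probe chain. */
static void vdi_delete_clear_bit(struct id_table *t, int id)
{
    t->in_use[id] = false;
}
```

With two names that hash to the same number, deleting the first one with
vdi_delete_keep_bit() leaves the second one findable, while
vdi_delete_clear_bit() makes the lookup stop at the gap and miss it.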

This looks like a waste of disk space, but we could truncate the vdi
object to shrink it (though we don't do that yet).

Thanks,

Kazutaka