[sheepdog] BUG: dirty object cache stops pushing

徐舫 xufango at gmail.com
Fri Jul 4 06:21:21 CEST 2014


The setup: 5 sheepdog nodes are running with the object cache enabled, and
more than 10 VMs are running on each node.

I mount a tmpfs on the /cache directory and start sheep with:

sheep -l level=debug -n \
  /home/admin/sheepdogmetadata,/disk1/sheepdogstoredata,/disk2/sheepdogstoredata,/disk3/sheepdogstoredata,/disk4/sheepdogstoredata,/disk5/sheepdogstoredata,/disk7/sheepdogstoredata,/disk8/sheepdogstoredata,/disk9/sheepdogstoredata \
  -w size=20G dir=/cache -b 0.0.0.0 -y **.**.**.** -c \
  zookeeper:**.**.**.**:2181

It is possible for all object push threads to be running
do_background_push work while no thread is running do_push_object
work.

In my test environment, this occurs:

[1] 13:09:30 [SUCCESS] vmsecdomainhost1
Name                     Tag      Total     Dirty     Clean
win7_type4_node8.img              4.7 GB    4.7 GB    4.0 MB
standard.img             images   0.0 MB    0.0 MB    0.0 MB
win7_type4_node1.img              4.8 GB    4.8 GB    28 MB
win7_type4_node10.img             5.0 GB    4.9 GB    32 MB
win7_type4_node2.img              4.7 GB    4.6 GB    68 MB
win7_type4_node3.img              4.7 GB    4.7 GB    4.0 MB
win7_type4_node6.img              4.8 GB    4.7 GB    40 MB
win7_type4_node4.img              4.8 GB    4.7 GB    20 MB
win7_type4_node7.img              4.8 GB    4.8 GB    24 MB
win7_type4_node9.img              4.7 GB    4.7 GB    32 MB
win7_type4_node5.img              4.2 GB    4.2 GB    8.0 MB

Cache size 20 GB, used 47 GB, non-directio


I found that 7 object push threads are working on the work queue "oc_push",
and their call stacks are:

Thread 37 (Thread 0x7f3c2a1fc700 (LWP 116747)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 36 (Thread 0x7f3c2abfd700 (LWP 116775)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 35 (Thread 0x7f3b5d7fb700 (LWP 116889)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 34 (Thread 0x7f3b4ffff700 (LWP 116891)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 33 (Thread 0x7f3ac8dfa700 (LWP 117040)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 32 (Thread 0x7f3ac83f9700 (LWP 117041)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

Thread 31 (Thread 0x7f3ac65f6700 (LWP 117044)):
#0  0x0000003916eda37d in read () from /lib64/libc.so.6
#1  0x0000003916ee7a1e in eventfd_read () from /lib64/libc.so.6
#2  0x000000000042a89d in eventfd_xread ()
#3  0x0000000000419acb in object_cache_push ()
*#4  0x0000000000419b83 in do_background_push ()*
#5  0x000000000042e56a in worker_routine ()
#6  0x0000003917207851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003916ee767d in clone () from /lib64/libc.so.6

No thread is pushing objects, so no object_cache_push call can finish.
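
The situation in the backtraces can be pictured with a toy model. This is
only a sketch, not sheepdog code: the struct and function names are
hypothetical, and it assumes that every do_background_push job parks one
worker thread until its do_push_object jobs have run on the same pool.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one work queue (hypothetical, not a sheepdog type). */
struct pool_state {
	int nr_threads;      /* worker threads that currently exist */
	int blocked_parents; /* workers parked in a waiting parent job */
};

/* Workers that are still able to pick up child (push) jobs. */
static int free_workers(const struct pool_state *p)
{
	assert(p->blocked_parents >= 0 && p->blocked_parents <= p->nr_threads);
	return p->nr_threads - p->blocked_parents;
}

/* Stuck: child jobs are queued, but every worker waits for children. */
static bool pool_is_stuck(const struct pool_state *p, int queued_children)
{
	return queued_children > 0 && free_workers(p) == 0;
}
```

With nr_threads = 7 and blocked_parents = 7, as in the backtraces above,
pool_is_stuck() is true for any positive number of queued pushes.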


In gdb, we can see the state of each object cache inside
object_cache_push:

vid = 9627038, push_count = 26, dirty_count = 150, total_count = 154
vid = 3508964, push_count = 22, dirty_count = 1456, total_count = 1464
vid = 360229, push_count = 18, dirty_count = 1437, total_count = 1444
vid = 9678955, push_count = 34, dirty_count = 1462, total_count = 1470
vid = 9008538, push_count = 17, dirty_count = 1490, total_count = 1493
vid = 2383510, push_count = 28, dirty_count = 1494, total_count = 1498
vid = 16192623, push_count = 19, dirty_count = 1447, total_count = 1451

push_count is far less than dirty_count, and no thread is
doing do_push_object work, so the wake-up at the end of

static void do_push_object(struct work *work)
{
     ...
     if (uatomic_sub_return(&oc->push_count, 1) == 0)
          eventfd_xwrite(oc->push_efd, 1);
}

will never run, and the waiting threads are never woken up.
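
The countdown-and-wake pattern can be sketched on its own as follows. This
is an illustration under assumptions, not the sheepdog implementation: it
uses C11 atomics instead of the uatomic_* helpers, the plain glibc
eventfd_read()/eventfd_write() calls instead of the eventfd_x* wrappers,
and the push_tracker type and tracker_* names are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-VDI object cache in the report. */
struct push_tracker {
	atomic_int push_count; /* pushes still outstanding */
	int push_efd;          /* eventfd the waiter blocks on */
};

/* Arm the tracker for nr_dirty pushes; returns 0 on success, -1 on error. */
static int tracker_init(struct push_tracker *t, int nr_dirty)
{
	assert(nr_dirty > 0);
	atomic_init(&t->push_count, nr_dirty);
	t->push_efd = eventfd(0, 0);
	return t->push_efd < 0 ? -1 : 0;
}

/* Called once per finished push; only the last one wakes the waiter. */
static void tracker_push_done(struct push_tracker *t)
{
	/* atomic_fetch_sub returns the value before the subtraction. */
	if (atomic_fetch_sub(&t->push_count, 1) == 1)
		(void)eventfd_write(t->push_efd, 1);
}

/* Blocks until the last push has called tracker_push_done(). */
static int tracker_wait(struct push_tracker *t)
{
	eventfd_t value;

	return eventfd_read(t->push_efd, &value);
}
```

If the jobs that call tracker_push_done() never get a thread,
tracker_wait() blocks forever, which is the state shown in the backtraces.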

And in

static bool wq_need_grow(struct wq_info *wi)
{
     if (wi->nr_threads < uatomic_read(&wi->nr_queued_work) &&
         wi->nr_threads * 2 <= wq_get_roof(wi)) {
          wi->tm_end_of_protection = get_msec_time() +
               WQ_PROTECTION_PERIOD;
          return true;
     }

     return false;
}

nr_threads is 7 and wq_get_roof(wi) returns 10 (2 * 5 nodes). Since
7 * 2 = 14 is greater than 10, wq_need_grow() returns false, so no more
threads will be created, and all existing threads are waiting for
do_push_object work to finish.
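
The arithmetic can be checked with a stripped-down copy of the condition.
This sketch keeps only the two comparisons from wq_need_grow() quoted above
and leaves out the protection-period bookkeeping; need_grow() and
last_growable_size() are hypothetical helper names, and the queue length of
26 is just one of the push_count values from the gdb output.

```c
#include <assert.h>
#include <stdbool.h>

/* The two comparisons of wq_need_grow(), with plain int arguments. */
static bool need_grow(int nr_threads, int nr_queued_work, int roof)
{
	assert(nr_threads >= 0 && roof >= 0);
	return nr_threads < nr_queued_work && nr_threads * 2 <= roof;
}

/* Largest nr_threads for which nr_threads * 2 <= roof still holds. */
static int last_growable_size(int roof)
{
	return roof / 2;
}
```

need_grow(7, 26, 10) is false because 14 > 10, while need_grow(5, 26, 10)
is true. With a roof of 10, the pool stops growing once it has more than
5 threads, no matter how much work is queued.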


I hope the above information is clear to everyone.


Let's discuss the solution now.
The oc_push_wqueue is created with WQ_DYNAMIC:

sys->oc_push_wqueue = create_work_queue("oc_push", WQ_DYNAMIC);

So the roof (upper limit) of the number of threads will be

     case WQ_DYNAMIC:
          /* FIXME: 2 * nr_nodes threads. No rationale yet. */
          nr = nr_nodes * 2;
          break;

There are also other work queues created with WQ_DYNAMIC:

wq = create_work_queue("vdi check", WQ_DYNAMIC);

sys->http_wqueue = create_work_queue("http", WQ_DYNAMIC);


Creating oc_push with WQ_UNLIMITED is not reasonable either.


*I think that the roof of nr_threads for oc_push should be (2 * number of
object caches), not (2 * nr_nodes), to ensure that there will always be
enough threads doing do_push_object work.*
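
To make the proposal concrete, here is a rough comparison of the two roof
rules. It is a sketch only: the function names are hypothetical, the count
of 10 object caches used in the note is an example value rather than a
measurement, and it assumes that each object cache has at most one
background push waiting at a time.

```c
#include <assert.h>

/* Current WQ_DYNAMIC rule quoted above: 2 * nr_nodes. */
static int roof_by_nodes(int nr_nodes)
{
	assert(nr_nodes >= 0);
	return nr_nodes * 2;
}

/* Proposed rule for oc_push: 2 * number of object caches. */
static int roof_by_caches(int nr_object_caches)
{
	assert(nr_object_caches >= 0);
	return nr_object_caches * 2;
}

/*
 * Threads left for do_push_object when the pool has reached the roof and
 * every cache has one background push waiting (assumption, see above).
 */
static int spare_pushers_at_roof(int roof, int nr_object_caches)
{
	int spare = roof - nr_object_caches;

	return spare > 0 ? spare : 0;
}
```

For 5 nodes and 10 object caches, roof_by_nodes(5) is 10 and leaves 0
spare threads in the worst case, while roof_by_caches(10) is 20 and leaves
10. Whether the pool actually grows to the roof still depends on the
wq_need_grow() condition.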


With your advice, I would like to submit patches to solve this problem.


Thanks.

-- 
Xu Fang

Beijing,P.R.China