[stgt] [PATCH 1/3] Fix race on thread shutdown causing deadlock
FUJITA Tomonori
fujita.tomonori at lab.ntt.co.jp
Wed Apr 30 16:29:06 CEST 2014
On Mon, 28 Apr 2014 18:51:20 -0700
Andy Grover <agrover at redhat.com> wrote:
> This patch and the next are somewhat of a revert of 318e9f2, but the previous
> fix didn't quite close the race. This only happens when we create threads
> for a backstore that turns out to be invalid, which we then tear down.
>
> See https://bugzilla.redhat.com/show_bug.cgi?id=848585 .
>
> This is occurring because there's still a window where a thread misses
> seeing info->stop == 1 but is not yet in cond_wait so it misses the
> broadcast:
>
> thread_close:                 thread_worker_fn:
>                               info->stop is seen as 0
> info->stop = 1
> pthread_cond_broadcast        -- misses broadcast
>                               pthread_cond_wait
> pthread_join (hangs)
>
> I believe the solution is to go back to using pthread_cancel. We can call
> it before pthread_cond_wait is called (or after) and it will do the right
> thing: pop out and exit. The only tricky bit is we need to use the
> pthread_cleanup_push mechanism to properly release info->pending_lock.
>
> Signed-off-by: Andy Grover <agrover at redhat.com>
> ---
> usr/bs.c | 25 ++++++++++++++-----------
> usr/bs_thread.h | 2 --
> 2 files changed, 14 insertions(+), 13 deletions(-)
Thanks a lot for the fixes and the detailed explanation. It does indeed
look like there is a race. The whole patchset looks good. Applied, thanks!
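
For readers following the thread, here is a minimal sketch of the
cancellation pattern the patch description refers to. It is not the code
from usr/bs.c. The names worker_info, worker_fn, unlock_pending and
cancel_worker_demo are hypothetical, and only pending_lock mirrors a name
mentioned above. It relies on the default deferred cancellation mode, in
which pthread_cond_wait is a cancellation point: a cancel request issued
before the worker reaches the wait is acted on when it gets there, so
there is no flag check for the worker to miss.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-backstore thread state. */
struct worker_info {
    pthread_mutex_t pending_lock;
    pthread_cond_t pending_cond;
    int pending;                /* number of queued work items */
};

/* Cleanup handler: releases the lock if the thread is cancelled in the wait. */
static void unlock_pending(void *arg)
{
    int rc = pthread_mutex_unlock((pthread_mutex_t *)arg);
    assert(rc == 0);
    (void)rc;
}

static void *worker_fn(void *arg)
{
    struct worker_info *info = arg;

    for (;;) {
        pthread_mutex_lock(&info->pending_lock);
        pthread_cleanup_push(unlock_pending, &info->pending_lock);

        /* Cancellation point. On cancel, the mutex is re-acquired
         * before the cleanup handlers run, so the handler may unlock it. */
        while (info->pending == 0)
            pthread_cond_wait(&info->pending_cond, &info->pending_lock);
        info->pending--;

        pthread_cleanup_pop(1); /* pop the handler and run it: unlock */
        /* ... process the work item outside the lock ... */
    }
    return NULL;
}

/* Returns 0 if the worker exited via cancellation and left the lock free. */
int cancel_worker_demo(void)
{
    struct worker_info info;
    pthread_t tid;
    void *res = NULL;
    int ok;

    pthread_mutex_init(&info.pending_lock, NULL);
    pthread_cond_init(&info.pending_cond, NULL);
    info.pending = 0;

    if (pthread_create(&tid, NULL, worker_fn, &info) != 0)
        return -1;

    /* No handshake needed: the request may land before or during the wait. */
    if (pthread_cancel(tid) != 0 || pthread_join(tid, &res) != 0)
        return -1;

    /* The join must report cancellation and the mutex must be unlocked. */
    ok = (res == PTHREAD_CANCELED) &&
         (pthread_mutex_trylock(&info.pending_lock) == 0);
    if (ok)
        pthread_mutex_unlock(&info.pending_lock);

    pthread_cond_destroy(&info.pending_cond);
    pthread_mutex_destroy(&info.pending_lock);
    return ok ? 0 : -1;
}
```

Calling cancel_worker_demo() from a program built with -pthread should
return 0 whether the cancel arrives before the worker takes the lock or
while it is already blocked in pthread_cond_wait. Without the cleanup
handler, a thread cancelled inside the wait would exit still holding
pending_lock.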