[libtorrent] Deadlock with Deluge
minus
2016-09-20 16:22:06 UTC
Hello everyone,

after upgrading my Deluge (develop branch) and libtorrent (1.0.9 to
1.1.1 on Arch Linux) recently I ran into a deadlock.
(After upgrading I started Deluge and Ctrl-C'd it during loading, so
that could have caused the situation.)
Deluge loads a couple of torrents (~20 out of ~800) and then gets stuck.
Running strace on the process shows it hanging here:

...
[pid 21341] stat("/some/file", {st_mode=S_IFREG|0644, st_size=113172,
...}) = 0
[pid 21341] stat("/another/file", {st_mode=S_IFREG|0644, st_size=13411,
...}) = 0
[pid 21341] futex(0x557baca7e36c, FUTEX_WAIT_PRIVATE, 39, NULL
<unfinished ...>


Attaching gdb and running /info threads/ I always get this situation:
(gdb) info threads
Id Target Id Frame
1 Thread 0x7fc41886a380 (LWP 21332) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
2 Thread 0x7fc40f2db700 (LWP 21339) "deluged" 0x00007fc41806e4f7 in
do_futex_wait.constprop ()
from /usr/lib/libpthread.so.0
* 3 Thread 0x7fc40eada700 (LWP 21340) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
4 Thread 0x7fc40e2d9700 (LWP 21341) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
5 Thread 0x7fc40dad8700 (LWP 21342) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
6 Thread 0x7fc40d2d7700 (LWP 21343) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
7 Thread 0x7fc40cad6700 (LWP 21344) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0

where there'd usually be an epoll_wait call among them.
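
To make the "no epoll_wait among them" observation easy to check on a longer listing, here is a minimal sketch that tallies the top frame of each thread from `info threads` output. The regular expression and the function names (`frames_by_lwp`, `summarize`) are my own assumptions for illustration, not part of gdb or libtorrent, and the pattern is only fitted to the wrapped output format shown above.

```python
import re
from collections import Counter

# One entry of `info threads`; \s+ also matches the line wraps seen above.
THREAD_RE = re.compile(
    r'\(LWP (\d+)\)\s+"[^"]*"\s+0x[0-9a-f]+\s+in\s+(\S+)\s+\('
)


def frames_by_lwp(info_threads_text):
    """Map each LWP id to the function named in its top frame."""
    result = {}
    for lwp, func in THREAD_RE.findall(info_threads_text):
        # Drop the symbol-version suffix, e.g. "@@GLIBC_2.3.2".
        result[int(lwp)] = func.split("@@")[0]
    return result


def summarize(info_threads_text):
    """Count how many threads sit in each top-frame function."""
    return Counter(frames_by_lwp(info_threads_text).values())


# First three entries copied from the listing above.
SAMPLE = '''
1 Thread 0x7fc41886a380 (LWP 21332) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
2 Thread 0x7fc40f2db700 (LWP 21339) "deluged" 0x00007fc41806e4f7 in
do_futex_wait.constprop ()
from /usr/lib/libpthread.so.0
* 3 Thread 0x7fc40eada700 (LWP 21340) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
'''

if __name__ == "__main__":
    for func, count in summarize(SAMPLE).items():
        print(func, count)
```

If `epoll_wait` is missing from the summary, no thread is sitting in an epoll_wait call, which matches what the listing shows.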

Since the issue goes away if I wipe my Deluge state, I can't reproduce
it without that state, so my main question is: how do I debug this? The
Deluge instance runs on a remote machine and libtorrent is compiled with
debug symbols there, so I can even get backtraces (although it's
terribly slow).

Thanks for reading and best regards,
minus

------------------------------------------------------------------------------
Arvid Norberg
2016-10-02 16:27:56 UTC
Post by minus
Hello everyone,
after upgrading my Deluge (develop branch) and libtorrent (1.0.9 to
1.1.1 on Arch Linux) recently I ran into a deadlock.
(After upgrading I started Deluge and Ctrl-C'd it during loading, so
that could have caused the situation.)
Deluge loads a couple of torrents (~20 out of ~800) and then gets stuck.
[...]
(gdb) info threads
Id Target Id Frame
1 Thread 0x7fc41886a380 (LWP 21332) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
2 Thread 0x7fc40f2db700 (LWP 21339) "deluged" 0x00007fc41806e4f7 in
do_futex_wait.constprop ()
from /usr/lib/libpthread.so.0
* 3 Thread 0x7fc40eada700 (LWP 21340) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
4 Thread 0x7fc40e2d9700 (LWP 21341) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
5 Thread 0x7fc40dad8700 (LWP 21342) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
6 Thread 0x7fc40d2d7700 (LWP 21343) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0
7 Thread 0x7fc40cad6700 (LWP 21344) "deluged" 0x00007fc41806c10f in
pthread_cond_wait@@GLIBC_2.3.2 ()
from /usr/lib/libpthread.so.0

The best way to understand a deadlock (if that is in fact what's causing
this freeze) is to work out which mutexes and threads are involved. If
you could dump the stack trace of every thread, I think it should be
possible to see which mutexes each thread is holding and which ones
it is waiting on.
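
As a rough illustration of that reasoning: once the backtraces tell you which mutexes each thread holds and which one it is blocked on, a deadlock shows up as a cycle in the resulting wait-for graph. The sketch below assumes you have already extracted that information by hand; the thread ids and mutex names in the usage example are hypothetical and not taken from the actual traces. It only models mutex waits, so threads parked in a condition-variable wait (as most are in the listing above) would need to be interpreted separately.

```python
def find_deadlock_cycle(holds, waits_for):
    """Return a list of threads forming a wait cycle, or None.

    holds:     thread id -> set of mutex names that thread holds
    waits_for: thread id -> mutex name that thread is blocked on
    """
    # Invert `holds` so each mutex maps to the thread that owns it.
    owner = {m: t for t, mutexes in holds.items() for m in mutexes}

    for start in waits_for:
        path = []
        seen = set()
        t = start
        # Follow "waits on a mutex owned by" edges until they end or repeat.
        while t in waits_for and t not in seen:
            seen.add(t)
            path.append(t)
            t = owner.get(waits_for[t])
        if t in seen:
            # The walk came back to a thread already on the path: a cycle.
            return path[path.index(t):]
    return None


if __name__ == "__main__":
    # Hypothetical data: thread 2 holds A and wants B, thread 4 the reverse.
    holds = {2: {"A"}, 4: {"B"}}
    waits_for = {2: "B", 4: "A", 1: "A"}
    print(find_deadlock_cycle(holds, waits_for))
```

A thread that merely waits on a mutex inside the cycle (thread 1 in the example) is blocked by the deadlock but not part of it, which is why only the cycle itself is returned.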

Would you mind reproducing this and posting the output from:

(gdb) thread apply all bt full -n

(there's going to be quite a lot of output, so you may want to make sure
you get it into a terminal that you can easily copy multiple screens out
of).
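
One way to avoid copying multiple screens out of a terminal is to run gdb non-interactively and send everything to a file. This is only a sketch: it relies on gdb's `-p`, `-batch` and `-ex` options, while the helper names, the pid and the output path are placeholders of mine. The gdb command is a parameter, so the exact command requested above can be passed in instead of the default.

```python
import subprocess


def gdb_backtrace_command(pid, gdb_cmd="thread apply all bt full"):
    """Build a gdb invocation that attaches to `pid`, runs `gdb_cmd`, then exits."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the -ex commands, then quit
        "-ex", "set pagination off",    # never stop to ask for a keypress
        "-ex", gdb_cmd,
    ]


def dump_backtraces(pid, out_path, gdb_cmd="thread apply all bt full"):
    """Write gdb's output (stdout and stderr) for `pid` to `out_path`."""
    with open(out_path, "w") as out:
        subprocess.run(
            gdb_backtrace_command(pid, gdb_cmd),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=True,
        )


if __name__ == "__main__":
    # Placeholder pid; printing the command does not attach to anything.
    print(" ".join(gdb_backtrace_command(12345)))
```

Calling `dump_backtraces(pid, "backtraces.txt")` with the deluged pid would then leave the full output in a file that can be attached to a reply. Attaching usually requires the same user or root, depending on the system's ptrace settings.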

thanks!
--
Arvid Norberg