Maj*_*han 3 c++ linux multithreading
All of my threads get stuck at one point. The thread listing at that moment is:
(gdb) info threads
9 Thread 0x7fa872994700 (LWP 10301) 0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
8 Thread 0x7fa87379c700 (LWP 10302) 0x000000327b2accdd in nanosleep () from /lib64/libc.so.6
7 Thread 0x7fa871b7c700 (LWP 10303) 0x000000327b2db74d in read () from /lib64/libc.so.6
6 Thread 0x7fa87117b700 (LWP 10306) 0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
5 Thread 0x7fa864e14700 (LWP 10307) 0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
4 Thread 0x7fa85ffff700 (LWP 10308) 0x000000327b2db7ad in write () from /lib64/libc.so.6
3 Thread 0x7fa85f5fe700 (LWP 10309) 0x000000327b60e264 in __lll_lock_wait () from /lib64/libpthread.so.0
2 Thread 0x7fa85ebfd700 (LWP 10311) 0x000000327b2accdd in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x7fa87379e720 (LWP 10300) 0x000000327b60822d in pthread_join () from /lib64/libpthread.so.0
I am trying to work out whether this is caused by my code or by the system configuration. The program works on every other machine; the problem occurs on only one machine, on every run. That machine's configuration is:
bash-4.1$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
bash-4.1$ uname -a
Linux localhost 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
bash-4.1$ rpm -qa |grep glibc
glibc-devel-2.12-1.132.el6.x86_64
glibc-2.12-1.132.el6.x86_64
glibc-common-2.12-1.132.el6.x86_64
glibc-headers-2.12-1.132.el6.x86_64
For reference, here is the configuration of a machine where the threads do not get stuck (works fine):
> cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.3 (Santiago)
> uname -a
Linux localhost 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> rpm -qa |grep glibc
glibc-headers-2.12-1.80.el6.x86_64
compat-glibc-headers-2.5-46.2.x86_64
compat-glibc-2.5-46.2.x86_64
glibc-devel-2.12-1.80.el6.x86_64
glibc-common-2.12-1.80.el6.x86_64
glibc-2.12-1.80.el6.i686
glibc-devel-2.12-1.80.el6.i686
glibc-2.12-1.80.el6.x86_64
As suggested in this answer (/sf/answers/244391311/), look at the backtrace of each thread that is waiting:
(gdb) thr 9
(gdb) bt
#0 0x00007f5e45c553dd in __lll_lock_wait () at /lib64/libpthread.so.0
#1 0x00007f5e45c4e7d4 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2 0x00007f5e458cc84f in gst_element_set_state_func (element=0x7f5d94461ca0, state=GST_STATE_READY) at gstelement.c:2831
Go to the stack frame that is trying to lock the mutex, then look inside the mutex for the thread id of the current owner.
(gdb) f 2 # look at frame 2, as an example
#2 0x00007f5e458cc84f in gst_element_set_state_func (element=0x7f5d94461ca0, state=GST_STATE_READY)
at gstelement.c:2831
2831 GST_STATE_LOCK (element);
Find the symbol of the mutex it is trying to lock and print its contents:
(gdb) p element.state_lock
$3 = {p = 0x7f5d0c03f2a0, i = {0, 0}}
(gdb) p *(struct __pthread_mutex_s *)element.state_lock.p
$6 = {__lock = 2, __count = 1, __owner = 11889, __nusers = 1, __kind = 1, __spins = 0, __elision = 0,
__list = {__prev = 0x0, __next = 0x0}}
If you have no symbols but do have the address, you can print the same fields by examining memory:
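The `__owner` field printed above can also be observed from inside a program. The following is only a sketch: it assumes glibc on Linux, where `pthread_mutex_t` exposes the internal `__data.__owner` member and records the kernel thread id (the LWP shown by gdb) for a default mutex. The helper name `owner_matches_caller` is made up for illustration, and the layout is a glibc implementation detail, not a documented API.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: true if the owner recorded inside the mutex equals the
// calling thread's kernel thread id (the LWP number that gdb prints).
// ASSUMPTION: glibc's internal layout (m.__data.__owner); not portable.
inline bool owner_matches_caller(pthread_mutex_t& m) {
    const pid_t tid = static_cast<pid_t>(syscall(SYS_gettid));
    return m.__data.__owner == tid;
}
```

Calling this on a mutex right after the current thread has locked it is expected to return true on such a system, which is the same relationship gdb shows between `__owner` and the LWP column of `info threads`.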
(gdb) x/4x 0x7f5d0c03f2a0 # address of the mutex
0x7f5d0c03f2a0: 0x00000002 0x00000001 0x00002e71 0x00000001
(gdb) p 0x2e71
$7 = 11889
On current versions of Linux pthreads, the owner is the third value. As with LWP #10311 in the question above, look at thread 2 and see why it is blocked. In this example the owner is LWP #11889, which is thread 18.
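To make the raw dump easier to read, the four words can be mapped onto named fields. This is a minimal sketch that assumes the field order shown in the gdb print of `struct __pthread_mutex_s` earlier (`__lock`, `__count`, `__owner`, `__nusers`) on x86_64 glibc; `MutexHeader` and `decode_mutex_words` are hypothetical names, not part of any library.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// First four 32-bit fields of the mutex, in the order gdb printed them.
// ASSUMPTION: x86_64 glibc layout; other platforms may differ.
struct MutexHeader {
    std::int32_t  lock;    // __lock
    std::uint32_t count;   // __count
    std::int32_t  owner;   // __owner: LWP id of the thread holding the mutex
    std::uint32_t nusers;  // __nusers
};

// Hypothetical helper: maps the four words printed by `x/4x <addr>` to fields.
constexpr MutexHeader decode_mutex_words(const std::array<std::uint32_t, 4>& w) {
    return MutexHeader{
        static_cast<std::int32_t>(w[0]),
        w[1],
        static_cast<std::int32_t>(w[2]),
        w[3],
    };
}

// Compile-time check against the dump above: 0x2e71 is 11889 in decimal.
static_assert(decode_mutex_words({0x2u, 0x1u, 0x2e71u, 0x1u}).owner == 11889,
              "third word is the owner LWP");
```

The decoded `owner` is the number to look for in the LWP column of `info threads`.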
(gdb) info thr
[ ... ]
18 Thread 0x7f5dc9dff700 (LWP 11889) "task114" 0x00007f5e45c5203c in pthread_cond_wait@@GLIBC_2.3.2
(gdb) thr 18
(gdb) bt
#0 0x00007f5e45c5203c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
[ ... ]