Android NDK mutex locking

foo*_*o64 5 android mutex pthreads android-ndk

I've been porting a cross-platform C++ engine to Android, and noticed that it will inexplicably (and inconsistently) block when calling pthread_mutex_lock. This engine has been working for many years on several platforms, and the problematic code hasn't changed in years, so I doubt it's a deadlock or otherwise buggy code. It must be my port to Android.

So far there are several places in the code that block on pthread_mutex_lock. It isn't entirely reproducible either. When it hangs, there's no suspicious output in LogCat.

I modified the mutex code like this (edited for brevity... real code checks all return values):

void MutexCreate( Mutex* m )
{
#ifdef WINDOWS
    InitializeCriticalSection( m );
#else // ANDROID
    pthread_mutex_init( m, NULL );
#endif
}


void MutexDestroy( Mutex* m )
{
#ifdef WINDOWS
    DeleteCriticalSection( m );
#else // ANDROID
    pthread_mutex_destroy( m, NULL );
#endif
}

void MutexLock( Mutex* m )
{
#ifdef WINDOWS
    EnterCriticalSection( m );
#else // ANDROID
    pthread_mutex_lock( m );
#endif
}

void MutexUnlock( Mutex* m )
{
#ifdef WINDOWS
    LeaveCriticalSection( m );
#else // ANDROID
    pthread_mutex_unlock( m );
#endif
}

I tried modifying MutexCreate to make error-checking and recursive mutexes, but it didn't matter. I wasn't even getting errors or log output either, so either that means my mutex code is just fine, or the errors/logs weren't being shown. How exactly does the OS notify you of bad mutex usage?

The engine makes heavy use of static variables, including mutexes. I can't see how, but is that a problem? I doubt it because I modified lots of mutexes to be allocated on the heap instead, and the same behavior occurred. But that may be because I missed some static mutexes. I'm probably grasping at straws here.

I read several references including:

http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html

http://www.embedded-linux.co.uk/tutorial/mutex_mutandis

http://linux.die.net/man/3/pthread_mutex_init

Android NDK Mutex

Android NDK problem pthread_mutex_unlock issue

fad*_*den 2

The "errorcheck" mutexes will check a couple of things (like attempts to use a non-recursive mutex recursively), but nothing spectacular.

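You said "real code checks all return values", so presumably your code explodes if any pthread call returns a nonzero value. (Not sure why your pthread_mutex_destroy takes two arguments; I'm assuming a copy-and-paste error.)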
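To make the point about return values concrete, here is a minimal sketch of how misuse of an error-checking mutex surfaces, assuming a POSIX-conforming pthread implementation. The helper names (MutexCreateErrorCheck, RelockResult, UnlockUnownedResult) are hypothetical and not from the engine in the question. Nothing is logged: the only notification is the error code returned by the call.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: initialise `m` as an error-checking mutex.
// Returns 0 on success, otherwise the pthread error code.
int MutexCreateErrorCheck(pthread_mutex_t* m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Locks an error-checking mutex twice from the same thread and returns
// the result of the second lock. POSIX specifies EDEADLK here instead of
// a silent hang.
int RelockResult()
{
    pthread_mutex_t m;
    int rc = MutexCreateErrorCheck(&m);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);
    assert(rc == 0);

    int second = pthread_mutex_lock(&m);

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}

// Unlocks an error-checking mutex that the calling thread does not hold
// and returns the result. POSIX specifies EPERM here.
int UnlockUnownedResult()
{
    pthread_mutex_t m;
    int rc = MutexCreateErrorCheck(&m);
    assert(rc == 0);

    int result = pthread_mutex_unlock(&m);

    pthread_mutex_destroy(&m);
    return result;
}
```

If wrappers like MutexLock and MutexUnlock discard these codes, the misuse goes unnoticed, so the codes have to be checked or logged explicitly.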

The pthread code is widely used within Android and has no known hangs, so the issue is unlikely to be in the pthread implementation itself.

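The current implementation of mutexes fits in 32 bits, so if you print *(pthread_mutex_t* mut) as an integer you should be able to figure out what state it's in (technically, what state it was in at some point in the past). The definition in bionic/libc/bionic/pthread.c is: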

/* a mutex is implemented as a 32-bit integer holding the following fields
 *
 * bits:     name     description
 * 31-16     tid      owner thread's kernel id (recursive and errorcheck only)
 * 15-14     type     mutex type
 * 13        shared   process-shared flag
 * 12-2      counter  counter of recursive mutexes
 * 1-0       state    lock state (0, 1 or 2)
 */

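"Fast" mutexes have a type of 0 and don't set the tid field. In fact, a generic mutex will have a value of 0 (not held), 1 (held), or 2 (held, with contention). If you ever see a fast mutex whose value isn't one of those, chances are something came along and stomped on it.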
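The bit layout quoted above can be unpacked with a few shifts and masks. The following is a sketch only: the struct and function names are made up for illustration, and the layout is taken from the bionic comment quoted in this answer, so it may not match other bionic versions or other C libraries.

```cpp
#include <cstdint>

// Fields of the 32-bit mutex word, following the quoted bionic comment.
struct MutexFields {
    uint32_t tid;      // bits 31-16: owner's kernel thread id (recursive/errorcheck only)
    uint32_t type;     // bits 15-14: mutex type
    uint32_t shared;   // bit  13   : process-shared flag
    uint32_t counter;  // bits 12-2 : recursion counter
    uint32_t state;    // bits 1-0  : lock state (0, 1 or 2)
};

// Splits a raw 32-bit mutex value into its fields.
MutexFields DecodeMutexValue(uint32_t v)
{
    MutexFields f;
    f.tid     = (v >> 16) & 0xFFFFu;
    f.type    = (v >> 14) & 0x3u;
    f.shared  = (v >> 13) & 0x1u;
    f.counter = (v >> 2)  & 0x7FFu;
    f.state   =  v        & 0x3u;
    return f;
}

// For a fast mutex, the answer says the whole word should be 0 (not held),
// 1 (held) or 2 (held, with contention). Anything else suggests that the
// memory was overwritten.
bool LooksLikeValidFastMutex(uint32_t v)
{
    return v <= 2u;
}
```

For example, a raw value of 0x12344009 decodes to tid 0x1234, type 1, shared 0, counter 2 and state 1.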

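It also means that, if you configure your program to use recursive mutexes, you can see which thread holds the mutex by pulling the bits out (either by printing the mutex value when trylock indicates you're about to stall, or by dumping the state with gdb on a hung process). That, plus the output of ps -t, will let you know whether the thread that locked the mutex still exists.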
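One way to print the mutex value when trylock indicates a stall might look like the sketch below. The function names are hypothetical. Reading the first 32 bits of pthread_mutex_t relies on the bionic layout described in this answer and is not portable, and the read is unsynchronized, so the value is only a diagnostic snapshot. On Android the message would probably need to go through __android_log_print to show up in LogCat; fprintf is used here to keep the sketch self-contained.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <pthread.h>

// Copies the first 32 bits out of the mutex.
// ASSUMPTION: the mutex state lives in the first 32-bit word, as in the
// bionic implementation described above.
uint32_t RawMutexValue(const pthread_mutex_t* m)
{
    uint32_t v = 0;
    std::memcpy(&v, m, sizeof(v));
    return v;
}

// Tries to take the lock without blocking. Returns true if the lock was
// acquired. If the mutex is busy, prints its raw value and returns false,
// so the caller can decide whether to fall back to pthread_mutex_lock.
bool TryLockOrReport(pthread_mutex_t* m)
{
    int rc = pthread_mutex_trylock(m);
    if (rc == 0) {
        return true;
    }
    if (rc == EBUSY) {
        uint32_t v = RawMutexValue(m);
        // The top 16 bits are meaningful only for recursive and
        // errorcheck mutexes, per the quoted bionic comment.
        std::fprintf(stderr, "mutex %p busy: raw=0x%08x owner tid=%u\n",
                     static_cast<void*>(m),
                     static_cast<unsigned>(v),
                     static_cast<unsigned>(v >> 16));
    } else {
        std::fprintf(stderr, "mutex %p trylock failed: error %d\n",
                     static_cast<void*>(m), rc);
    }
    return false;
}
```

A MutexLock wrapper could call TryLockOrReport first and fall back to pthread_mutex_lock only when it returns false. That leaves a log line just before each potential hang, and the owner tid in that line can be compared against the output of ps -t.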