recvmsg returns EDEADLK?

cos*_*us0 3 c sockets linux linux-kernel

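I have a socket of family PF_PACKET and type SOCK_RAW. Messages are read with recvmsg() and poll(). Repeatedly and periodically, recvmsg fails with EDEADLK. I tried to debug the problem with the code below, to find out who is locking my socket descriptor and how. Does anyone know other ways to debug this situation? As I understand it: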

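EDEADLK: Resource deadlock avoided. An attempt was made to lock a system resource that would have resulted in a deadlock situation. In my case it is a locking problem on the socket file descriptor: allocating a system resource (the socket file descriptor) would have resulted in a deadlock situation. The system does not guarantee that it will notice all such situations. This error means you got lucky and the system noticed.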

error = recvmsg(fd, &msghdr, flags);
msghdr_flags = msghdr.msg_flags;

void recvmsg_errno(int fd, unsigned int msghdr_flags, int flags, int error)
{
    struct flock fl;
    memset(&fl, 0x0, sizeof(fl));

    printf("recvmsg return: %d errno: %d description: %s\n", error, errno,
                    strerror(errno));

    printf("recvmsg flags: %x %d\n", flags, flags);
    if (flags & MSG_ERRQUEUE) printf("MSG_ERRQUEUE\n");
    if (flags & MSG_OOB) printf("MSG_OOB\n");
    if (flags & MSG_PEEK) printf("MSG_PEEK\n");
    if (flags & MSG_TRUNC) printf("MSG_TRUNC\n");
    if (flags & MSG_WAITALL) printf("MSG_WAITALL\n");

    printf("msghdr flags: %x %d\n", msghdr_flags, msghdr_flags);
    if (msghdr_flags & MSG_EOR) printf("MSG_EOR\n");           //indicates end-of-record; SOCK_SEQPACKET
    if (msghdr_flags & MSG_TRUNC) printf("MSG_TRUNC\n");       //discarded datagram was larger than the buffer supplied
    if (msghdr_flags & MSG_CTRUNC) printf("MSG_CTRUNC\n");     //control data were discarded due to lack of space in the buffer
    if (msghdr_flags & MSG_OOB) printf("MSG_OOB\n");           //out-of-band data were received
    if (msghdr_flags & MSG_ERRQUEUE) printf("MSG_ERRQUEUE\n"); //no data received; extended error from the socket error queue

    flags = 0;
    flags = fcntl(fd, F_GETLK, &fl); {
            printf("F_GETLK: 0x%x\n", flags);
            printf("l_start: %llx l_len: %llx l_pid: %d l_type: %x l_whence: %x\n",
                            (unsigned long long)fl.l_start, //Starting offset for lock
                            (unsigned long long)fl.l_len,   //Number of bytes to lock
                            (int)fl.l_pid,                  //PID of process blocking our lock (F_GETLK only)
                            (unsigned int)fl.l_type,        //Type of lock: F_RDLCK, F_WRLCK, F_UNLCK
                            (unsigned int)fl.l_whence);     //How to interpret l_start: SEEK_SET, SEEK_CUR, SEEK_END
    }//Get the first lock which blocks the lock description

    flags = 0;
    flags = fcntl(fd, F_GETFD); {
            printf("F_GETFD: 0x%x\n", flags);
    }//Get the file descriptor flags

    flags = 0;
    flags = fcntl(fd, F_GETFL); {
            printf("F_GETFL: 0x%x\n", flags);
            if (flags & O_NONBLOCK) printf("O_NONBLOCK\n");
            if (!(flags & O_NONBLOCK)) printf("BLOCK\n");
            if (flags & O_APPEND) printf("O_APPEND\n");
    }//Get the file status flags and file access modes

    flags = 0;
    flags = fcntl(fd, F_GETOWN); {
            printf("F_GETOWN: %d\n", flags);
    }//Get the process or process group ID that receives SIGIO and SIGURG signals for events on this descriptor
}

Nom*_*mal 6

If recvmsg() returns 0, that does not indicate an error. It means the other end has closed the connection.

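In a follow-up comment on the question, OP mentions that they use if ((error = recvmsg()) <= 0) {return -errno;}. That is wrong. When the other end closes the connection, recvmsg() returns zero without setting errno. Even naming the variable error is misleading, because the function returns the number of bytes received (or -1 on failure).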

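In other words, OP is seeing an old, stale errno == EDEADLK left over from some other, earlier locking function that failed, and is mishandling the case where the other end closes the connection.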
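The return-value handling described above could be sketched roughly as follows. The helper name recv_one and its "-errno on failure" convention are illustrative assumptions, not OP's actual code; the point is only that errno is read solely when recvmsg() has returned -1.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: receive one message into buf.
 * Returns the number of bytes received (>= 0) on success,
 * or -errno on failure. errno is consulted only when
 * recvmsg() has actually returned -1. */
ssize_t recv_one(int fd, void *buf, size_t len, int flags)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    ssize_t n;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* Retry if a signal interrupted the call before any data arrived. */
    do {
        n = recvmsg(fd, &msg, flags);
    } while (n == -1 && errno == EINTR);

    if (n == -1)
        return -errno;  /* errno was just set by the failing call */

    return n;           /* 0 is not an error; errno is not meaningful here */
}
```

A caller would then treat a negative result as an error code, zero as "nothing received / peer closed", and a positive result as a byte count, instead of folding the zero case into the error path.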

(errno is set only when an error occurs; no library function clears it to zero, because doing so could hide errors in some cases.)
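To make the stale-errno effect concrete, here is a small sketch. The function name is made up for illustration, and it assumes a typical Linux/glibc system, where a successful recv() leaves errno untouched (the standard only promises that library functions never reset it to zero). A deliberately failing close(-1) sets errno, and a later successful recv() still shows the old value.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative helper (not from the original post): returns the errno
 * value observed right after a *successful* recv() that follows an
 * unrelated failed call, or -1 if the setup itself fails. */
int errno_after_successful_recv(void)
{
    int sv[2];
    char c;
    ssize_t n;
    int seen;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;
    if (send(sv[0], "x", 1, 0) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(-1);                  /* fails on purpose: errno becomes EBADF */

    n = recv(sv[1], &c, 1, 0);  /* succeeds and receives one byte */
    seen = errno;               /* still holds the earlier failure's code */

    close(sv[0]);
    close(sv[1]);
    return n == 1 ? seen : -1;
}
```

Under those assumptions the function is expected to return EBADF even though the recv() succeeded, which is the same mechanism by which an old EDEADLK can appear to come from recvmsg().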