hai*_* he 44 kill signals process-management sigkill
When I used killall -9 name to kill a program, its state became zombie. A few minutes later, it actually stopped. So what was happening during those few minutes?
tel*_*coM 77
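The program actually never receives the SIGKILL signal, as SIGKILL is handled entirely by the operating system/kernel.
When SIGKILL is sent to a specific process, the kernel's scheduler immediately stops giving that process any more CPU time for running user-space code. If the process has any threads executing user-space code on other CPUs/cores at the time the scheduler makes this decision, those threads will be stopped too. (In single-core systems this used to be much simpler: if the only CPU core in the system was running the scheduler, then by definition it wasn't running the process at the same time!)
If the process/thread is executing kernel code (e.g. a system call, or an I/O operation associated with a memory-mapped file) at the time of the SIGKILL, it gets a bit trickier: only some system calls are interruptible, so the kernel internally marks the process as being in a special "dying" state until the system calls or I/O operations are resolved. CPU time to resolve those will be scheduled as usual. Interruptible system calls or I/O operations will check at any suitable stopping point whether the process that called them is dying, and will exit early in that case. Uninterruptible operations will run to completion, and will check for the "dying" state just before returning to user-space code.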
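Once any in-progress kernel routines are resolved, the process state is changed from "dying" to "dead" and the kernel begins cleaning it up, similar to when a program exits normally. Once the clean-up is complete, a result code greater than 128 will be assigned (to indicate that the process was killed by a signal; see this answer for the messy details), and the process will transition into the "zombie" state. The parent of the killed process will be notified with a SIGCHLD signal.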
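To make the result code a bit more concrete, here is a minimal Python sketch (my own illustration, not part of the original answer; the function name kill_child_and_report is made up). It assumes a POSIX system. Python's subprocess module reports death-by-signal as a negative return code, and the "greater than 128" value is what a shell would conventionally show for the same event, so the sketch simply computes 128 plus the signal number.

```python
import signal
import subprocess
import sys


def kill_child_and_report():
    """Start a sleeping child, SIGKILL it, and decode how it died."""
    # The child would sleep for a minute if left alone.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)

    # wait() reaps the child; on POSIX a negative value -N means
    # "terminated by signal N".
    returncode = child.wait()
    signum = -returncode

    # Assumption: the usual shell convention of reporting 128 + N.
    shell_style_code = 128 + signum
    return signum, shell_style_code


if __name__ == "__main__":
    signum, shell_code = kill_child_and_report()
    print("killed by signal", signum, "shell would report", shell_code)
```

On a typical Linux box SIGKILL is signal 9, so a shell would report 137 for a process killed this way.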
As a result, the process itself never gets the chance to actually process the information that it has received a SIGKILL.
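One way to see this from user space: the kernel will not even let a process register a handler for SIGKILL. The following is a small Python sketch of that, under the assumption of a POSIX system (the helper name try_install_sigkill_handler is hypothetical, purely for illustration).

```python
import signal


def try_install_sigkill_handler():
    """Try to register a handler for SIGKILL and return the error raised.

    Returns None only if the registration unexpectedly succeeded.
    """
    try:
        signal.signal(signal.SIGKILL, lambda signum, frame: None)
    except (OSError, ValueError) as exc:
        # OSError: the kernel rejects handlers for SIGKILL.
        # ValueError: Python raises this when called off the main thread.
        return exc
    return None


if __name__ == "__main__":
    print("attempt to catch SIGKILL failed with:", try_install_sigkill_handler())
```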
When a process is in a "zombie" state it means the process is already dead, but its parent process has not yet acknowledged this by reading the exit code of the dead process using the wait(2) system call. Basically the only resource a zombie process is consuming any more is a slot in the process table that holds its PID, the exit code and some other "vital statistics" of the process at the time of its death.
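If you want to watch a zombie appear and disappear, here is a rough Python sketch. It is Linux-specific, since it reads the state field from /proc/PID/stat, and the helper names proc_state and observe_zombie are invented for this example. The parent deliberately delays calling wait, so the exited child sits in state Z until it is reaped.

```python
import os
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so split after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def observe_zombie(timeout=5.0):
    """Let a child exit without reaping it, then reap it."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])

    # Poll until the child has exited and shows up as a zombie ('Z').
    deadline = time.monotonic() + timeout
    state = proc_state(child.pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(child.pid)

    # Reading the exit status releases the process table slot.
    child.wait()
    gone = not os.path.exists(f"/proc/{child.pid}")
    return state, gone


if __name__ == "__main__":
    state, gone = observe_zombie()
    print("state before wait:", state, "| entry gone after wait:", gone)
```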
If the parent process dies before its children, the orphaned child processes are automatically adopted by PID #1, which has a special duty to keep calling wait(2) so that any orphaned processes won't stick around as zombies.
If it takes several minutes for a zombie process to clear, it suggests that the parent process of the zombie is struggling or not doing its job properly.
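For comparison, a parent that is "doing its job" reaps exited children regularly. Below is a minimal Python sketch of such a reaping step, assuming a POSIX system; the function name reap_exited_children is made up, and a real program might call something like it from its main loop or in response to SIGCHLD.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, raw_wait_status) tuples.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG means "do not block".
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


if __name__ == "__main__":
    for pid, status in reap_exited_children():
        if os.WIFSIGNALED(status):
            print(pid, "was killed by signal", os.WTERMSIG(status))
        elif os.WIFEXITED(status):
            print(pid, "exited with code", os.WEXITSTATUS(status))
```

A parent that never gets around to a call like this (because it is blocked, busy, or buggy) is exactly the kind that leaves zombies lying around for minutes.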
There is a tongue-in-cheek description of what to do in case of zombie problems in Unix-like operating systems: "You cannot do anything for the zombies themselves, as they are already dead. Instead, kill the evil zombie master!" (i.e. the parent process of the troublesome zombies)
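To find the "zombie master" in practice, you need the parent PID of each zombie. Here is a small Linux-only Python sketch that scans /proc for processes in state Z and reports their PPID. The function name list_zombies is hypothetical, and the field positions assume the /proc/PID/stat layout described in proc(5).

```python
import os


def list_zombies():
    """Return (pid, ppid) for every zombie process visible in /proc."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/stat") as f:
                data = f.read()
        except OSError:
            continue  # the process vanished while we were scanning
        # After the parenthesised command name come: state, ppid, ...
        fields = data.rsplit(")", 1)[1].split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z":
            zombies.append((int(entry), ppid))
    return zombies


if __name__ == "__main__":
    for pid, ppid in list_zombies():
        print(f"zombie {pid} is waiting on parent {ppid}")
```

The PPIDs printed are the processes to investigate (or restart, or kill) if the zombies don't go away on their own.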