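I have an application served by Hypnotoad, with no reverse proxy. It runs 15 workers, each allowing 2 clients. The application is started via hypnotoad in foreground mode.
I am seeing the following in log/Production.log: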
[Wed Apr 1 16:28:12 2015] [error] Worker 119914 has no heartbeat, restarting.
[Wed Apr 1 16:28:21 2015] [error] Worker 119910 has no heartbeat, restarting.
[Wed Apr 1 16:28:21 2015] [error] Worker 119913 has no heartbeat, restarting.
[Wed Apr 1 16:28:22 2015] [error] Worker 119917 has no heartbeat, restarting.
[Wed Apr 1 16:28:22 2015] [error] Worker 119909 has no heartbeat, restarting.
[Wed Apr 1 16:28:27 2015] [error] Worker 119907 has no heartbeat, restarting.
[Wed Apr 1 16:28:34 2015] [error] Worker 119905 has no heartbeat, restarting.
[Wed Apr 1 16:28:42 2015] [error] Worker 119904 has no heartbeat, restarting.
[Wed Apr 1 16:30:12 2015] [error] Worker 119912 has no heartbeat, restarting.
[Wed Apr 1 16:31:23 2015] [error] Worker 119918 has no heartbeat, restarting.
[Wed Apr 1 16:32:18 2015] [error] Worker 119911 has no heartbeat, restarting.
[Wed Apr 1 16:32:22 2015] [error] Worker 119916 has no heartbeat, restarting.
However, the workers are never actually restarted.
When I run strace, the manager process appears to be valiantly trying to kill the (now expired) workers:
Process 119878 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = 0
kill(119906, SIGKILL) = 0
kill(119917, SIGKILL) = 0
kill(119905, SIGKILL) = 0
kill(119910, SIGKILL) = 0
kill(119904, SIGKILL) = 0
kill(119914, SIGKILL) = 0
kill(119916, SIGKILL) = 0
kill(119908, SIGKILL) = 0
kill(119913, SIGKILL) = 0
kill(119915, SIGKILL) = 0
kill(119918, SIGKILL) = 0
kill(119912, SIGKILL) = 0
kill(119909, SIGKILL) = 0
kill(119911, SIGKILL) = 0
kill(119907, SIGKILL) = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)
kill(119906, SIGKILL) = 0
kill(119917, SIGKILL) = 0
kill(119905, SIGKILL) = 0
kill(119910, SIGKILL) = 0
kill(119904, SIGKILL) = 0
kill(119914, SIGKILL) = 0
kill(119916, SIGKILL) = 0
kill(119908, SIGKILL) = 0
kill(119913, SIGKILL) = 0
kill(119915, SIGKILL) = 0
kill(119918, SIGKILL) = 0
kill(119912, SIGKILL) = 0
kill(119909, SIGKILL) = 0
kill(119911, SIGKILL) = 0
kill(119907, SIGKILL) = 0
stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000^C <unfinished ...>
Process 119878 detached
How can I troubleshoot this further to determine:
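What does "Worker 31842 has no heartbeat, restarting" mean?
As long as they are accepting new connections, the worker processes of all built-in pre-forking web servers send heartbeat messages to the manager process at regular intervals, to signal that they are still responsive. A blocking operation in your application, such as an infinite loop, can prevent this and will force the affected worker to be restarted after a timeout. This timeout defaults to 20 seconds and can be extended with the attribute "heartbeat_timeout" in Mojo::Server::Prefork if your application requires it.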
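For illustration, the timeout could be raised through the application's Hypnotoad settings, roughly as sketched below. This is a minimal sketch, not a recommended configuration: the file name, the listen address, and the value of 60 seconds are assumptions, while the workers and clients values mirror the setup described in the question. The default timeout may also differ between Mojolicious versions, so check the documentation for the installed release.

```perl
# myapp.conf (hypothetical file name), loaded by the application's Config plugin.
# Hypnotoad reads its settings from the "hypnotoad" key of the app config.
{
  hypnotoad => {
    listen            => ['http://*:8080'],  # assumed address, not from the question
    workers           => 15,                 # as described in the question
    clients           => 2,                  # as described in the question
    heartbeat_timeout => 60,                 # illustrative value, in seconds
  }
};
```

A longer timeout only gives a blocked worker more time before the manager stops it. It does not remove the blocking operation itself, so finding what keeps the workers from sending heartbeats is still the real fix.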