Amazon ELB failing to serve responses

sat*_*shi 2 tomcat load-balancing amazon-web-services amazon-elb

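I have a website running on Amazon Web Services, deployed with Elastic Beanstalk on a single EC2 micro instance. It is a staging environment, and I am the only person with access to it. Using Apache JMeter, I simulate six users browsing the site, which adds up to roughly one request every 3 seconds on average (static resources such as images, CSS, and JS are served by CloudFront and generate no traffic to the EC2 instance).

The problem is that after a while (usually 30-60 minutes after the environment is set up), the website stops responding. I am sure Tomcat is still up and running, because I can see in the log (catalina.out) that the cron jobs are still being executed. It seems that only the ELB is failing to serve responses.

Looking through the logs, there are no errors at all on the Tomcat side (nothing in /opt/tomcat7/logs/tail_catalina.log or /opt/tomcat7/logs/catalina.out). As soon as the website becomes unreachable, the following errors start to appear in /etc/httpd/logs/elasticbeanstalk-error_log: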

[Thu Jun 14 20:26:42 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:26:42 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:26:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:26:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:27:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:27:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:27:43 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:27:43 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:27:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:27:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:28:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:28:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:28:42 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:28:42 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:28:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:28:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:29:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:29:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:29:42 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:29:42 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:29:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:29:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:30:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:30:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:30:43 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:30:43 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:30:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:30:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:31:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:31:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:31:43 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:31:43 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:31:50 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:31:50 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Thu Jun 14 20:32:20 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:32:20 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)

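...until the EC2 instance is eventually terminated (and a new one is started automatically).

If I make no requests (or fewer of them), the problem does not occur.

Any help is much appreciated.

Thanks!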

gab*_*rtv 7

Let me start with an assumption:

  • Your Tomcat application is supposed to be listening on 127.0.0.1:8999

If so, the log events:

[Thu Jun 14 20:26:42 2012] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 127.0.0.1:8999 (localhost) failed
[Thu Jun 14 20:26:42 2012] [error] ap_proxy_connect_backend disabling worker for (localhost)

...suggest that the application's listener has died. You can confirm this with:

curl -v http://127.0.0.1:8999/

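The curl command should return a valid HTTP response while the website is working properly, and will probably fail with "Connection refused" / "couldn't connect to host" while you are experiencing the outage. You can also check for a valid listener on the application port with: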

netstat -an | grep LISTEN | grep 8999
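Since the outage only shows up 30-60 minutes in, it may help to leave a small watcher running so you get a timestamp for the moment the listener disappears, which you can then line up against catalina.out. The following is only a sketch: the function names probe and watch_port, the 5-second default interval, and the optional iteration count are my own choices, and only port 8999 comes from your log.

```shell
# Sketch: print a timestamped line whenever the local listener changes state.
# probe and watch_port are illustrative names, not part of any tool.

# Prints "up" if anything answers HTTP on 127.0.0.1:$1, otherwise "down".
# curl exits non-zero when the connection is refused or times out.
probe() {
    if curl -s -o /dev/null --max-time 5 "http://127.0.0.1:$1/"; then
        echo up
    else
        echo down
    fi
}

# watch_port PORT INTERVAL [COUNT]
# Polls PORT every INTERVAL seconds; COUNT=0 (the default) means run forever.
watch_port() {
    port="${1:-8999}"; interval="${2:-5}"; max="${3:-0}"
    state=unknown; n=0
    while [ "$max" -eq 0 ] || [ "$n" -lt "$max" ]; do
        now=$(probe "$port")
        # Only log transitions, so the output stays short over a long run.
        if [ "$now" != "$state" ]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') port $port is $now"
            state=$now
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}
```

Running watch_port 8999 5 in a screen session (or redirected to a file) before starting the JMeter test should leave you with the exact time the port went down.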

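There are many reasons the application listener could die, including but not limited to:

  • A hard crash of the JVM (use ps to see whether the JVM process is still running)
  • A soft crash of the application (check the Tomcat application logs)
  • Running out of file descriptors (use lsof | wc -l and compare the result against ulimit -n for the application user)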
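The process and file-descriptor checks above can be roughly scripted as follows. This is a sketch under a few assumptions: a Linux host with /proc, a Tomcat JVM that can be found by matching "catalina" on its command line, and enough privileges (root or the Tomcat user) to read /proc/PID/fd. It reads the per-process numbers from /proc rather than using lsof and ulimit, and all function names are made up for illustration.

```shell
# Sketch: is the JVM alive, and how close is it to its file-descriptor limit?

# Prints the PID of the oldest process whose command line matches $1
# (default pattern "catalina" is an assumption about how Tomcat was started).
find_jvm_pid() {
    pgrep -o -f "${1:-catalina}"
}

# Prints the number of file descriptors currently open by PID $1.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Prints the soft "Max open files" limit of PID $1 as recorded by the kernel.
fd_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Prints a one-line summary, or a warning if no matching process exists.
report() {
    pid=$(find_jvm_pid "$1")
    if [ -z "$pid" ]; then
        echo "no matching JVM process found: possible hard crash"
        return 1
    fi
    echo "JVM pid $pid: $(fd_count "$pid") of $(fd_limit "$pid") file descriptors in use"
}
```

Calling report periodically during the load test would show whether the descriptor count climbs steadily toward the limit before the outage.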

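However, most failures should result in an error message being written to the JVM process's stderr, which is usually logged. That is the best place to look. If all else fails, you may want to try running the Tomcat application in the foreground with debug logging enabled.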