nginx recv() failed (104: Connection reset by peer) while reading response header from upstream

Lat*_*san 7 linux php nginx magento

I recently upgraded my Magento install from 1.5 to 1.9, and since then adding one particular product to the basket returns this error: 502 Bad Gateway

There are no log entries in the var/log/ folder.

So I checked my nginx errors and found the following entries in nginx-errors.log:

2015/04/09 10:58:03 [error] 15208#0: *3 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 46.xxx.xxx.xxx, server: dev.my-domain.co.uk, request: "POST /checkout/cart/add/uenc/aHR0cDovL2Rldi5zYWx2ZW8uY28udWsvdGludGktYmF0aC1wYWludGluZy1zb2FwLTcwbWwuaHRtbD9fX19TSUQ9VQ,,/product/15066/form_key/eYLc3lQ35BSrk6Pa/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fcgi-www-data.sock:", host: "dev.my-domain.co.uk", referrer: "http://dev.my-domain.co.uk/tinti-bath-painting-soap-70ml.html"
2015/04/09 11:04:42 [error] 15208#0: *13 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 46.xxx.xxx.xxx, server: dev.my-domain.co.uk, request: "POST /checkout/cart/add/uenc/aHR0cDovL2Rldi5zYWx2ZW8uY28udWsvdGludGktYmF0aC1wYWludGluZy1zb2FwLTcwbWwuaHRtbD9fX19TSUQ9VQ,,/product/15066/form_key/eYLc3lQ35BSrk6Pa/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fcgi-www-data.sock:", host: "dev.my-domain.co.uk", referrer: "http://dev.my-domain.co.uk/tinti-bath-painting-soap-70ml.html"
2015/04/09 11:05:03 [error] 15208#0: *16 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 46.xxx.xxx.xxx, server: dev.my-domain.co.uk, request: "POST /checkout/cart/add/uenc/aHR0cDovL2Rldi5zYWx2ZW8uY28udWsvdGludGktYmF0aC1wYWludGluZy1zb2FwLTcwbWwuaHRtbD9fX19TSUQ9VQ,,/product/15066/form_key/eYLc3lQ35BSrk6Pa/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fcgi-www-data.sock:", host: "dev.my-domain.co.uk", referrer: "http://dev.my-domain.co.uk/tinti-bath-painting-soap-70ml.html"
2015/04/09 11:12:07 [error] 15273#0: *1 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 46.xxx.xxx.xxx, server: dev.my-domain.co.uk, request: "POST /checkout/cart/add/uenc/aHR0cDovL2Rldi5zYWx2ZW8uY28udWsvdGludGktYmF0aC1wYWludGluZy1zb2FwLTcwbWwuaHRtbD9fX19TSUQ9VQ,,/product/15066/form_key/eYLc3lQ35BSrk6Pa/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fcgi-www-data.sock:", host: "dev.my-domain.co.uk", referrer: "http://dev.my-domain.co.uk/tinti-bath-painting-soap-70ml.html"

I have installed Magento on a custom LEMP stack; here are the configs:

The error only occurs when I add one specific product to the basket in the upgraded Magento, and every time it happens a core.XXXXX file (around 350 MB) appears in the public_html folder.

Any idea why my php-fpm is crashing like this? How can I find the cause and fix it?

When I run the dmesg command, these are the last entries on my Linux (CentOS) server:

php-fpm[14862]: segfault at 7fff38236ff8 ip 00000000005c02ba sp 00007fff38237000 error 6 in php-fpm[400000+325000]
php-fpm[15022]: segfault at 7fff38351ff0 ip 00000000005bf6e5 sp 00007fff38351fb0 error 6 in php-fpm[400000+325000]
php-fpm[15021]: segfault at 7fff38351ff0 ip 00000000005bf6e5 sp 00007fff38351fb0 error 6 in php-fpm[400000+325000]
php-fpm[15156]: segfault at 7fff38351ff0 ip 00000000005bf6e5 sp 00007fff38351fb0 error 6 in php-fpm[400000+325000]
php-fpm[15024]: segfault at 7fff38351ff0 ip 00000000005bf6e5 sp 00007fff38351fb0 error 6 in php-fpm[400000+325000]
php-fpm[15223]: segfault at 7fff8d1d5fd8 ip 00000000005c02ba sp 00007fff8d1d5fe0 error 6 in php-fpm[400000+325000]
php-fpm[15222]: segfault at 7fff8d1d5fd8 ip 00000000005c02ba sp 00007fff8d1d5fe0 error 6 in php-fpm[400000+325000]
php-fpm[15225]: segfault at 7fff8d1d5fd8 ip 00000000005c02ba sp 00007fff8d1d5fe0 error 6 in php-fpm[400000+325000]
php-fpm[15227]: segfault at 7fff8d1d5fd8 ip 00000000005c02ba sp 00007fff8d1d5fe0 error 6 in php-fpm[400000+325000]
php-fpm[15362]: segfault at 7fff3118afd0 ip 00000000005c0ace sp 00007fff3118afa0 error 6 in php-fpm[400000+325000]
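One detail that can be read off these lines is how far the faulting address sits from the stack pointer. The sketch below, assuming a POSIX shell with sed and the dmesg line format shown above, prints that distance together with the error code. `fault_offset` is a hypothetical helper name, not part of any existing tool.

```shell
# fault_offset (hypothetical helper): given one "segfault at ..." line from
# dmesg, print (fault address - stack pointer) in bytes and the error code.
fault_offset() (
  addr=$(printf '%s\n' "$1" | sed -n 's/.*segfault at \([0-9a-f]*\) ip .*/\1/p')
  sp=$(printf '%s\n' "$1" | sed -n 's/.* sp \([0-9a-f]*\) error .*/\1/p')
  err=$(printf '%s\n' "$1" | sed -n 's/.* error \([0-9]*\) in .*/\1/p')
  # Give up if the line does not look like a segfault report.
  [ -n "$addr" ] && [ -n "$sp" ] || exit 1
  echo "offset=$(( 0x$addr - 0x$sp )) error=$err"
)

fault_offset 'php-fpm[14862]: segfault at 7fff38236ff8 ip 00000000005c02ba sp 00007fff38237000 error 6 in php-fpm[400000+325000]'
```

For the first line this prints `offset=-8 error=6`. On x86, error 6 indicates a user-mode write to an unmapped page, and a fault within a few bytes of sp is the pattern a stack overflow (for example runaway recursion) tends to leave, although these lines alone do not prove it.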

I analysed the core dump with gdb, and this is what I see in the first two frames: http://pastebin.com/raw.php?i=aPvB1sWv (it doesn't mean much to me)...

Tom*_*art 5

Errors like this usually occur when the server runs out of resources, assuming you are running the latest stable version of php5-fpm.

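  1. Check that php5-fpm has enough memory (i.e. that the oom-killer is not killing the process).

  2. Check that there is enough free space on disk.

  3. Check the open-file limits on the server. The hard limit (-Hn) is the one you care about most: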

    $ ulimit -Hn
    4096
    $ ulimit -Sn
    1024
    
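The first two checks in the list above can be sketched roughly as follows, assuming a Linux box with the usual grep, free and df utilities. `count_oom_lines` is a hypothetical helper, and the search patterns are an assumption about how the kernel words its OOM messages, so adjust them to what your kernel actually logs.

```shell
# count_oom_lines (hypothetical helper): reads kernel log text on stdin and
# prints how many lines mention the OOM killer. The patterns are assumptions.
count_oom_lines() {
  # grep -c prints 0 and exits non-zero when nothing matches; keep exit status 0.
  grep -ciE 'out of memory|oom-killer|oom_kill' || true
}

# Typical use (run by hand on the server):
#   dmesg | count_oom_lines   # 0 means no OOM kills in the current kernel log buffer
#   free -m                   # memory and swap, in MiB
#   df -h                     # free space per filesystem
#   df -i                     # free inodes per filesystem
```

A non-zero count is only a hint; read the matching dmesg lines to see which process was actually killed.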

Check how many file descriptors are currently open on the server:

    sysctl fs.file-nr
    fs.file-nr = 1440       0       790328

Modern servers can handle a large number of open files, and ulimits are often set unnecessarily low.
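Keep in mind that ulimit run in an interactive shell reports that shell's limits, which may differ from those of the nginx and php-fpm daemons that are already running. A minimal sketch, assuming a Linux /proc filesystem, for reading the limits of a live process; `open_file_limits` is a hypothetical helper and the process names in the usage lines are assumptions.

```shell
# open_file_limits (hypothetical helper): reads the contents of a
# /proc/<pid>/limits file on stdin and prints the soft and hard
# "Max open files" values.
open_file_limits() {
  awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }'
}

# Typical use (process names are assumptions; adjust to your setup):
#   open_file_limits < "/proc/$(pgrep -o nginx)/limits"
#   open_file_limits < "/proc/$(pgrep -o php-fpm)/limits"
```

If these values are lower than what ulimit shows in your shell, the daemon was most likely started with its own limits, and that is where they need to be raised.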

Then check nginx.conf; near the top there should be something like:

    worker_processes 4;
    events {
      worker_connections 1024;
    }

If you proxy requests, each connection needs 2 file handles. That means that with many connections you can hit the limit very quickly.

nginx has a worker_rlimit_nofile directive that sets the open-file limit for each worker process (it is a top-level directive, like worker_processes 4;):

    worker_rlimit_nofile    1024;

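Just do the math and work out how many open file descriptors you need when all connections are in use (a somewhat extreme case). Also take into account all the other services running on that server.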
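That calculation can be sketched as below, using the example values from the nginx.conf snippet above and the figure of 2 file handles per proxied connection. `fd_budget` is a hypothetical helper, and the result is a worst-case estimate for nginx alone, not a measurement.

```shell
# fd_budget (hypothetical helper): worst-case file descriptor estimate.
#   $1 = worker_processes, $2 = worker_connections,
#   $3 = handles per connection (defaults to 2, the figure for proxied requests)
fd_budget() (
  workers=$1
  connections=$2
  per_conn=${3:-2}
  per_worker=$(( connections * per_conn ))
  echo "per_worker=$per_worker total=$(( workers * per_worker ))"
)

fd_budget 4 1024
```

With those values this prints `per_worker=2048 total=8192`, so under this worst-case assumption both the example worker_rlimit_nofile of 1024 and the 4096 hard limit from the ulimit output would be too low.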