Nginx reverse proxy low performance

Question

Alm*_*mog 0 nginx performance-testing tomcat7 centos7 nginx-reverse-proxy

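We are trying to configure Nginx for two purposes:

Act as a reverse proxy that forwards requests to a local Tomcat server (port 443 to 10443, the port Tomcat listens on)
Mirror requests to a backend server for analysis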

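Because we saw very low performance with the default configuration plus the mirror directive, we decided to try the reverse proxy on its own to check its impact on the server. It does indeed look as if nginx is capping the traffic at roughly half (we use Locust and JMeter as load tools).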

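Nginx version: 1.19.4

We worked through "10 Tips for 10x Application Performance" and "Tuning NGINX for Performance", to no avail. The machine running nginx and Tomcat should be strong enough (EC2 c5.4XLarge); we see no shortage of resources, but rather what looks like network capping. The number of TIME_WAIT connections is very high (20k-40k).

From the machine perspective:

Increased the local port range (1024 65300)
Lowered tcp_fin_timeout (15ms)
Raised the maximum number of file descriptors to the maximum

From the Nginx perspective (the full nginx.conf follows):

keepalive_requests 100000;
keepalive_timeout 1000;
worker_processes 10 (16 is the number of CPUs)
worker_connections 3000;
worker_rlimit_nofile 100000;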

nginx.conf：

user  nginx;
worker_processes 10;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


worker_rlimit_nofile 100000;
events {
   worker_connections  3000;
}


http {
   include       /etc/nginx/mime.types;
   default_type  application/octet-stream;

   log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for"';

   log_format  main_ext  '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for" '
                  '"$host" sn="$server_name" '
                  'rt=$request_time '
                  'ua="$upstream_addr" us="$upstream_status" '
                  'ut="$upstream_response_time" ul="$upstream_response_length" '
                  'cs=$upstream_cache_status' ;


   keepalive_requests 100000;
   keepalive_timeout 1000;

   ssl_session_cache  shared:SSL:10m;

   sendfile        on;
   #tcp_nopush     on;

   #gzip  on;

   include /etc/nginx/conf.d/*.conf;

   upstream local_host {
       server 127.0.0.1:10443;
       keepalive 128;
   }

   server {
      listen 443 ssl;

      ssl_certificate /etc/ssl/nginx/crt.pem;
      ssl_certificate_key /etc/ssl/nginx/key.pem;

      location / {
          # mirror /mirror;
          proxy_set_header Host $host;
          proxy_pass https://local_host$request_uri;
      }

      # Mirror configuration
      location = /mirror {
          internal;
          proxy_set_header Host test-backend-dns;
          proxy_http_version 1.1;
          proxy_set_header Connection "";
          proxy_connect_timeout 3s;
          proxy_read_timeout 100ms;
          proxy_send_timeout 100s;
          proxy_pass https://test-backend-ip:443$request_uri;
      }
   }
}

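We also use the Amplify agent for monitoring. The connection count matches the expected number of requests and connections, but the actual request count is low.

[Image: Amplify monitor output]

This seems like a simple task for Nginx, but something is misconfigured. Thank you for your answers.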

Answer 1

Alm*_*mog 6

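After many attempts and different ways of investigating, we came to the conclusion that the application's response time was higher when going through nginx.

Our assumption, and the way we eventually overcame the issue, was SSL termination. It is an expensive operation in terms of both resources and time.

What we did was make nginx (which is more than capable of handling a much higher load than what we hit it with, ~4k RPS) solely responsible for SSL termination, and we changed the Tomcat application configuration so that it listens for HTTP requests rather than HTTPS. This dramatically reduced the number of TIME_WAIT connections that had been piling up and consuming important server resources.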
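A minimal sketch of that arrangement, with nginx terminating TLS and proxying plain HTTP to Tomcat. The upstream name `tomcat_http` is hypothetical, port 8080 is taken from the Tomcat connector in this answer, and the certificate paths are copied from the question. The `proxy_http_version 1.1` and empty `Connection` header lines are an assumption: the answer does not say they were used, but nginx only reuses connections from the upstream `keepalive` pool when both are set, and connection reuse is one of the things that keeps the TIME_WAIT count down.

```nginx
# Sketch only: belongs inside the http { } block of nginx.conf.
upstream tomcat_http {              # hypothetical name
    server 127.0.0.1:8080;          # Tomcat's plain-HTTP connector
    keepalive 128;                  # idle upstream connections cached per worker
}

server {
    listen 443 ssl;                 # nginx alone terminates TLS

    ssl_certificate     /etc/ssl/nginx/crt.pem;
    ssl_certificate_key /etc/ssl/nginx/key.pem;

    location / {
        proxy_set_header Host $host;

        # Assumption: required for the upstream keepalive pool to be used.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Plain HTTP to Tomcat, so no second TLS handshake per request.
        proxy_pass http://tomcat_http;
    }
}
```

With no URI part in `proxy_pass`, nginx forwards the original request URI unchanged, so `$request_uri` does not need to be appended.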

Final configuration for nginx, Tomcat and the kernel:

Linux machine configuration:

- /proc/sys/net/ipv4/ip_local_port_range - set to 1024 65535
  (allows more ports hence ---> more connections)
- sysctl net.ipv4.tcp_timestamps=1
  ("..reduce performance spikes related to timestamp generation..")
- sysctl net.ipv4.tcp_tw_recycle=0
  (This worked for us. Should be tested with/without tcp_tw_reuse)
- sysctl net.ipv4.tcp_tw_reuse=1
  (Same as tw_recycle)
- sysctl net.ipv4.tcp_max_tw_buckets=10000
  (self explanatory)

Red Hat's explanation of the tcp_timeouts configuration

Tomcat configuration:

<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
          maxThreads="4000"
          minSpareThreads="10"
 />


 <!-- A "Connector" using the shared thread pool - NO SSL -->
  <Connector executor="tomcatThreadPool"
          port="8080" protocol="HTTP/1.1"
          connectionTimeout="20000"
          acceptCount="5000"
          pollerThreadCount="16"
          acceptorThreadCount="16"
          redirectPort="8443"
  />

Nginx-specific performance parameters:

main directive:
- worker_processes auto;
- worker_rlimit_nofile 100000;

events directive:
- worker_connections  10000; (we think can be lower)
- multi_accept on;

http directive:
- keepalive_requests 10000;
- keepalive_timeout 10s;
- access_log off;
- ssl_session_cache   shared:SSL:10m;
- ssl_session_timeout 10m;

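What really helps is understanding both sides of the equation: Nginx and Tomcat.

We used JMX metrics to understand what was happening on Tomcat, along with Prometheus metrics from our application, and the Amplify agent to monitor nginx behavior.

Hope this helps someone.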

Archived:	5 years, 1 month ago
Views:	7201
Last activity:	5 years ago