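We recently had a traffic spike which, although modest in size, caused haproxy to max out one CPU core (and the server became unresponsive). I suspect my configuration is doing something inefficient, so I'd like to ask the haproxy experts here to critique the config file below, mainly from a performance perspective.
The config is meant to distribute traffic across a group of http app servers, a group of servers handling websocket connections (with several separate processes on different ports), and a static-file web server. Performance problems aside, it works well. (Some details have been redacted.)
Any guidance you can offer would be much appreciated!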
HAProxy v1.4.8
```
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    daemon
    maxconn 100000
    log 127.0.0.1 local0 notice

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    log global
    mode http
    option httplog
    option httpclose #http://serverfault.com/a/104782/52811
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 5h #long timeouts to stop WS drops - when v1.5 is stable, use 'timeout tunnel';

#---------------------------------------------------------------------
# FRONTEND
#---------------------------------------------------------------------
frontend public
    bind *:80
    maxconn 100000
    reqidel ^X-Forwarded-For:.* #Remove any x-forwarded-for headers
    option forwardfor #Set the forwarded for header (needs option httpclose)
    default_backend app
    redirect prefix http://xxxxxxxxxxxxxxxxx code 301 if { hdr(host) -i www.xxxxxxxxxxxxxxxxxxx }
    timeout client 5h #long timeouts to stop WS drops - when v1.5 is stable, use 'timeout tunnel';

    # ACLs
    ##########
    acl static_request hdr_beg(host) -i i.
    acl static_request hdr_beg(host) -i static.
    acl static_request path_beg /favicon.ico /robots.txt
    acl test_request hdr_beg(host) -i test.
    acl ws_request hdr_beg(host) -i ws
    # ws11
    acl ws11x1_request hdr_beg(host) -i ws11x1
    acl ws11x2_request hdr_beg(host) -i ws11x2
    acl ws11x3_request hdr_beg(host) -i ws11x3
    acl ws11x4_request hdr_beg(host) -i ws11x4
    acl ws11x5_request hdr_beg(host) -i ws11x5
    acl ws11x6_request hdr_beg(host) -i ws11x6
    # ws12
    acl ws12x1_request hdr_beg(host) -i ws12x1
    acl ws12x2_request hdr_beg(host) -i ws12x2
    acl ws12x3_request hdr_beg(host) -i ws12x3
    acl ws12x4_request hdr_beg(host) -i ws12x4
    acl ws12x5_request hdr_beg(host) -i ws12x5
    acl ws12x6_request hdr_beg(host) -i ws12x6

    # Which backend....
    ###################
    use_backend static if static_request
    #ws11
    use_backend ws11x1 if ws11x1_request
    use_backend ws11x2 if ws11x2_request
    use_backend ws11x3 if ws11x3_request
    use_backend ws11x4 if ws11x4_request
    use_backend ws11x5 if ws11x5_request
    use_backend ws11x6 if ws11x6_request
    #ws12
    use_backend ws12x1 if ws12x1_request
    use_backend ws12x2 if ws12x2_request
    use_backend ws12x3 if ws12x3_request
    use_backend ws12x4 if ws12x4_request
    use_backend ws12x5 if ws12x5_request
    use_backend ws12x6 if ws12x6_request

#---------------------------------------------------------------------
# BACKEND - APP
#---------------------------------------------------------------------
backend app
    timeout server 50000ms #To counter the WS default
    mode http
    balance roundrobin
    option httpchk HEAD /upchk.txt
    server app1 app1:8000 maxconn 100000 check
    server app2 app2:8000 maxconn 100000 check
    server app3 app3:8000 maxconn 100000 check
    server app4 app4:8000 maxconn 100000 check

#---------------------------------------------------------------------
# BACKENDs - WS
#---------------------------------------------------------------------
#Server ws11
backend ws11x1
    server ws11 ws11:8001 maxconn 100000
backend ws11x2
    server ws11 ws11:8002 maxconn 100000
backend ws11x3
    server ws11 ws11:8003 maxconn 100000
backend ws11x4
    server ws11 ws11:8004 maxconn 100000
backend ws11x5
    server ws11 ws11:8005 maxconn 100000
backend ws11x6
    server ws11 ws11:8006 maxconn 100000

#Server ws12
backend ws12x1
    server ws12 ws12:8001 maxconn 100000
backend ws12x2
    server ws12 ws12:8002 maxconn 100000
backend ws12x3
    server ws12 ws12:8003 maxconn 100000
backend ws12x4
    server ws12 ws12:8004 maxconn 100000
backend ws12x5
    server ws12 ws12:8005 maxconn 100000
backend ws12x6
    server ws12 ws12:8006 maxconn 100000

#---------------------------------------------------------------------
# BACKEND - STATIC
#---------------------------------------------------------------------
backend static
    server static1 static1:80 maxconn 40000
```
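100,000 connections is a lot. Are you really pushing that much traffic? If so, consider splitting the frontend so that it binds to one IP for static content and another IP for app content, then run the static and app variants as separate haproxy processes (assuming the server has a second core/CPU).
If nothing else, that will narrow the CPU usage down to either the app traffic or the static traffic.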
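A rough sketch of what the static half of that split might look like, as its own config file loaded by a second haproxy process. The file name, the pidfile path, the frontend name `public_static`, and the address 192.0.2.11 are all placeholders that do not appear in the original config; the global `maxconn 40000` simply mirrors the existing limit on `static1`.

```
# static.cfg (hypothetical file name), started separately, e.g.: haproxy -f static.cfg
global
    daemon
    maxconn 40000
    log 127.0.0.1 local0 notice
    pidfile /var/run/haproxy-static.pid #hypothetical path, one pidfile per process

defaults
    log global
    mode http
    option httplog
    option httpclose
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend public_static
    bind 192.0.2.11:80 #assumed IP dedicated to static content
    default_backend static

backend static
    server static1 static1:80 maxconn 40000
```

The app process would then keep the rest of the original config, with `bind *:80` changed to its own address and the `static` backend removed. The `i.` and `static.` hostnames would need to resolve to the static IP. The path-based rule for `/favicon.ico` and `/robots.txt` matches requests on any hostname, so it would have to stay with the app process or be handled another way.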
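If I remember my networking 101 class correctly, HAProxy should not be able to open 100,000 connections to `ws12:8001` or any other backend host:port, because the limit of ~65536 ports is in practice closer to 28232 on most systems (see `cat /proc/sys/net/ipv4/ip_local_port_range`). You may be exhausting local ports, which in turn could cause the CPU to hang while it waits for ports to be freed.
Perhaps lowering the max connections for each backend to something close to 28000 would alleviate the problem? Or widening the local port range?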
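As an illustration of the first suggestion, one of the WebSocket backends with its per-server limit capped below the ephemeral port count might look like this. The figure 28000 is the ballpark from the answer, not a measured value, and the same change would apply to every other `server` line that currently says `maxconn 100000`.

```
backend ws11x1
    #The common Linux default range 32768-60999 holds 28232 ports;
    #28000 keeps outgoing connections to ws11:8001 just under that.
    server ws11 ws11:8001 maxconn 28000
```

With a per-server `maxconn`, haproxy queues requests beyond the limit instead of opening more connections to that server, so a lower value trades port exhaustion for queueing. Widening the range instead is a kernel setting (`net.ipv4.ip_local_port_range`) rather than something in the haproxy config.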