Regex pattern for parsing the HttpLog format

Thi*_*yya 7 regex logging haproxy

I am looking for a regex pattern that matches a string in HttpLogFormat. The log is generated by haproxy. Below is a sample string in this format.

Feb 6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1"

HttpLogFormat provides a description of the format. Any help is appreciated.

I am trying to extract the individual pieces of information contained in that line. Here are the fields:

  1. process_name'['pid']:'
  2. client_ip':'client_port
  3. '['accept_date']'
  4. frontend_name
  5. backend_name'/'server_name
  6. Tq'/'Tw'/'Tc'/'Tr'/'Tt*
  7. STATUS_CODE
  8. bytes_read
  9. captured_request_cookie
  10. captured_response_cookie
  11. termination_state
  12. actconn'/'feconn'/'beconn'/'srv_conn'/'retries
  13. srv_queue'/'backend_queue
  14. '{'captured_request_headers*'}'
  15. '{'captured_response_headers*'}'
  16. '"'http_request'"'

Mik*_*ark 5

Regex:

^(\w+ \d+ \S+) (\S+) (\S+)\[(\d+)\]: (\S+):(\d+) \[(\S+)\] (\S+) (\S+)/(\S+) (\S+) (\S+) (\S+) *(\S+) (\S+) (\S+) (\S+) (\S+) \{([^}]*)\} \{([^}]*)\} "(\S+) ([^"]+) (\S+)" *$

Result:

Group 1:    Feb 6 12:14:14
Group 2:    localhost
Group 3:    haproxy
Group 4:    14389
Group 5:    10.0.1.2
Group 6:    33317
Group 7:    06/Feb/2009:12:14:14.655
Group 8:    http-in
Group 9:    static
Group 10:   srv1
Group 11:   10/0/30/69/109
Group 12:   200
Group 13:   2750
Group 14:   -
Group 15:   -
Group 16:   ----
Group 17:   1/1/1/1/0
Group 18:   0/0
Group 19:   1wt.eu
Group 20:   
Group 21:   GET
Group 22:   /index.html
Group 23:   HTTP/1.1
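If you want to apply the pattern from code, here is a minimal sketch in Python. The language choice is only an assumption, since the question does not name one. The field labels in `FIELDS` and the helper `parse_line` are hypothetical names made up for illustration; the regex itself is unchanged, just split across three string literals for readability.

```python
import re

# Same pattern as above, split across adjacent raw strings.
HAPROXY_HTTP_LOG = re.compile(
    r'^(\w+ \d+ \S+) (\S+) (\S+)\[(\d+)\]: (\S+):(\d+) \[(\S+)\] (\S+) '
    r'(\S+)/(\S+) (\S+) (\S+) (\S+) *(\S+) (\S+) (\S+) (\S+) (\S+) '
    r'\{([^}]*)\} \{([^}]*)\} "(\S+) ([^"]+) (\S+)" *$'
)

# Hypothetical labels for the 23 capture groups, in group order.
FIELDS = (
    "syslog_time", "host", "process_name", "pid",
    "client_ip", "client_port", "accept_date",
    "frontend_name", "backend_name", "server_name",
    "timers", "status_code", "bytes_read",
    "captured_request_cookie", "captured_response_cookie",
    "termination_state", "conn_counts", "queues",
    "captured_request_headers", "captured_response_headers",
    "method", "uri", "http_version",
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = HAPROXY_HTTP_LOG.match(line.rstrip("\n"))
    if match is None:
        return None
    record = dict(zip(FIELDS, match.groups()))
    # Split the compound timer field; values are kept as strings.
    record["timers"] = dict(
        zip(("Tq", "Tw", "Tc", "Tr", "Tt"), record["timers"].split("/"))
    )
    return record


SAMPLE = (
    'Feb 6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 '
    '[06/Feb/2009:12:14:14.655] http-in static/srv1 10/0/30/69/109 200 2750 '
    '- - ---- 1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1"'
)

if __name__ == "__main__":
    for name, value in parse_line(SAMPLE).items():
        print(f"{name}: {value}")
```

Lines that do not match return `None` instead of raising, so a log file with mixed content can be filtered in a simple loop. The other compound fields (connection counts, queues) can be split on `/` in the same way as the timers.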

I use RegexBuddy to write complex regular expressions.