A guide to using regular expressions in Nginx location blocks?

Question

int*_*ika 24 regex nginx regex-group nginx-location nginx-config

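Nginx regex location syntax

Regular expressions can be used in Nginx location blocks; matching is done by the PCRE engine.

Since this is not fully documented, what exactly does this feature support?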

Answer 1

int*_*ika 59

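Nginx location:

An Nginx location block has a search order, a modifier, an implicit match type, and an implicit switch that decides whether the search stops on a match. The following table summarizes them, including the regex modifiers.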


# --------------------------------------------------------------------------------------------------------------------------------------------
# Search-Order       Modifier       Description                                                        Match-Type        Stops-search-on-match
# --------------------------------------------------------------------------------------------------------------------------------------------
#     1st               =           The URI must match the specified pattern exactly                  Simple-string              Yes
#     2nd               ^~          The URI must begin with the specified pattern                     Simple-string              Yes
#     3rd             (None)        The URI must begin with the specified pattern                     Simple-string               No
#     4th               ~           The URI must be a case-sensitive match to the specified Rx      Perl-Compatible-Rx      Yes (first match)                 
#     4th               ~*          The URI must be a case-insensitive match to the specified Rx    Perl-Compatible-Rx      Yes (first match)
#     N/A               @           Defines a named location block.                                   Simple-string              Yes
# --------------------------------------------------------------------------------------------------------------------------------------------
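
To make the table concrete, here is a minimal sketch of a server block that uses one location per modifier. The port, paths, and response bodies are made up for illustration, and the block is assumed to sit inside an `http { }` context. Note that a plain prefix match is only remembered as a candidate: regex locations are still tried afterwards, in the order they appear in the file, and the first regex that matches wins.

```nginx
server {
    listen 8080;                      # hypothetical port

    # =  : exact match, only the URI "/" itself; the search stops here.
    location = / {
        return 200 "exact\n";
    }

    # ^~ : prefix match; if this is the longest matching prefix,
    #      regex locations are not checked at all.
    location ^~ /static/ {
        return 200 "static prefix\n";
    }

    # (none) : prefix match; kept as a candidate, regexes are still tried.
    location /docs/ {
        return 200 "docs prefix\n";
    }

    # ~  : case-sensitive regex.
    location ~ \.php$ {
        return 200 "php regex\n";
    }

    # ~* : case-insensitive regex.
    location ~* \.(?:jpe?g|png)$ {
        return 200 "image regex\n";
    }

    # Catch-all prefix; hands over to the named location when no file exists.
    location / {
        try_files $uri @fallback;
    }

    # @  : named location, reachable only through internal redirects.
    location @fallback {
        return 200 "fallback\n";
    }
}
```

With this sketch, `/static/app.php` is answered by the `^~` block (regexes are skipped), `/docs/index.php` by the `\.php$` regex (a regex beats a plain prefix), `/docs/readme` by the `/docs/` prefix, and `/photo.JPG` by the case-insensitive image regex.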

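Capturing groups:

Capturing groups and expression evaluation with () are supported. In this example, location ~ ^/(?:index|update)$ matches the URIs /index and /update, i.e. example.com/index and example.com/update (the group used here is a non-capturing one).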

# -----------------------------------------------------------------------------------------
# ()    : Group/Capturing-group, capturing means match and retain/output/use what matched
#         the pattern inside (). The default bracket mode is "capturing group" while (?:)
#         is a non capturing group. Example: (?:a|b) matches a or b in a non capturing mode
# -----------------------------------------------------------------------------------------
# ?:    : Non capturing group
# ?=    : Positive look ahead
# ?!    : Negative look ahead (do not match the following...)
# ?<=   : Positive look behind
# ?<!   : Negative look behind
# -----------------------------------------------------------------------------------------
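
As a sketch of how captured text can be reused, the fragment below shows numbered captures (`$1`, `$2`) in a `proxy_pass` target and a named capture (`$user_id`). The upstream name `backend_app`, its address, the port, and the URL layout are assumptions made up for this example, and the fragment is assumed to sit inside an `http { }` context.

```nginx
# Hypothetical upstream group. Defining it lets proxy_pass use variables
# without needing a resolver directive.
upstream backend_app {
    server 127.0.0.1:9000;
}

server {
    listen 8080;                      # hypothetical port

    # Numbered captures: $1 is the API version, $2 is the rest of the path.
    # When proxy_pass contains variables, the URI given here replaces the
    # original request URI, so the query string is appended explicitly.
    location ~ ^/api/v([12])/(.*)$ {
        proxy_pass http://backend_app/v$1/$2$is_args$args;
    }

    # Named capture: (?<user_id>...) becomes the variable $user_id.
    location ~ ^/users/(?<user_id>[0-9]+)/avatar$ {
        return 200 "avatar for user $user_id\n";
    }

    # Non-capturing group: groups the alternatives without creating $1.
    location ~ ^/(?:index|update)$ {
        return 200 "index or update\n";
    }
}
```

Under these assumptions, a request for `/api/v1/items?page=2` would be forwarded to the upstream as `/v1/items?page=2`, and `/users/42/avatar` would return "avatar for user 42".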
Forward slash:

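Not to be confused with the regex backslash \. In nginx, the forward slash / used as a prefix matches any sub-location, including none, for example location /. In a regex context, the following explanation applies: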

# -----------------------------------------------------------------------------------------
# /     : It doesn't actually do anything. In Javascript, Perl and some other languages,
#         it is used as a delimiter character explicitly for regular expressions.
#         Some languages like PHP use it as a delimiter inside a string,
#         with additional options passed at the end, just like Javascript and Perl.
#         Nginx does not use a delimiter; / can be escaped with \/ for code portability
#         purposes BUT this is not required, since nginx handles / literally
#         (it has no other meaning than /)
# -----------------------------------------------------------------------------------------
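Backslash:

The first purpose of the regex special character \ is to escape the next character; note, however, that \ followed by a character often has a different meaning of its own (for example \d for a digit or \w for a word character). A complete list is available here.

Nginx does not require the forward slash / to be escaped, but it does not reject an escaped one either, just as any other character can be escaped. So \/ is translated to / and matches /. One reason to escape forward slashes in an nginx context may be code portability.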
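
A small sketch of the two spellings, using a made-up path. Both patterns match the same URIs; the second is left commented out because, with two equivalent regex locations, only the first one in the file would ever be selected.

```nginx
# Unescaped forward slashes: nginx has no regex delimiter, so / is literal.
# The dot is escaped because an unescaped . would match any character.
location ~ ^/core/img/favicon\.ico$ {
    return 204;
}

# Same pattern with escaped slashes, as it might be copied from Javascript
# or PHP code. It is accepted by nginx and matches exactly the same URIs.
# location ~ ^\/core\/img\/favicon\.ico$ {
#     return 204;
# }
```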

Other regex characters

Here is a non-exhaustive list of regex elements that can be used:

# -----------------------------------------------------------------------------------------
# ~     : Enable regex mode for location (as a modifier, ~ means case-sensitive match)
# ~*    : Case-insensitive match
# |     : Or
# ()    : Match group or evaluate the content of ()
# $     : The expression must be at the end of the evaluated text
#         (no char/text after the match). $ is usually used at the end of a regex
#         location expression.
# ?     : Check for zero or one occurrence of the previous char, ex. jpe?g
# ^~    : Prefix modifier, not a regex: the URI must begin with the given string. If this
#         location is the longest matching prefix, nginx does not check regex locations
#         at all, even if another match is available (check the table above). The
#         pattern is taken literally, so regex syntax such as .* has no special
#         meaning here.
#         example (location ^~ /realestate/)
# =     : Exact match, no sub folders (location = /)
# ^     : Match the beginning of the text (opposite of $). By itself, ^ is a
#         shortcut for all paths (since they all have a beginning).
# .*    : Match zero, one or more occurrences of any char
# \     : Escape the next char
# .     : Any char
# *     : Match zero, one or more occurrences of the previous char
# !     : Not, when used inside a group as (?!...) (negative look ahead)
# {}    : Match a specific number of occurrences, ex. [0-9]{3} matches 342 but not 32
#         {2,4} matches a length of 2, 3 and 4
# +     : Match one or more occurrences of the previous char
# []    : Match any char inside
# -----------------------------------------------------------------------------------------
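
A few of the characters above, combined in a hedged sketch with made-up paths. One practical detail worth knowing: a regex that contains `{` or `}` has to be wrapped in quotes, because nginx would otherwise read the brace as the start of the location block.

```nginx
# {n} quantifier: four digits, a slash, two digits, optional trailing slash.
# The quotes are needed because the pattern contains curly braces.
location ~ "^/archive/[0-9]{4}/[0-9]{2}/?$" {
    return 200 "archive page\n";
}

# (?!...) negative look ahead: anything under /api/ except /api/internal/.
location ~ ^/api/(?!internal/) {
    return 200 "public api\n";
}

# [^/]+ negated character class: one path segment without a slash,
# followed by an image extension, matched case-insensitively.
location ~* ^/files/[^/]+\.(?:jpe?g|png)$ {
    return 200 "image file\n";
}
```

With these patterns, `/archive/2019/07` and `/archive/2019/07/` match the first block while `/archive/19/07` does not; `/api/users` matches the second block while `/api/internal/stats` does not.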
Examples:

location ~ ^/(?:index)\.php(?:$|/)

location ~ ^\/(?:core\/img\/background\.png|core\/img\/favicon\.ico)(?:$|\/)

location ~ ^/(?:index|core/ajax/update|ocs/v[12]|status|updater/.+|oc[ms]-provider/.+)\.php(?:$|/)
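
For context, here is a sketch of how the first example might look as a complete block. The FastCGI backend address is an assumption for illustration, and `fastcgi_params` is the parameter file usually shipped with nginx. The trailing `(?:$|/)` means the URI must either end right after `.php` or continue with a slash, so `/index.php` and `/index.php/some/path` match, while `/index.phpx` does not.

```nginx
location ~ ^/(?:index)\.php(?:$|/) {
    # Split "/index.php/some/path" into the script name and the path info.
    fastcgi_split_path_info ^(.+?\.php)(/.*)$;

    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_path_info;

    # Hypothetical PHP-FPM backend address.
    fastcgi_pass 127.0.0.1:9000;
}
```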

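Even the official NGINX website and documentation do not provide such a detailed and comprehensive guide to using regular expressions in NGINX. Thank you for your great effort. (10 upvotes)

One more regex character, "anything not inside": `[^xyz]` = any character that is not x, y, or z (3 upvotes)

@intika Thank you very much! Do you also have examples where capturing groups can be used? For example in a proxy_pass directive, set_request_body? (2 upvotes)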

Archived:	5 years, 9 months ago
Viewed:	28072 times
Last activity:	4 years, 7 months ago