Can anyone explain this regular expression?

Question

I just need someone to correct my understanding of this regex, which is a kind of stopgap for matching HTML tags.

< (?: "[^"]*" ['"]* | '[^']*'['"]*|[^'">])+ >

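My understanding:

< - matches the tag-opening symbol
(?: - I don't understand what is going on here. What do these symbols mean?
"[^"]*"['"]* - an arbitrary string in double quotes. Is something else going on?
'[^']*'['"]* - some string in single quotes
[^'">] - any character other than ', " or >.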

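So it is a '<' symbol, followed by a string in double quotes or in single quotes, or any other string which contains ', " or >, repeated one or more times, followed by a '>'.
That is the best I could make out.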

Answer 1

Mar*_*der 5

<       # literally just an opening tag followed by a space
(       # the bracket opens a subpattern, it's necessary as a boundary for
        # the | later on
?:      # makes the just opened subpattern non-capturing (so you can't access it
        # as a separate match later)
"       # literally "
[^"]    # any character but " (this is called a character class)
*       # arbitrarily many of those (as much as possible)
"       # literally "
['"]    # either ' or "
*       # arbitrarily many of those (and possibly alternating! it doesn't have
        # to be the same character for the whole string)
|       # OR
'       # literally '
[^']    # any character but ' (this is called a character class)
*       # arbitrarily many of those (as much as possible)
'       # literally '
['"]*   # as above
|       # OR
[^'">]  # any character but ', ", >
)       # closes the subpattern
+       # arbitrarily many repetitions but at least once
>       # closing tag
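
The breakdown above can be tried out directly. The following is a minimal sketch, assuming Python's `re` module (the question does not name a regex flavor) and assuming the literal spaces are dropped from the pattern. The names `TAG`, `TAG_CAPTURING` and `html` are made up for illustration.

```python
import re

# The pattern from the question with the literal spaces removed (an assumption).
TAG = re.compile(r"""<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>""")

# Same pattern, but with a capturing group instead of (?: ... ).
TAG_CAPTURING = re.compile(r"""<("[^"]*"['"]*|'[^']*'['"]*|[^'">])+>""")

html = '<a title="x > y" href=\'z\'>link</a>'

# Quoted attribute values are consumed as a whole by the first two
# alternatives, so the > inside "x > y" does not end the match early.
tags = TAG.findall(html)
print(tags)

# With no capturing group, findall returns the whole match.
whole = TAG.findall("<b>")
# With a capturing group, findall returns only the group's last repetition.
last_piece = TAG_CAPTURING.findall("<b>")
print(whole, last_piece)
```

The second half shows the practical effect of `?:`: the group is still needed to scope the `|` alternatives and the `+`, but it no longer shows up as a separate capture.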

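Note that every space in the regex is treated just like any other character: each one matches exactly one literal space (unless the engine is put into a free-spacing/extended mode, which ignores whitespace in the pattern).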
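
To illustrate the point about whitespace, here is a small sketch, again assuming Python's `re` module. The variable name `spaced` is hypothetical, and `re.VERBOSE` is Python's free-spacing flag; other flavors spell it differently.

```python
import re

# The pattern exactly as written in the question, spaces included.
spaced = r"""< (?: "[^"]*" ['"]* | '[^']*'['"]*|[^'">])+ >"""

# By default each space in the pattern must match a literal space,
# so an ordinary tag does not match at all.
plain = re.search(spaced, "<b>")

# Input that has a space after < and before > does match.
padded = re.search(spaced, "< b >")

# In verbose mode, whitespace outside character classes is ignored,
# so the same pattern text matches the ordinary tag.
verbose = re.search(spaced, "<b>", re.VERBOSE)

print(plain, padded.group(), verbose.group())
```

If the spaces were only added for readability, either remove them or compile the pattern in free-spacing mode.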

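Also pay special attention to the ^ at the beginning of a character class. It is not treated as a separate character; instead it negates the whole character class.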
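
A short sketch of the difference, assuming Python's `re` module (variable names are illustrative only):

```python
import re

# ^ as the first character inside [...] negates the class:
# this matches every character that is NOT a double quote.
negated = re.findall(r'[^"]', 'a"b')

# Anywhere else inside the class, ^ is an ordinary character:
# this matches a double quote or a caret.
literal = re.findall(r'["^]', 'a"^b')

print(negated, literal)
```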

I might also (obligatorily) add that regular expressions are not appropriate for parsing HTML.
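
As one possible alternative, an HTML parser can extract tags and attributes without a hand-written pattern. This is only a sketch using `html.parser.HTMLParser` from Python's standard library. The answer does not recommend a specific tool, and the class name `TagCollector` is made up for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects (tag name, attribute dict) for every start tag."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.start_tags.append((tag, dict(attrs)))


collector = TagCollector()
collector.feed('<a title="x > y" href=\'z\'>link</a>')
collector.close()
print(collector.start_tags)
```

The parser deals with quoting itself, so the > inside the attribute value needs no special handling.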
