I am trying to write a regex that captures [[xyz | asd]] but not [[xyz]] in the following text:
'''Diversity Day'''" is the second episode of the [[The Office (U.S. season 1)]|first season]] of the American [[comedy]] [[television program|television series]] ''[[The Office (U.S. TV series)|The Office]]'', and the show's second episode overall. Written by [[B. J. Novak]] and directed by [[Ken Kwapis]], it first aired in the United States on March 29, 2005, on [[NBC]]. The episode guest stars ''Office'' consulting producer [[Larry Wilmore]] as [[List_of_characters_from_The_Office_(US)#Mr._Brown|Mr. Brown]].
The following results should be captured:
[[The Office (U.S. season 1)]|first season]] <-- note the "]" before "|": in this case "]" is a literal character, not part of a closing "]]"
[[television program|television series]]
[[The Office (U.S. TV series)|The Office]]
[[List_of_characters_from_The_Office_(US)#Mr._Brown|Mr. Brown]]
What I tried is:
\[\[([^|]+)\|([^|]+)\]\]
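But I can't figure out how to exclude both "|" and "]]" inside the groups. [^|(]])] won't work, because a character class does not treat "]]" as a unit; it only excludes the single character "]" (it needs to exclude the whole sequence).
Please help, thanks!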
You can rely on a tempered greedy token here:
\[\[((?:(?!]]).)*)\|((?:(?!]]).)*)]]
See the regex demo.
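As a quick check, the pattern can be run against a shortened excerpt of the sample text. This is only a sketch in Python rather than .NET (an assumption, since the question does not name a language); re.DOTALL plays the role of RegexOptions.Singleline, and the names TEMPERED, sample and links are made up for illustration.

```python
import re

# Tempered greedy token pattern, copied from the answer.
# re.DOTALL lets "." match newlines (Python's counterpart of Singleline).
TEMPERED = re.compile(r"\[\[((?:(?!]]).)*)\|((?:(?!]]).)*)]]", re.DOTALL)

# Shortened excerpt of the sample text from the question.
sample = (
    "is the second episode of the [[The Office (U.S. season 1)]|first season]] "
    "of the American [[comedy]] [[television program|television series]] "
    "''[[The Office (U.S. TV series)|The Office]]'', written by [[B. J. Novak]]."
)

# findall returns one (group 1, group 2) tuple per match.
links = TEMPERED.findall(sample)
for target, label in links:
    print(f"{target!r} -> {label!r}")
```

Only the three piped links are reported; [[comedy]] and [[B. J. Novak]] are skipped because neither group can run past a "]]", so a required "|" is never found inside them.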
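Details:
\[\[ - two literal [ symbols
((?:(?!]]).)*) - Group 1 (note that * can be made lazy here, *?, especially if the first part is shorter than the second), capturing:
    (?:(?!]]).)* - zero or more sequences of:
        (?!]]) - a position that is not the start of a ]] sequence (i.e. if a ] is followed by another ], the token stops matching)
        . - any character (other than a newline; use RegexOptions.Singleline if the string spans multiple lines)
\| - a literal |
((?:(?!]]).)*) - Group 2, capturing the same subpattern as Group 1
]] - two literal ] symbols at the end
A more efficient "unrolled" version of this regex is: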
\[\[([^]|]*(?:](?!])[^]|]*)*)\|([^]]*(?:](?!])[^]]*)*)]]
See the regex demo. This regex treats the first | as the field separator inside the brackets. See my other answer on how to unroll a tempered greedy token.
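The difference in where the field is split only shows up when a link contains more than one |. The sketch below, again in Python as an assumption, compares the greedy tempered pattern, its lazy variant (*? in Group 1) and the unrolled pattern on a made-up input; all variable names and the test string are illustrative and not part of the original question.

```python
import re

# Patterns copied from the answer; TEMPERED_LAZY only changes Group 1's * to *?.
TEMPERED = re.compile(r"\[\[((?:(?!]]).)*)\|((?:(?!]]).)*)]]")
TEMPERED_LAZY = re.compile(r"\[\[((?:(?!]]).)*?)\|((?:(?!]]).)*)]]")
UNROLLED = re.compile(r"\[\[([^]|]*(?:](?!])[^]|]*)*)\|([^]]*(?:](?!])[^]]*)*)]]")

# Hypothetical input: two pipes in one link, a plain link, and a lone "]" before "|".
tricky = "[[a|b|c]] and [[plain]] and [[x]|y]]"

greedy_result = TEMPERED.findall(tricky)
lazy_result = TEMPERED_LAZY.findall(tricky)
unrolled_result = UNROLLED.findall(tricky)

print("greedy:  ", greedy_result)
print("lazy:    ", lazy_result)
print("unrolled:", unrolled_result)
```

On this input the greedy tempered pattern splits [[a|b|c]] at the last |, while the lazy variant and the unrolled pattern both split at the first one. All three skip [[plain]] and accept the single ] in [[x]|y]].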