A regex with a lookbehind does not work with re.match

Question

Los*_*het 4 python regex string lookbehind

The following Python code:

import re

line="http://google.com"
procLine = re.match(r'(?<=http).*', line)
if procLine.group() == "":
    print(line + ": did not match regex")
else:
    print(procLine.group())

fails to match and produces the following error:

Traceback (most recent call last):
  File "C:/Users/myUser/Documents/myScript.py", line 5, in <module>
    if procLine.group() == "":
AttributeError: 'NoneType' object has no attribute 'group'

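When I replace the regex with just .*, it works fine, which suggests the regex itself is at fault. However, when I test my regex and string on https://regex101.com/ with the Python flavor, it appears to match just fine.

Any ideas?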

Answer 1

cs9*_*s95 6

This should work if you convert the lookbehind into a non-capturing group:

In [7]: re.match(r'(?:http://)(.*)', line)
Out[7]: <_sre.SRE_Match object; span=(0, 17), match='http://google.com'>

In [8]: _.group(1)
Out[8]: 'google.com'

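The lookbehind fails because, as Rawing mentioned, re.match only attempts a match at the beginning of the string. At position 0 nothing precedes the cursor, so a lookbehind such as (?<=http) can never succeed there.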
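
To illustrate the point, here is a small sketch comparing re.match and re.search on the pattern from the question. The variable names (pattern, match_result, search_result) are only illustrative and do not come from the original post.

```python
import re

line = "http://google.com"
pattern = r'(?<=http).*'

# re.match only tries position 0; nothing precedes it, so the lookbehind fails.
match_result = re.match(pattern, line)

# re.search tries each position in turn; at index 4 the preceding text is "http".
search_result = re.search(pattern, line)

print(match_result)            # None
print(search_result.group())   # ://google.com
print(search_result.span())    # (4, 17)
```

Note that with the question's pattern the match starts right after "http", so the result still contains "://". That is why the examples in this answer use http:// inside the lookbehind.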

If you insist on using a lookbehind, switch to re.search:

In [10]: re.search(r'(?<=http://).*', line)
Out[10]: <_sre.SRE_Match object; span=(7, 17), match='google.com'>

In [11]: _.group()
Out[11]: 'google.com'
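
Separately, the AttributeError in the question arises because re.match and re.search return None when nothing matches, so calling .group() on the result fails before the comparison with "" ever runs. Below is a minimal sketch of the original script with an explicit None check, assuming the goal is to print whatever follows http://. The helper name strip_scheme and the second test string are hypothetical.

```python
import re

def strip_scheme(line):
    """Return the text after the first 'http://', or None if it is absent."""
    found = re.search(r'(?<=http://).*', line)
    # A failed search returns None, so test the match object itself.
    return found.group() if found is not None else None

for line in ("http://google.com", "ftp://example.org"):
    rest = strip_scheme(line)
    if rest is None:
        print(line + ": did not match regex")
    else:
        print(rest)
```

Checking the match object rather than the matched text keeps the "did not match" branch reachable, which the original comparison against an empty string did not.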
