Python Regex - Finding a string between HTML tags

>>> a = '<b>Bold Stuff</b>'
>>> 
>>> import re
>>> re.findall(r'>(.+?)<', a)
['Bold Stuff']
>>> re.findall(r'>(.*?)<', a)[0] # non-greedy mode
'Bold Stuff'
>>> re.findall(r'>(.+?)<', a)[0] # this is also non-greedy mode
'Bold Stuff'
>>> re.findall(r'>(.*)<', a)[0] # greedy mode
'Bold Stuff'
>>>


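In this case, both greedy and non-greedy mode work.

You're using the first one, non-greedy mode. Here's an example that shows the difference between non-greedy and greedy mode: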

>>> a = '<b>Bold <br> Stuff</b>'
>>> re.findall(r'>(.*?)<', a)[0]
'Bold '
>>> re.findall(r'>(.*)<', a)[0]
'Bold <br> Stuff'
>>>
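
If you want every piece of text rather than just the first one, here is a minimal sketch using the same string as above. The `[^<]+` variant is an alternative I'm adding for comparison, not one of the patterns used above:

```python
import re

a = '<b>Bold <br> Stuff</b>'

# Non-greedy: each match stops at the first '<' that follows the '>'.
non_greedy = re.findall(r'>(.*?)<', a)

# Greedy: one match that runs from the first '>' to the last '<'.
greedy = re.findall(r'>(.*)<', a)

# Alternative (my assumption, not from the patterns above): a negated
# character class that can never run across a '<'.
negated = re.findall(r'>([^<]+)<', a)

print(non_greedy)
print(greedy)
print(negated)
```

Note that `.` does not match a newline unless you pass `re.DOTALL`, while `[^<]` does, so the two can behave differently on multi-line input.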


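Here is what (...) means:

(...)

Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group;

the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below.

To match the literals ( or ), use \( or \), or enclose them inside a character class: [(] [)].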
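
A rough sketch of those three points (retrieving groups, `\number`, and literal parentheses). The sample strings `s` and `call` are made up for illustration:

```python
import re

s = '<b>Bold</b> and <i>italic</i>'

# Group 1 captures the tag name, group 2 the text. \1 must repeat
# whatever group 1 matched, so '<b>' can only be closed by '</b>'.
pattern = r'<(\w+)>(.*?)</\1>'

# With more than one group, findall returns a list of tuples.
pairs = re.findall(pattern, s)

m = re.search(pattern, s)
whole = m.group(0)   # the entire match, tags included
tag = m.group(1)     # contents of the first group
text = m.group(2)    # contents of the second group

# Literal parentheses: escape them, or put them in a character class.
call = 'f(x) + g(y)'
escaped = re.findall(r'\((.*?)\)', call)
in_class = re.findall(r'[(](.*?)[)]', call)

print(pairs, whole, tag, text, escaped, in_class)
```

This is also why `re.findall(r'>(.*?)<', a)` gives back only the text and not the surrounding `>` and `<`: when the pattern contains one group, `findall` returns the contents of that group.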
