Jay*_*sby 3 python regex string
I have a text file that contains lines like these:
<pattern number=1 theme=pseudo>
<pattern number=2 theme=foo>
<pattern number=3 theme=bar>
I'm using this function to pick a random line:
def find_random_pattern(theme):
    found_lines = []
    pattern = open('poempatterns.txt','r')
    for line in pattern:
        found = re.findall("theme="+theme,line)
        for match in found:
            found_lines.append(line)
    selectedline = random.choice(found_lines)
    return selectedline
Suppose it returned <pattern number=1 theme=pseudo>.
When I check the result with this condition, it returns False:
if find_random_pattern("pseudo") == "<pattern number=1 theme=pseudo>":
    return True
else:
    return False
Why don't these two strings match?
You are expecting re.findall to return the whole line, but it returns only the part that matched:
>>> line = "<pattern number=1 theme=pseudo>"
>>> import re
>>> re.findall("theme=pseudo", line)
['theme=pseudo']
I suggest using a pattern that matches the whole line, for example:
>>> re.findall(".*theme=pseudo.*", line)
['<pattern number=1 theme=pseudo>']
Your final code would look like this:
import random
import re

def find_random_pattern(theme):
    found_lines = []
    # Use a context manager so the file is closed after reading
    with open('poempatterns.txt', 'r') as pattern:
        for line in pattern:
            # Call strip() in case the line has trailing spaces or a \n at the end
            found = re.findall(".*theme=%s.*" % theme, line.strip())
            for match in found:
                # Append the stripped match, not the raw line with its newline
                found_lines.append(match)
    selectedline = random.choice(found_lines)
    return selectedline
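To see why the strip() call matters, here is a minimal sketch. It assumes the file is read line by line as in the question, and uses io.StringIO as a stand-in for poempatterns.txt (the sample content is an assumption for illustration). Lines read from a text file keep their trailing newline, so comparing a raw line to a literal without "\n" fails.

```python
import io
import re

# Stand-in for poempatterns.txt; io.StringIO behaves like an open text file.
fake_file = io.StringIO(
    "<pattern number=1 theme=pseudo>\n"
    "<pattern number=2 theme=foo>\n"
)

first_line = next(iter(fake_file))
expected = "<pattern number=1 theme=pseudo>"

# The raw line still carries its newline, so a direct comparison fails.
raw_equal = first_line == expected
# After strip() the comparison succeeds.
stripped_equal = first_line.strip() == expected
# "." does not match "\n" by default, so the whole-line pattern drops it too.
matched = re.findall(".*theme=pseudo.*", first_line)

print(repr(first_line))
print(raw_equal, stripped_equal, matched)
```

Printing repr() of a returned line is a quick way to spot invisible characters like this when two strings that look identical compare as unequal.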
A more concise solution is:
import random
import re

def find_random_pattern(theme):
    # Read all lines and strip surrounding whitespace, including the trailing \n
    with open('poempatterns.txt', 'r') as f:
        stripped_lines = [line.strip() for line in f]
    # re.match anchors at the start, so the leading .* lets theme= appear anywhere
    found_lines = [l for l in stripped_lines if re.match(".*theme=%s.*" % theme, l)]
    return random.choice(found_lines)
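As a further sketch, not part of the original answer, the same idea can be written so that the theme is escaped and matched exactly. The function name find_random_pattern_in, its path parameter, and the sample file contents are hypothetical, introduced only for illustration. It assumes theme values consist of word characters, so theme=foo should not also select theme=foobar.

```python
import os
import random
import re
import tempfile

def find_random_pattern_in(path, theme):
    """Return a random stripped line from `path` whose theme equals `theme`."""
    # re.escape guards against regex metacharacters in theme;
    # (?!\w) keeps theme=foo from also matching theme=foobar.
    regex = re.compile(r"theme=%s(?!\w)" % re.escape(theme))
    with open(path, "r") as f:
        found_lines = [line.strip() for line in f if regex.search(line)]
    return random.choice(found_lines)

# Hypothetical sample data written to a temporary file for the demo
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("<pattern number=1 theme=pseudo>\n"
              "<pattern number=2 theme=foo>\n"
              "<pattern number=3 theme=foobar>\n")
sample_path = tmp.name

result = find_random_pattern_in(sample_path, "pseudo")
foo_result = find_random_pattern_in(sample_path, "foo")
print(result == "<pattern number=1 theme=pseudo>")
os.remove(sample_path)
```

Note that random.choice raises IndexError on an empty list, so a caller may want to handle the case where no line matches the requested theme.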