I'm not comfortable with regular expressions, so I need your help with something that seems tricky to me.
Suppose I have the following string:
string = 'keyword1 keyword2 title:hello title:world "title:quoted" keyword3'
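What regular expression would remove the strings title:hello and title:world from the original string while leaving "title:quoted" in place, since it is enclosed in double quotes?
I have seen this similar answer, and this is what I ended up with: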
import re
string = 'keyword1 keyword2 title:hello title:world "title:quoted" keyword3'
def replace(m):
    if m.group(1) is None:
        return m.group()
    return m.group().replace(m.group(1), "")
regex = r'\"[^\"]title:[^\s]+\"|([^\"]*)'
cleaned_string = re.sub(regex, replace, string)
assert cleaned_string == 'keyword1 keyword2 "title:quoted" keyword3'
Of course, it doesn't work, and I'm not surprised, because regular expressions are esoteric to me.
Thanks for your help!
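For reference, here is a minimal sketch of one way the alternation idea from the attempt above could be made to work. It is not taken from any posted answer. It assumes that quoted chunks never contain escaped double quotes and that a title token runs up to the next whitespace. The function name `strip_unquoted_titles` is made up for illustration.

```python
import re

string = 'keyword1 keyword2 title:hello title:world "title:quoted" keyword3'

# Alternative 1 matches a whole double-quoted chunk, which is kept as-is.
# Alternative 2 captures an unquoted title:xxx token (plus trailing whitespace);
# the lookbehind (?<!\S) requires the token to start at a word boundary,
# so something like "subtitle:foo" is left alone.
PATTERN = re.compile(r'"[^"]*"|(?<!\S)(title:\S+\s*)')

def strip_unquoted_titles(text):
    """Remove title:xxx tokens unless they sit inside double quotes."""
    def replace(m):
        # group(1) is None when the quoted alternative matched: keep the text.
        return m.group() if m.group(1) is None else ""
    return PATTERN.sub(replace, text)

cleaned_string = strip_unquoted_titles(string)
assert cleaned_string == 'keyword1 keyword2 "title:quoted" keyword3'
```

The quoted alternative comes first so that the regex engine consumes a whole quoted chunk before it can ever see the title: inside it. Only the second alternative sets group 1, which is how the callback tells the two cases apart.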
Thanks for your answers. Here is the final solution, which meets my needs:
import re
matches = []
def replace(m):
    matches.append(m.group())
    return ""
string = 'keyword1 keyword2 title:hello title:world "title:quoted" keyword3'
regex …
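The snippet above is cut off, so the actual regex used is unknown. As a rough sketch of the same "collect what was removed" idea, the following uses a hypothetical pattern (an assumption, not the original one) and records each removed token in `matches` while leaving quoted chunks untouched.

```python
import re

string = 'keyword1 keyword2 title:hello title:world "title:quoted" keyword3'
matches = []  # removed title:xxx tokens, in order of appearance

def replace(m):
    # Quoted chunk matched (group 1 unset): keep it unchanged.
    if m.group(1) is None:
        return m.group()
    # Unquoted title token: remember it, then drop it from the string.
    matches.append(m.group(1))
    return ""

# Hypothetical pattern, NOT the truncated regex from the post:
# either a double-quoted chunk, or a title:xxx token at a word start,
# followed by optional whitespace that is removed along with it.
regex = r'"[^"]*"|(?<!\S)(title:\S+)\s*'

cleaned_string = re.sub(regex, replace, string)
assert cleaned_string == 'keyword1 keyword2 "title:quoted" keyword3'
assert matches == ['title:hello', 'title:world']
```

Here the capture group excludes the trailing whitespace, so `matches` holds clean tokens while the whitespace is still removed from the result.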