Python keeps returning a string with a broken character.
python
import re
test = re.sub('handle(.*?)', '<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
print test
What I want:
<verse osisID="lol">a bunch of random text here.</verse>
What I get:
<verse osisID="lol">*broken character*</verse>a bunch of random text here.
You should either escape the \ character or use an r'' raw string:
>>> re.sub('handle(.*?)', r'<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
'<verse osisID="lol"></verse> a bunch of random text here.'
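Without the r'' raw string literal, Python interprets the backslash as the start of an escape code, so '\1' becomes the single control character '\x01' before re.sub ever sees it. You can also double the backslash: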
>>> '\1'
'\x01'
>>> '\\1'
'\\1'
>>> r'\1'
'\\1'
>>> print r'\1'
\1
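
To make the difference concrete, here is a minimal Python 3 sketch (the session above is Python 2) comparing the three ways of spelling the replacement. The variable names are illustrative only, and it uses the greedy pattern 'handle(.*)' so that the group actually captures text. The \g<1> form is the same backreference written so it cannot run into a following digit.

```python
import re

text = 'handle a bunch of random text here.'
pattern = 'handle(.*)'

# Raw string and doubled backslash are the same two characters: '\' and '1'.
raw = r'<verse osisID="lol">\1</verse>'
doubled = '<verse osisID="lol">\\1</verse>'

# \g<1> refers to group 1 as well, without ambiguity if a digit follows.
explicit = r'<verse osisID="lol">\g<1></verse>'

# Plain '\1' is an octal escape: the replacement holds the character U+0001.
unescaped = '<verse osisID="lol">\1</verse>'

result_raw = re.sub(pattern, raw, text)
result_explicit = re.sub(pattern, explicit, text)
result_broken = re.sub(pattern, unescaped, text)

print(raw == doubled)          # True
print(repr(result_raw))
print(repr(result_broken))     # contains '\x01' instead of the captured text
```

The last result reproduces the "broken character" from the question: the control character is inserted literally and the captured group is never used.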
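Note that you are only replacing the word handle there: the .*? pattern is non-greedy and is allowed to match zero characters, and since nothing follows it in the pattern, that is all it matches. Remove the question mark and it will match your expected output (apart from the leading space, which is captured too):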
>>> re.sub('handle(.*)', r'<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
'<verse osisID="lol"> a bunch of random text here.</verse>'
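
The lazy and greedy behaviours can be compared side by side with the short Python 3 sketch below. The third pattern, with \s* before the group, is not part of the answer above; it is an assumption about how one might drop the leading space so the result matches the desired output exactly.

```python
import re

text = 'handle a bunch of random text here.'
repl = r'<verse osisID="lol">\1</verse>'

# Lazy: (.*?) stops as early as possible, so group 1 is empty.
lazy = re.sub(r'handle(.*?)', repl, text)

# Greedy: (.*) runs to the end of the line, leading space included.
greedy = re.sub(r'handle(.*)', repl, text)

# Assumed variant: consume whitespace outside the group before capturing.
trimmed = re.sub(r'handle\s*(.*)', repl, text)

print(lazy)     # <verse osisID="lol"></verse> a bunch of random text here.
print(greedy)   # <verse osisID="lol"> a bunch of random text here.</verse>
print(trimmed)  # <verse osisID="lol">a bunch of random text here.</verse>
```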