Using decode() vs. regex to unescape this string

Question


Ben*_*Ben 7 python regex string decode escaping

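I have the following string, and I'm trying to work out the best practice for unescaping it.

The solution has to be somewhat flexible, because I receive this input from an API and can't be sure that the current character structure (\n as opposed to \r) will always be the same.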

'"If it ain\'t broke, don\'t fix it." \nWent in for a detailed car wash.\nThe attendants raved-up my engine when taking the car into the tunnel. NOTE: my car is...'

This regex seems like it should work:

text_excerpt = re.sub(r'[\s"\\]', ' ', raw_text_excerpt).strip()
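To see what this pattern does, here is a minimal sketch that runs it on two shortened variants of the sample text: one containing a real newline character, and one containing a literal backslash followed by `n`. Which of the two the API actually returns is not stated in the question, so both inputs are assumptions, and `strip_with_regex` is a hypothetical helper name.

```python
import re

def strip_with_regex(raw_text):
    # The question's pattern: replace every whitespace character,
    # double quote, or backslash with a single space, then trim the ends.
    return re.sub(r'[\s"\\]', ' ', raw_text).strip()

# Assumed input 1: the text contains a real newline character.
real_newline = 'fix it." \nWent in'

# Assumed input 2: the text contains a literal backslash followed by "n".
literal_escape = r'fix it." \nWent in'

print(repr(strip_with_regex(real_newline)))
print(repr(strip_with_regex(literal_escape)))
```

In the second case the backslash is replaced but the `n` stays behind, so the regex only gives a clean result when the escape sequences have already been turned into real characters.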


I've read that decode() might be useful here (and would generally be the better solution).

raw_text_excerpt.decode('string_unescape')


I tried something along these lines, but it didn't work. Any suggestions? Is regex the best approach here?

Answer 1

jav*_*ard 16

The codec you're looking for is string-escape:

>>> print "\\'".decode("string-escape")
'


I'm not sure which version it was added in, so the older version you're using may not have it. I'm running:

Python 2.6.6 (r266:84292, Mar 25 2011, 19:36:32) 
[GCC 4.5.2] on linux2
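The string-escape codec exists only in Python 2. For readers on Python 3, a rough equivalent can be sketched with the unicode_escape codec. This is not part of the original answer: `unescape` is a hypothetical helper name, and the round trip through latin-1 is an assumption made so that non-ASCII characters survive the decode step.

```python
def unescape(raw_text):
    # Hypothetical helper, not from the original answer.
    # Encode to bytes first: characters outside latin-1 become \uXXXX
    # escapes, so the unicode_escape decoder can restore them afterwards.
    as_bytes = raw_text.encode('latin-1', 'backslashreplace')
    # Interpret backslash escapes such as \' and \n in the byte string.
    return as_bytes.decode('unicode_escape')

sample = r'"If it ain\'t broke, don\'t fix it." \nWent in for a detailed car wash.'
print(unescape(sample))
```

Input that ends in a lone backslash is not a valid escape sequence and makes the decoder raise an error, so untrusted API input may need a try/except around the call.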

