How do I remove a backslash and the word attached to it in Python?

Question

I know that to remove a single backslash we could do something like what is described in "Removing backslashes from a string in Python".

I would like to know how to remove every entry like "\ue606" from the list below:

A = ['Historical Notes 1996',
     '\ue606',
     'The Future of farms 2012',
     '\ch889',
     '\8uuuu']

and turn it into:

['Historical Notes 1996',
'The Future of farms 2012',]

I tried:

A = ['Historical Notes 1996',
'\ue606',
'The Future of farms 2012',
'\ch889',
'\8uuuu',]

for y in A:
    y.replace("\\", "")
A

It returns:

['Historical Notes 1996',
 '\ue606',
 'The Future of farms 2012',
 '\\ch889',
 '\\8uuuu']

I am not sure how to handle the text that follows the '\', or why a new '\' was added instead of the existing one being removed.

Answer 1

mcs*_*ini 5

It is hard to convince Python to ignore Unicode characters. Here is a somewhat hacky attempt:

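# '\c' and '\8' are not recognised escapes, so Python keeps the backslash
# (newer Python versions emit a warning for such literals).
l = ['Historical Notes 1996',
     '\ue606',
     'The Future of farms 2012',
     '\ch889',
     '\8uuuu']


def not_unicode_or_backslash(x):
    # Spell out non-ASCII characters and backslashes as escape sequences, so
    # the single character '\ue606' becomes the six characters \ue606.
    escaped = x.encode('unicode-escape').decode('ascii')
    return not escaped.startswith("\\")


[x for x in l if not_unicode_or_backslash(x)]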

# Output: ['Historical Notes 1996', 'The Future of farms 2012']
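
To make the round trip above more concrete, here is a minimal sketch that prints what `encode('unicode-escape')` produces for a few entries. The helper name `escaped_view` and the `samples` list are made up for illustration and are not part of the answer.

```python
def escaped_view(text):
    """Return text with non-ASCII characters and backslashes written as escapes."""
    return text.encode('unicode-escape').decode('ascii')


# Hypothetical sample data; the backslash in the last entry is written
# as '\\' so the literal is unambiguous.
samples = ['Historical Notes 1996', '\ue606', '\\ch889']

for s in samples:
    view = escaped_view(s)
    # Show the length before and after, and whether the escaped form
    # now begins with a backslash.
    print(repr(s), len(s), '->', len(view), view.startswith('\\'))
```

One caveat worth keeping in mind: the same round trip also turns any other non-ASCII first character (for example 'é') into an escape that starts with a backslash, so such entries would be filtered out as well.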

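The problem is that you cannot simply check whether the string starts with a backslash, because '\ue606' is not treated as a six-character string but as a single Unicode character. It therefore does not start with a backslash, and for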

[x for x in l if not x.startswith("\\")]

you get

['Historical Notes 1996', '\ue606', 'The Future of farms 2012']
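
The short sketch below illustrates the two points of confusion in the question, under the assumption that the entries are ordinary Python 3 `str` objects. The variable names are illustrative only. It shows that '\ue606' is one character, that the doubled backslash in the displayed list is only how `repr` writes a single backslash, and that `str.replace` returns a new string rather than changing the original.

```python
# '\ue606' is a recognised escape: the literal denotes ONE character (U+E606).
pua = '\ue606'
print(len(pua), hex(ord(pua)))    # 1 0xe606
print(pua.startswith('\\'))       # False

# A real backslash followed by text, written here with an escaped backslash.
literal = '\\ch889'
print(len(literal))               # 6
print(literal.startswith('\\'))   # True
print(repr(literal))              # repr shows the single backslash doubled

# str.replace returns a new string; `literal` itself is left untouched.
stripped = literal.replace('\\', '')
print(stripped)                   # ch889
print(literal == '\\ch889')       # True
```

This is why the loop in the question appears to do nothing: the result of `y.replace(...)` is discarded, and the extra backslashes in the output come from how the list is displayed, not from the data.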

@KatieMelosto By definition, Python 3 strings are *always* Unicode. (2 upvotes)
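
Building on that comment: since every Python 3 string is already Unicode, one possible alternative to the escape round trip is to inspect the characters directly. The sketch below is not the answer's method. It assumes the unwanted entries either contain a literal backslash or contain a private-use character (Unicode category `Co`, which is where U+E606 falls). The names `is_clean`, `items`, and `kept` are hypothetical.

```python
import unicodedata


def is_clean(text):
    """Return False if text has a backslash or a private-use character."""
    for ch in text:
        # 'Co' is the Unicode general category for private-use code points.
        if ch == '\\' or unicodedata.category(ch) == 'Co':
            return False
    return True


# Same data as in the question, with backslashes written explicitly as '\\'.
items = ['Historical Notes 1996',
         '\ue606',
         'The Future of farms 2012',
         '\\ch889',
         '\\8uuuu']

kept = [s for s in items if is_clean(s)]
print(kept)  # ['Historical Notes 1996', 'The Future of farms 2012']
```

Unlike the `startswith` check, this variant rejects a backslash anywhere in the entry, and it keeps entries that begin with ordinary non-ASCII letters. Whether that is the desired behaviour depends on the data.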
