Posts by olf*_*udi

Cleaning text with Python and re

I need to clean some text, as shown in the code below:

import re
def clean_text(text):
    text = text.lower()
    # replacement rules
    text = re.sub(r"i'm","i am",text)
    text = re.sub(r"she's","she is",text)
    text = re.sub(r"can't","cannot",text)
    text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]","",text)
    return text

clean_questions= []
for question in questions: 
    clean_questions.append(clean_text(question))

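This code should give me a cleaned list of the questions, but clean_questions comes out empty. I reopened Spyder and the list was populated but not cleaned; when I reopened it again, the list was empty. The console shows this error: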

In [10] :clean_questions= [] 
   ...: for question in questions: 
   ...: clean_questions.append(clean_text(question))
Traceback (most recent call last):

  File "<ipython-input-6-d1c7ac95a43f>", line 3, in <module>
    clean_questions.append(clean_text(question))

  File "<ipython-input-5-8f5da8f003ac>", line 16, in clean_text
    text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]","",text)

  File "C:\Users\hp\Anaconda3\lib\re.py", line 192, in sub
    return …
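The traceback above is cut off before the actual error message, so the following is only a likely reading, not a confirmed diagnosis. Inside a character class, an unescaped hyphen between two characters is read as a range, and in `{}-=` the range runs from `}` (code point 125) down to `=` (code point 61), which `re` rejects as a bad character range. A minimal sketch of one possible fix, assuming the intent was to strip a literal hyphen: leave the hyphen only at the start of the class, where it is literal. The names `BROKEN`, `FIXED`, `compiles`, and the sample `questions` list are hypothetical, added for illustration.

```python
import re

# Pattern from the question: "}-=" is parsed as a reversed range.
BROKEN = r"[-()\"#/@;:<>{}-=~|.?,]"
# Assumed fix: the leading "-" is already literal, so drop the second hyphen.
FIXED = r"[-()\"#/@;:<>{}=~|.?,]"


def compiles(pattern):
    """Return True if `pattern` is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def clean_text(text):
    text = text.lower()
    # Expand a few contractions before stripping punctuation.
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"can't", "cannot", text)
    # Remove the listed punctuation characters.
    text = re.sub(FIXED, "", text)
    return text


# Hypothetical sample data; the original `questions` list is not shown.
questions = ["I'm here - can't you see?", "She's at <home>; right?"]
clean_questions = [clean_text(q) for q in questions]
```

Because the exception is raised on the first call to `clean_text`, `clean_questions` would stay empty in the original loop, which matches the symptom described. Escaping the hyphen as `\-` in its original position would be an equivalent alternative.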

python regex character-class python-3.x

