I need to clean up some text, as shown in the code below:
import re

def clean_text(text):
    text = text.lower()
    # replacement rules
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"can't", "cannot", text)
    text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]", "", text)
    return text

clean_questions = []
for question in questions:
    clean_questions.append(clean_text(question))
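This code should give me a cleaned list of questions, but clean_questions comes back empty. I reopened Spyder and the list was populated but not cleaned; when I reopened it again, the list was empty once more. The console shows this error: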
In [10]: clean_questions= []
    ...: for question in questions:
    ...:     clean_questions.append(clean_text(question))
Traceback (most recent call last):
  File "<ipython-input-6-d1c7ac95a43f>", line 3, in <module>
    clean_questions.append(clean_text(question))
  File "<ipython-input-5-8f5da8f003ac>", line 16, in clean_text
    text = re.sub(r"[-()\"#/@;:<>{}-=~|.?,]","",text)
  File "C:\Users\hp\Anaconda3\lib\re.py", line 192, in sub
    return …
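
The traceback is cut off before the final error message, so what follows is a guess at the cause, not a confirmed diagnosis. The character class in the last re.sub call contains "}-=", which Python's re module reads as a range from "}" to "=". Because "}" has a higher code point than "=", that range is invalid and compiling the pattern raises re.error. This would also explain the empty list: clean_questions is reset to [] and the exception is raised on the first call, before anything is appended. The sketch below assumes the second hyphen was meant to be a literal character (the leading hyphen already covers that) and uses a made-up questions list for illustration.

```python
import re

# Character class from the question: "}-=" is parsed as a range "}" to "=".
BROKEN = r"[-()\"#/@;:<>{}-=~|.?,]"
# Assumed intent: "-" is a literal. A hyphen placed first in the class is
# already literal, so the second, stray hyphen is simply dropped.
FIXED = r"[-()\"#/@;:<>{}=~|.?,]"


def pattern_error(pattern):
    """Return the re.error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


def clean_text(text):
    text = text.lower()
    # Expand a few contractions, then strip punctuation.
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"can't", "cannot", text)
    return re.sub(FIXED, "", text)


# Hypothetical sample data; the real questions list is not shown in the post.
questions = ["I'm here (now)?", "She's a-b; can't {x}=y."]
clean_questions = [clean_text(question) for question in questions]

print(pattern_error(BROKEN))  # an error message, since the range is invalid
print(pattern_error(FIXED))   # None
print(clean_questions)
```

Escaping the hyphen as "\-" inside the class should work equally well if keeping it in its original position is preferred.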