Key*_*son 4 python file-io stripping
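I am trying to read in a text file, split it into individual words, and collect them in a list.
The problem I'm running into is removing the commas and periods from the text.
My code is below: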
#Process a '*.txt' file.
def Process():
    name = input("What is the name of the file you would like to read from? ")
    file = open( name , "r" )
    text = [word for line in file for word in line.lower().split()]
    word = word.replace(",", "")
    word = word.replace(".", "")
    print(text)
The output I currently get is:
['this', 'is', 'the', 'first', 'line', 'of', 'the', 'file.', 'this', 'is', 'the', 'second', 'line.']
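As you can see, the words "file" and "line" still have a period at the end.
The text file I'm reading from is:
This is the first line of the file.
This is the second line.
Thanks in advance.
These lines have no effect: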
word = word.replace(",", "")
word = word.replace(".", "")
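To illustrate why, here is a minimal sketch (the `words` list is a made-up stand-in for `text`). Strings are immutable, so `str.replace` returns a new string, and assigning that result to a separate name does not touch what is already stored in the list:

```python
words = ["file.", "line."]

# str.replace returns a new string; it never modifies the original in place.
word = words[0]
word = word.replace(".", "")  # rebinds the name `word` only

print(word)   # file
print(words)  # ['file.', 'line.']
```

The list has to be built from the cleaned strings in the first place, or rebuilt afterwards.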
Just change the list comprehension to:
[word.replace(",", "").replace(".", "")
for line in file for word in line.lower().split()]
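As a quick check, the comprehension can be tried without a file on disk. This is only a sketch: `io.StringIO` stands in for the opened file, and the sample text is assumed to be the two lines from the question.

```python
import io

# Stand-in for the opened text file (assumption: same two lines as in the question).
sample_file = io.StringIO(
    "This is the first line of the file.\n"
    "This is the second line.\n"
)

# Clean each word while the list is being built.
text = [word.replace(",", "").replace(".", "")
        for line in sample_file for word in line.lower().split()]

print(text)
```

With that input, "file" and "line" come out without the trailing period.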
Perhaps strip is more appropriate than replace:
def Process():
    name = input("What is the name of the file you would like to read from? ")
    file = open(name, "r")
    text = [word.strip(",.") for line in file for word in line.lower().split()]
    print(text)
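>>> help(str.strip)
Help on method_descriptor:

strip(...)
    S.strip([chars]) -> string or unicode

    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping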
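The two approaches are not interchangeable, which may matter depending on the text. The sketch below compares them on a few made-up sample words; the helper names `strip_punct` and `replace_punct` are hypothetical and exist only for this illustration.

```python
def strip_punct(word):
    # Removes commas and periods from the two ends of the word only.
    return word.strip(",.")


def replace_punct(word):
    # Removes every comma and period, wherever it appears.
    return word.replace(",", "").replace(".", "")


for sample in ["file.", "...wait", "e.g.", "3.14,"]:
    print(sample, "->", strip_punct(sample), "|", replace_punct(sample))
```

For words like "file." both give the same result, but strip leaves interior punctuation alone ("3.14," becomes "3.14"), while replace removes it everywhere ("3.14," becomes "314").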