I have a .text file in the following format:
712 ben Battle of the Books
713 james i used to be in TOM
714 tomy i was in BOB once
715 ben Tournaments of Minds
716 tommy Also the Lion in the upcoming school play
717 tommy Can you guess
718 tommy P
...
The index number, name, and message are separated by \t. I use read_csv to read the file and store it as a DataFrame:
import pandas as pd
chat = pd.read_csv("f.text", sep="\t", header=None, usecols=[2])
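However, the DataFrame has only 9812 rows, while the original file has 12428 lines (only 21 lines). This is strange. Do you have any idea why? Thanks.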
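I think you need to add the quoting parameter. By default, read_csv treats a double quote as a quote character, so a message field that starts with an unmatched double quote can absorb the following lines into a single field until the next quote appears, which would likely explain the missing rows: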
import csv
import pandas as pd
# QUOTE_NONE makes the parser treat double quotes as ordinary characters
chat = pd.read_csv("f.text", sep="\t", header=None, usecols=[2], quoting=csv.QUOTE_NONE)
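To see why the quoting setting can change the row count, here is a minimal sketch that reproduces the effect on a small in-memory sample. The sample text is hypothetical (it is not taken from the actual f.text); it assumes the real file contains messages that begin with a stray double quote, which is a guess about the cause and not something the question confirms.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: five tab-separated lines. The message on the second
# line opens with a double quote, and the next quote only appears at the
# end of the fourth line.
sample = (
    '712\tben\tBattle of the Books\n'
    '713\tjames\t"i used to be in TOM\n'
    '714\ttomy\ti was in BOB once\n'
    '715\tben\tTournaments of Minds"\n'
    '716\ttommy\tCan you guess\n'
)

# Default quoting: the parser treats the text between the two quotes,
# including the line breaks, as one field, so several lines become one row.
default = pd.read_csv(io.StringIO(sample), sep="\t", header=None, usecols=[2])

# QUOTE_NONE: double quotes are ordinary characters, one row per line.
no_quotes = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    header=None,
    usecols=[2],
    quoting=csv.QUOTE_NONE,
)

print("rows with default quoting:", len(default))
print("rows with QUOTE_NONE:", len(no_quotes))
```

If the two counts differ on the real file as well, stray quotes in the message column are the likely culprit. Note that with QUOTE_NONE the quote characters stay in the message text.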