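I am trying to load a large data file (about 200,000 rows) using fread() from the data.table package. However, some rows cause a lot of trouble.
A minimal example: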
text.csv contains:
id, text
1,"""Oops"",\""The"",""Georgia"""
fread("text.csv", sep=",")
Error in fread("text.csv", sep = ",") :
Not positioned correctly after testing format of header row. ch=','
In addition: Warning message:
In fread("text.csv", sep = ",") :
Starting data input on line 2 and discarding line 1 because it has too few or too many items to be column names or data: id, text
read.table() works better, but it is too slow and too memory-inefficient.
> read.table("text.csv", header = TRUE, sep=",")
id text
1 1 "Oops",\\"The","Georgia" …
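To make the minimal example easier to reproduce, here is a rough sketch that recreates text.csv from within R and then attempts the fread() call. It assumes the working directory is writable. The `csv_lines` and `result` names, the `requireNamespace()` guard, and the `tryCatch()` wrapper are additions for illustration only and are not part of the original session. Whether fread() still fails this way may depend on the installed data.table version.

```r
# Recreate the two-line example file exactly as shown above.
# Single-quoted R strings pass the double quotes through untouched;
# "\\" in the source becomes one literal backslash in the file.
csv_lines <- c(
  'id, text',
  '1,"""Oops"",\\""The"",""Georgia"""'
)
writeLines(csv_lines, "text.csv")

# Confirm the file on disk matches the intended content before parsing it.
stopifnot(identical(readLines("text.csv"), csv_lines))

# Attempt fread() only if data.table is installed (assumption: it may not be).
# An error, if one occurs, is captured as a message instead of stopping the script.
if (requireNamespace("data.table", quietly = TRUE)) {
  result <- tryCatch(
    data.table::fread("text.csv", sep = ","),
    error = function(e) conditionMessage(e)
  )
  print(result)
}
```

The interesting part of the row is the `\""The""` sequence: a backslash followed by a doubled quote inside an already quoted field, which is where the two readers appear to disagree.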