Yen*_*ici 5 r fread data.table
Windows 8.1, R version 3.1.1 (2014-07-10), platform x86_64, mingw32
I have a file with a large number of observations (here). These are some lines from the file:
Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
28/4/2007;00:20:00;0.492;0.208;236.240;2.200;0.000;0.000;0.000
28/4/2007;00:21:00;?;?;?;?;?;?;
21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000
21/12/2006;11:26:00;0.246;0.000;241.830;1.000;0.000;0.000;0.000
NA values are represented by "?". I am trying to read the file with:
epcData <- fread(dataFile,
                 sep = ";",
                 header = TRUE,
                 na.strings = "?",
                 colClasses = c("character", "character", rep("numeric", 7)),
                 stringsAsFactors = FALSE)
I get warnings like:
Bumped column 3 to type character on data row 10, field contains '?'. Coercing previously read values in this column from integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
Row 10 is
28/4/2007;00:21:00;?;?;?;?;?;?;
epcData[10]
prints
Date Time Global_active_power Global_reactive_power Voltage
1: 28/4/2007 00:21:00 NA NA NA
Global_intensity Sub_metering_1 Sub_metering_2 Sub_metering_3
1: NA NA NA NA
However, every column has mode "character", even columns 3:9, despite colClasses = c("character", "character", rep("numeric", 7)).
What went wrong?
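For reference, here is a minimal, self-contained sketch of how the column classes can be checked, together with one possible fallback that sidesteps type detection altogether: read every column as character, then convert columns 3:9 by hand. The temporary file `tmp`, the table `dt`, and the vector `numCols` are hypothetical names used only for illustration; the file holds a few of the sample rows shown above, not the real data. This is a sketch under those assumptions, not an explanation of why fread behaves as it does here.

```r
library(data.table)

# Hypothetical stand-in for the real data file: a few of the sample rows above.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3",
  "16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000",
  "28/4/2007;00:21:00;?;?;?;?;?;?;",
  "21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000"
), tmp)

# Read every column as character so that no column type has to be bumped.
dt <- fread(tmp, sep = ";", header = TRUE, colClasses = "character")

# Convert the measurement columns by hand: "?" and "" become NA, the rest numeric.
numCols <- names(dt)[3:9]
dt[, (numCols) := lapply(.SD, function(x) {
  x[x %in% c("?", "")] <- NA_character_
  as.numeric(x)
}), .SDcols = numCols]

# Inspect the resulting class of each column.
sapply(dt, class)
```

With this approach the first two columns stay character and columns 3:9 end up numeric, with NA wherever the file had "?" or an empty field.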