Situation
I have a data frame df with two variables, ReportYear and Salary.
dput(df)
structure(list(ReportYear = structure(c(2012, 2012, 2012, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012,
2012, 2012, 2012, 2012, 2012), class = c("summaryDefault", "table"
)), Salary = structure(c(198000, 495500, 745000, 1417000, 1662000,
5483000, 260100, 460000, 697000, 1595000, 2160000, 5778000, 331000,
790000, 1260000, 1736000, 1670000, 9310000, 270000, 459500, 602000,
1355000, 984200, 6191000, 290000, 463200, 564500, 1420000, 779500,
6779000, 650300, 1448000, 2076000, 2907000, 3894000, 6938000,
157000, 404800, 481000, 1074000, 1199000, 4603000), class = c("summaryDefault",
"table"))), row.names = c(NA, -42L), class = "data.frame", .Names = c("ReportYear",
"Salary"))
I am trying to filter the data, but I get an error:
library(dplyr)
df <- filter(df, Salary > 10)
Error: column 'ReportYear' has unsupported type
Question
Does anyone know why my ReportYear column has the wrong type? Is it related to the "list" structure, and if so, how can I fix it so that I can filter the data?
Additional notes
> str(df)
'data.frame': 42 obs. of 2 variables:
$ ReportYear:Classes 'summaryDefault', 'table' num [1:42] 2012 2012 2012 2012 2012 ...
$ Salary :Classes 'summaryDefault', 'table' num [1:42] 198000 495500 745000 1417000 1662000 ...
>
The data was generated by summary.
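The type seems to be wrong because the columns come from summary.default. See the "Value" section of the help file for summary():
The default method returns an object of class c("summaryDefault", "table"), which has a specialized print method.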
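To see where the class comes from, here is a minimal sketch. The vector `salaries` is made up for illustration and is not the asker's original input, which the question does not show.

```r
# Hypothetical input; the question does not show the original data.
salaries <- c(198000, 495500, 745000, 1417000)

# summary() on a numeric vector dispatches to summary.default.
s <- summary(salaries)

# Per the help file quoted above, this should include "summaryDefault".
class(s)

# unclass() drops the class attribute and leaves a plain numeric vector.
plain <- unclass(s)
class(plain)
```

Storing `plain` rather than `s` in a data frame column should avoid carrying the extra classes along.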
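Ideally you would find out how the data was created in the first place, but you can strip these classes with unclass, after which your code works.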
df[] <- lapply(df, unclass)
filter(df, Salary > 10)
I am not sure whether this is the standard, expected behavior.
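The unclass fix can be checked on a smaller data frame without installing anything. This is only a sketch: `demo` is a three-row stand-in built the same way as the dput output in the question, and base R subsetting is used in place of dplyr's filter as an assumption-free check.

```r
# Small stand-in for the question's data frame (values from its first rows).
cls <- c("summaryDefault", "table")
demo <- structure(
  list(
    ReportYear = structure(c(2012, 2012, 2012), class = cls),
    Salary     = structure(c(198000, 495500, 745000), class = cls)
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)

# Column classes before the fix: both carry "summaryDefault" and "table".
classes_before <- lapply(demo, class)

# Strip the extra classes; assigning into demo[] keeps the data frame itself.
demo[] <- lapply(demo, unclass)
classes_after <- vapply(demo, class, character(1))

# Plain numeric columns can now be filtered; base R subsetting shown here.
high <- demo[demo$Salary > 400000, ]
nrow(high)
```

With plain numeric columns, `filter(demo, Salary > 400000)` from dplyr should select the same rows.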