Han*_*nah 4 aggregate r dataframe
I have a large data frame like this:
ID c_Al c_D c_Hy occ
A 0 0 0 2306
B 0 0 0 3031
C 0 0 1 2581
D 0 0 1 1917
E 0 0 1 2708
F 0 1 0 2751
G 0 1 0 1522
H 0 1 0 657
I 0 1 1 469
J 0 1 1 2629
L 1 0 0 793
L 1 0 0 793
M 1 0 0 564
N 1 0 1 2617
O 1 0 1 1167
P 1 0 1 389
Q 1 0 1 294
R 1 1 0 1686
S 1 1 0 992
How can I get the mean of occ for each value (0 and 1) of every column, like this?
0 1
c_Al 1506.2 1641.2
c_D 748.6 1467.5
c_Hy 1506.2 1641.2
I tried aggregate(occ~c_Al, mean, data=table2), but that has to be repeated for every column; ddply gives the same result. I also tried for(i in 1:dim(table2)[1]){ aggregate(occ~[,i], mean, data=table2)}, but it doesn't work.
A5C*_*2T1 10
I would use melt and dcast from "reshape2":
library(reshape2)
dfL <- melt(table2, id.vars = c("ID", "occ"))
dcast(dfL, variable ~ value, value.var = "occ", fun.aggregate = mean)
#   variable        0        1
# 1     c_Al 2057.100 1032.778
# 2      c_D 1596.667 1529.429
# 3     c_Hy 1509.500 1641.222
Of course, base R can handle this too. Here, I use tapply and vapply:
vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L))
#       c_Al      c_D     c_Hy
# 0 2057.100 1596.667 1509.500
# 1 1032.778 1529.429 1641.222
t(vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L)))
#             0        1
# c_Al 2057.100 1032.778
# c_D  1596.667 1529.429
# c_Hy 1509.500 1641.222
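The loop from the question can also be made to work. The formula occ~[,i] is not valid R, and 1:dim(table2)[1] iterates over rows rather than columns. Below is a minimal base R sketch, assuming the data frame is rebuilt from the rows shown in the question; flag_cols and means are names introduced here for illustration only. It builds each formula from the column name with reformulate and collects the group means into one matrix.

```r
# Rebuild the question's data frame (values copied from the question as posted,
# including the two identical "L" rows).
table2 <- data.frame(
  ID   = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J",
           "L", "L", "M", "N", "O", "P", "Q", "R", "S"),
  c_Al = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  c_D  = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1),
  c_Hy = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0),
  occ  = c(2306, 3031, 2581, 1917, 2708, 2751, 1522, 657, 469, 2629,
           793, 793, 564, 2617, 1167, 389, 294, 1686, 992)
)

# Hypothetical helper name: the indicator columns to group by.
flag_cols <- c("c_Al", "c_D", "c_Hy")

# For each column, build the formula occ ~ <column> and aggregate once.
means <- t(sapply(flag_cols, function(col) {
  res <- aggregate(reformulate(col, response = "occ"), data = table2, FUN = mean)
  # Name each mean by its group value ("0" or "1").
  setNames(res$occ, res[[col]])
}))

print(round(means, 3))
```

This should give the same 3 x 2 table as the tapply/vapply version above; the vapply form is shorter, while this one stays close to the original aggregate call.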