How can I get the means for each column?

Han*_*nah 4 aggregate r dataframe

I have a large data frame like this:

ID  c_Al   c_D    c_Hy      occ
A     0     0      0        2306
B     0     0      0        3031
C     0     0      1        2581
D     0     0      1        1917
E     0     0      1        2708
F     0     1      0        2751
G     0     1      0        1522
H     0     1      0        657
I     0     1      1        469
J     0     1      1        2629
L     1     0      0        793
L     1     0      0        793
M     1     0      0        564
N     1     0      1        2617
O     1     0      1        1167
P     1     0      1        389
Q     1     0      1        294
R     1     1      0        1686
S     1     1      0        992


How can I get the mean of occ for each value (0 and 1) of every column, laid out like this?

               0        1
    c_Al    1506.2  1641.2
    c_D     748.6   1467.5
    c_Hy    1506.2  1641.2


I tried aggregate(occ~c_Al, mean, data=table2), but I would have to repeat that for every column; ddply has the same problem. I also tried for(i in 1:dim(table2)[1]){ aggregate(occ~[,i], mean, data=table2)}, but it does not work.

I would use melt and dcast from "reshape2":

library(reshape2)
dfL <- melt(table2, id.vars = c("ID", "occ"))
dcast(dfL, variable ~ value, value.var = "occ", fun.aggregate = mean)
#   variable        0        1
# 1     c_Al 2057.100 1032.778
# 2      c_D 1596.667 1529.429
# 3     c_Hy 1509.500 1641.222


Of course, base R can handle this too.

Here, I use tapply and vapply:

vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L))
#       c_Al      c_D     c_Hy
# 0 2057.100 1596.667 1509.500
# 1 1032.778 1529.429 1641.222
t(vapply(table2[2:4], function(x) tapply(table2$occ, x, mean), numeric(2L)))
#             0        1
# c_Al 2057.100 1032.778
# c_D  1596.667 1529.429
# c_Hy 1509.500 1641.222
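
The aggregate loop attempted in the question can probably be made to work as well. The formula occ~[,i] is not valid R, and dim(table2)[1] counts rows rather than columns. Below is a minimal sketch, assuming table2 holds exactly the rows shown in the question (rebuilt here so the snippet runs on its own). The names vars and res are only illustrative. It builds each formula with reformulate() and stacks the per-column results:

```r
# Rebuild the example data from the question (values copied from the table above)
table2 <- data.frame(
  ID   = c("A","B","C","D","E","F","G","H","I","J",
           "L","L","M","N","O","P","Q","R","S"),
  c_Al = c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1),
  c_D  = c(0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1),
  c_Hy = c(0,0,1,1,1,0,0,0,1,1,0,0,0,1,1,1,1,0,0),
  occ  = c(2306,3031,2581,1917,2708,2751,1522,657,469,2629,
           793,793,564,2617,1167,389,294,1686,992)
)

# Grouping columns to loop over (illustrative name)
vars <- c("c_Al", "c_D", "c_Hy")

# For each grouping column, build the formula occ ~ <column> and aggregate
res <- t(sapply(vars, function(v) {
  agg <- aggregate(reformulate(v, response = "occ"), data = table2, FUN = mean)
  setNames(agg$occ, agg[[v]])  # name each mean by its group value ("0" / "1")
}))
res
```

The result is a 3 x 2 matrix with one row per grouping column and one column per group value, so it should agree with the tapply/vapply output above.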

