Is there a way to improve this, or to do it more simply?
means.by<-function(data,INDEX){
b<-by(data,INDEX,function(d)apply(d,2,mean))
return(structure(
t(matrix(unlist(b),nrow=length(b[[1]]))),
dimnames=list(names(b),col.names=names(b[[1]]))
))
}
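The idea is the same as the SAS MEANS BY statement. The function 'means.by' takes a data.frame and an indexing variable, computes the mean of each column of the data.frame over each set of rows that share a unique value of INDEX, and returns a new data frame whose row names are the unique values of INDEX.
I'm sure there must be a better way to do this in R, but I can't think of one.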
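To make the behaviour described above concrete, here is a small sketch that calls the question's function on a toy input. The data frame `toy` and the grouping vector `g` are made up for illustration and are not part of the original question; the function body is reproduced unchanged apart from spacing.

```r
# The question's function, reproduced unchanged apart from spacing
means.by <- function(data, INDEX) {
  b <- by(data, INDEX, function(d) apply(d, 2, mean))
  return(structure(
    t(matrix(unlist(b), nrow = length(b[[1]]))),
    dimnames = list(names(b), col.names = names(b[[1]]))
  ))
}

# Hypothetical toy data: two numeric columns and a grouping vector
toy <- data.frame(x = c(1, 3, 5, 7), y = c(10, 20, 30, 40))
g <- c("a", "a", "b", "b")

# One row per unique value of g, one column per column of toy
res <- means.by(toy, g)
print(res)
```

Note that every column of `data` has to be numeric here, because `apply` first converts the data frame to a matrix.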
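You want tapply or ave, depending on the output you want: tapply returns one value per group, while ave returns a vector as long as the input in which each element is replaced by the mean of its group.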
> Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20))
> ave(Data$x, Data$grp)
[1] -0.3258590 -0.5009832 -0.5009832 -0.2136670 -0.3258590 -0.5009832
[7] -0.3258590 -0.2136670 -0.3258590 -0.2136670 -0.3258590 -0.3258590
[13] -0.3258590 -0.5009832 -0.2136670 -0.5009832 -0.3258590 -0.2136670
[19] -0.5009832 -0.2136670
> tapply(Data$x, Data$grp, mean)
a b c
-0.5009832 -0.2136670 -0.3258590
# Example with more than one column:
> Data <- data.frame(grp=sample(letters[1:3],20,TRUE),x=rnorm(20),y=runif(20))
# colMeans is used because mean() on a data frame returns NA in current versions of R
> do.call(rbind,lapply(split(Data[,-1], Data[,1]), colMeans))
x y
a -0.675195494 0.4772696
b 0.270891403 0.5091359
c 0.002756666 0.4053922
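As a further sketch that is not part of the original answer, the multi-column case can also be written with base R's aggregate, which returns a data frame with the group as an ordinary column. The data below is made up, and the names `agg` and `mat` are only illustrative.

```r
# Hypothetical example data; groups are assigned deterministically so that
# every group is guaranteed to be present
set.seed(1)
Data <- data.frame(grp = rep(letters[1:3], length.out = 20),
                   x = rnorm(20), y = runif(20))

# aggregate: one row per group, the group label kept as a column (data.frame)
agg <- aggregate(Data[, -1], by = list(grp = Data$grp), FUN = mean)

# split + colMeans: the same means as a matrix with group names as row names
mat <- do.call(rbind, lapply(split(Data[, -1], Data$grp), colMeans))

print(agg)
print(mat)
```

Which form is more convenient depends on what comes next: the data frame from aggregate is easier to merge or plot, while the matrix keeps the group labels as row names, like the output of means.by.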