Applying the function "diff" over groups in R

Question

I have a data frame with two grouping variables, one time variable, and one dependent variable. For example:

name <- c("a", "a", "a", "a", "a", "a","a", "a", "a", "b", "b", "b","b", "b", "b","b", "b", "b")
class <- c("c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3", "c3","c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3", "c3")
year <- c("2010", "2009", "2008", "2010", "2009", "2008", "2010", "2009", "2008", "2010", "2009", "2008", "2010", "2009", "2008", "2010", "2009", "2008")
value <- c(100, 33, 80, 90, 80, 100, 100, 90, 80, 90, 80, 100, 100, 90, 80, 99, 80, 100)

df <- data.frame(name, class, year, value)
df

I would like to apply the "diff" function within each combination of "class" and "name".

My desired output should look like this:

  name class year value.1
1    a    c1 2010     -67
2    a    c1 2009      47
3    b    c1 2010     -10
4    b    c1 2009      20
...

I tried

aggregate(value~name + class, data=df, FUN="diff")

This does not produce the result I am looking for on a large data set. Thank you very much in advance!

Sebatian

Answer 1

And*_*rie 5

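The plyr package will be your friend. The function ddply takes a data.frame, applies a function to each subset you define, and returns the recombined pieces as a single data.frame.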
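For intuition, the split/apply/combine steps that ddply performs can be sketched in base R. This is only an illustration of the idea, not how plyr is implemented internally; the names pieces, diffs and result are made up for this example, and the row order may differ from what ddply returns.

```r
# Rebuild the example data from the question
name  <- rep(c("a", "b"), each = 9)
class <- rep(rep(c("c1", "c2", "c3"), each = 3), times = 2)
year  <- rep(c("2010", "2009", "2008"), times = 6)
value <- c(100, 33, 80, 90, 80, 100, 100, 90, 80,
           90, 80, 100, 100, 90, 80, 99, 80, 100)
df <- data.frame(name, class, year, value, stringsAsFactors = FALSE)

# Split: one data.frame per (class, name) combination
pieces <- split(df, list(df$class, df$name), drop = TRUE)

# Apply: successive differences of value within each piece
diffs <- lapply(pieces, function(piece) {
  data.frame(class = piece$class[1],
             name  = piece$name[1],
             value = diff(piece$value))
})

# Combine: stack the per-group results into a single data.frame
result <- do.call(rbind, diffs)
rownames(result) <- NULL
result
```

ddply wraps these three steps in one call and takes care of the grouping columns for you.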

The simplest solution is to use summarize with diff(value) for each combination of .(class, name):

library(plyr)
ddply(df, .(class, name), summarize, diff(value))

   class name ..1
1     c1    a -67
2     c1    a  47
3     c1    b -10
4     c1    b  20
5     c2    a -10
6     c2    a  20
7     c2    b -10
8     c2    b -10
9     c3    a -10
10    c3    a -10
11    c3    b -19
12    c3    b  20

To get the years into the result as well, it is slightly more involved:

ddply(df, .(class, name), summarize, year=head(year, -1), value=diff(value))
   class name year value
1     c1    a 2010   -67
2     c1    a 2009    47
3     c1    b 2010   -10
4     c1    b 2009    20
5     c2    a 2010   -10
6     c2    a 2009    20
7     c2    b 2010   -10
8     c2    b 2009   -10
9     c3    a 2010   -10
10    c3    a 2009   -10
11    c3    b 2010   -19
12    c3    b 2009    20
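The reason for head(year, -1) is that diff returns one element fewer than its input, so one year has to be dropped for the columns to line up. A minimal sketch on a single group (name "a", class "c1" from the question; the variable names d and labels are just for illustration):

```r
# One group from the question: name "a", class "c1"
year  <- c("2010", "2009", "2008")
value <- c(100, 33, 80)

d <- diff(value)           # value[i + 1] - value[i]: two differences from three values
labels <- head(year, -1)   # drop the last year so the lengths match

stopifnot(length(d) == length(labels))
data.frame(year = labels, value = d)
```

Note that diff works in row order, so with the years sorted in descending order as here, the row labelled 2010 holds the 2009 value minus the 2010 value. If your real data are not already ordered within each group, you would probably want to sort them first.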
