How do I difference panel data in R?

Fra*_*art 2 r time-series panel-data

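Is there a simple R command or package that makes it easy to add variables to data.frames that are the "differences" of existing variables, i.e. their change over time?

Suppose my data looks like this: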

set.seed(1)
MyData <- data.frame(Day=0:9 %% 5+1, 
                 Price=rpois(10,10),
                 Good=rep(c("apples","oranges"), each=5))
MyData

   Day Price    Good
1    1     8  apples
2    2    10  apples
3    3     7  apples
4    4    11  apples
5    5    14  apples
6    1    12 oranges
7    2    11 oranges
8    3     9 oranges
9    4    14 oranges
10   5    11 oranges

Then, after "first differencing" the Price variable within each Good, my data would look like this:

   Day Price    Good P1d
1    1     8  apples  NA
2    2    10  apples   2
3    3     7  apples  -3
4    4    11  apples   4
5    5    14  apples   3
6    1    12 oranges  NA
7    2    11 oranges  -1
8    3     9 oranges  -2
9    4    14 oranges   5
10   5    11 oranges  -3

G. *_*eck 6

ave

transform(MyData, P1d = ave(Price, Good, FUN = function(x) c(NA, diff(x))))
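Note that the ave() call differences rows in the order in which they appear, so it relies on the data already being sorted by Day within each Good, as it is in the question. A minimal sketch, assuming your own data may not be ordered that way (Shuffled and Sorted are illustrative names, not part of the original answer), sorts first and then applies the same call:

```r
# Rebuild the question's data, then shuffle the rows to mimic unsorted input
set.seed(1)
MyData <- data.frame(Day = 0:9 %% 5 + 1,
                     Price = rpois(10, 10),
                     Good = rep(c("apples", "oranges"), each = 5))
Shuffled <- MyData[sample(nrow(MyData)), ]  # hypothetical unsorted copy

# Sort by Good, then Day, so diff() compares consecutive days within each good
Sorted <- Shuffled[order(Shuffled$Good, Shuffled$Day), ]

# Same ave() call as above, applied to the sorted rows
Sorted <- transform(Sorted,
                    P1d = ave(Price, Good, FUN = function(x) c(NA, diff(x))))
Sorted
```

The first Day of each Good gets NA because there is no earlier observation to subtract.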

ave/gsubfn

The previous solution can be shortened slightly using fn$ from the gsubfn package:

library(gsubfn)
transform(MyData, P1d = fn$ave(Price, Good, FUN = ~ c(NA, diff(x))))

dplyr

library(dplyr)

MyData %>% 
  group_by(Good) %>% 
  mutate(P1d = Price - lag(Price)) %>% 
  ungroup()

data.table

library(data.table)

dt <- data.table(MyData)
dt[, P1d := c(NA, diff(Price)), by = Good]

Update

dplyr now uses %>% instead of %.%.