R - dplyr - Compare the value of the previous row with the current row

Oma*_*les 4 r dplyr

I have this data frame:

    year    month    UserID
1   2014    11        3527
2   2014    12        4916
3   2015    1         2445

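I want to add a "variation" column, with the formula: ActualRow/LastRow - 1 (that is, the current row's value divided by the previous row's value, minus 1).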

This is my code:

UserID_unicos2 <- UserID_unicos1 %>%
                  mutate(variation=(UserID/lag(UserID) - 1)) %>% 
                  mutate(prev=lag(UserID))

However, it only returns:

    year    month   UserID  variation   prev
1   2014     11      3527      NA        NA
2   2014     12      4916   0.3938191   3527
3   2015      1      2445      NA        NA

As you can see, it only returns a value for 2014-12, and not for 2015-01. Why is that? Thanks.

The data after applying dput():

structure(list(year = c(2014L, 2014L, 2015L), month = c(11L, 
12L, 1L), UserID = c(3527L, 4916L, 2445L)), .Names = c("year", 
"month", "UserID"), row.names = c(NA, -3L), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), vars = list(year), drop = TRUE, indices = list(
    0:1, 2L), group_sizes = c(2L, 1L), biggest_group_size = 2L, labels = structure(list(
    year = 2014:2015), class = "data.frame", row.names = c(NA, 
-2L), .Names = "year", vars = list(year)))

tal*_*lat 6

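Judging by your dput output, your data is grouped by year, which is why you see this result: lag() is evaluated within each group, so the first row of every year has no previous row and yields NA. Try this: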

UserID_unicos1 %>%
  ungroup() %>%
  mutate(variation=(UserID/lag(UserID) - 1),
         prev=lag(UserID))

Also note that you can create both columns in the same mutate call, separated by commas.
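
To see the difference side by side, here is a minimal sketch that rebuilds the three rows from the question and runs the same mutate on the grouped and the ungrouped data. It assumes a reasonably recent dplyr; the names df, grouped and ungrouped are only illustrative, and group_by(year) stands in for whatever earlier step grouped the original data.

```r
library(dplyr)

# Rebuild the question's data. group_by(year) is an assumption that mimics
# the grouping visible in the dput() output.
df <- tibble(
  year   = c(2014L, 2014L, 2015L),
  month  = c(11L, 12L, 1L),
  UserID = c(3527L, 4916L, 2445L)
) %>%
  group_by(year)

# Check which columns the data is grouped by.
group_vars(df)

# Grouped: lag() restarts within each year, so the single 2015 row
# has no previous row and its variation is NA.
grouped <- df %>%
  mutate(prev = lag(UserID),
         variation = UserID / prev - 1)

# Ungrouped: lag() runs over the whole column, so 2015-01 is compared
# with 2014-12.
ungrouped <- df %>%
  ungroup() %>%
  mutate(prev = lag(UserID),
         variation = UserID / prev - 1)

print(grouped)
print(ungrouped)
```

In the ungrouped result the 2015-01 row gets prev = 4916 and a variation of 2445/4916 - 1, roughly -0.503. Note that lag() follows the current row order, so the data should already be sorted by year and month (for example with arrange(year, month)) before computing the variation.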