Fill in a numeric variable while preserving groups

bos*_*hek 5 r dataframe dplyr tidyverse

[Edited to reflect a better example]

Suppose I have a data frame like this:

df <- data.frame(x = c("A","A","B", "B"), year = c(2001,2004,2002,2005))

> df
  x year
1 A 2001
2 A 2004
3 B 2002
4 B 2005

How can I increment year by 1 within each value of x? I want to fill in year so that the sequence looks like this:

  x year
1 A 2001
2 A 2002
3 A 2003
4 A 2004
5 B 2002
6 B 2003
7 B 2004
8 B 2005

Can anyone recommend a good way to do this?

@useR suggested this approach:

> data.frame(year = min(df$year):max(df$year)) %>%
   full_join(df) %>%
   fill(x) 
Joining, by = "year"
  year x
1 2001 A
2 2002 B
3 2003 B
4 2004 A
5 2005 B

However, this does not match the desired output: the year range is built over the whole data frame rather than within each x group.

MKR*_*MKR 4

One option, using tidyr::complete and dplyr::lead, could be:

library(tidyverse)

df <- data.frame(x = LETTERS[1:3], year = c(2001,2004,2007))  

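# nextYear is the year just before the next row's year (the last row keeps its own year)
df %>% mutate(nextYear = ifelse(is.na(lead(year)), year, lead(year) - 1)) %>%
  group_by(x) %>%
  # expand each group to every year from year to nextYear (this data has one row per x)
  complete(year = seq(year, nextYear, by = 1)) %>%
  select(-nextYear) %>%
  as.data.frame()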

#   x year
# 1 A 2001
# 2 A 2002
# 3 A 2003
# 4 B 2004
# 5 B 2005
# 6 B 2006
# 7 C 2007
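
To see what the lead step contributes, here is a minimal sketch that computes only the helper column. The name with_next is introduced here for illustration and is not part of the answer above.

```r
library(dplyr)

df <- data.frame(x = LETTERS[1:3], year = c(2001, 2004, 2007))

# lead(year) looks one row ahead; the last row has no successor (NA),
# so it falls back to its own year
with_next <- df %>%
  mutate(nextYear = ifelse(is.na(lead(year)), year, lead(year) - 1))

with_next$nextYear
# 2003 2006 2007
```

Each row therefore spans from its own year up to the year before the next row starts, which is the range that complete() then fills in.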

Edit: a solution for the modified data

df <- data.frame(x = c("A","A","B", "B"), year = c(2001,2004,2002,2005))
library(tidyverse)
df %>%  group_by(x) %>%
  complete(year = seq(min(year), max(year), by=1)) %>% 
  as.data.frame()


#   x year
# 1 A 2001
# 2 A 2002
# 3 A 2003
# 4 A 2004
# 5 B 2002
# 6 B 2003
# 7 B 2004
# 8 B 2005
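
As a possible variation on the grouped complete() call, tidyr also provides full_seq(), which builds the full sequence between the minimum and maximum of a vector. The sketch below assumes the years within each group are whole numbers spaced by a multiple of 1. The name filled is introduced here for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(x = c("A", "A", "B", "B"), year = c(2001, 2004, 2002, 2005))

filled <- df %>%
  group_by(x) %>%
  # full_seq(year, period = 1) covers min(year) to max(year) within each group
  complete(year = full_seq(year, period = 1)) %>%
  ungroup() %>%
  as.data.frame()

filled
```

This should give the same eight rows as the seq(min(year), max(year), by = 1) version. Note that full_seq() raises an error if the values do not fit the given period.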