Expand a data frame with consecutive dates based on a date column in R

lg9*_*929 3 expand r date na

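I want to expand a data frame based on its Date column so that new rows appear, in chronological order, for the dates missing between the existing ones. My Date column is in chronological order, spans 5 years, and contains duplicate dates, which I want to ignore. The Group and Draw values of the new rows should be NA.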

zz <- "Date Group Draw
1  2006-05-11    bb     T
2  2006-05-11    bb     F
3  2006-05-14    aa     T
4  2006-05-16    aa     T
5  2006-05-20    cc     F
6  2006-05-20    bb     F
7  2006-05-21    aa     T"

Data <- read.table(text=zz, header = TRUE)

So I would like my new data frame to look like this:

xx <- "Date Group Draw
1  2006-05-11    bb     T
2  2006-05-11    bb     F
3  2006-05-12    NA     NA
4  2006-05-13    NA     NA
5  2006-05-14    aa     T
6  2006-05-15    NA     NA
7  2006-05-16    aa     T
8  2006-05-17    NA     NA
9  2006-05-18    NA     NA
10 2006-05-19    NA     NA
11 2006-05-20    cc     F
12 2006-05-20    bb     F
13 2006-05-21    aa     T"

Output <- read.table(text=xx, header = TRUE)

Any help would be greatly appreciated. I am new to R and have been trying to do this by hand.

nru*_*ell 6

I think this should work:

merge(
    x = data.frame(
        Date = seq.Date(min(df$Date), max(df$Date), by = "day")
    ),
    y = df,
    all.x = TRUE
)
#          Date Group  Draw
# 1  2006-05-11    bb  TRUE
# 2  2006-05-11    bb FALSE
# 3  2006-05-12  <NA>    NA
# 4  2006-05-13  <NA>    NA
# 5  2006-05-14    aa  TRUE
# 6  2006-05-15  <NA>    NA
# 7  2006-05-16    aa  TRUE
# 8  2006-05-17  <NA>    NA
# 9  2006-05-18  <NA>    NA
# 10 2006-05-19  <NA>    NA
# 11 2006-05-20    cc FALSE
# 12 2006-05-20    bb FALSE
# 13 2006-05-21    aa  TRUE

All this does is create a sequence of dates spanning the range of the actual data and then perform a left join.
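
To make the two steps easier to follow, here is a minimal sketch that splits them apart. The names `all_days` and `expanded` are introduced only for illustration, and the sample data from the question is rebuilt inline so the snippet runs on its own.

```r
# Sample data from the question, with Date converted to class Date
df <- read.table(text = "Date Group Draw
2006-05-11 bb T
2006-05-11 bb F
2006-05-14 aa T
2006-05-16 aa T
2006-05-20 cc F
2006-05-20 bb F
2006-05-21 aa T", header = TRUE)
df$Date <- as.Date(df$Date)

# Step 1: one row per calendar day between the first and last observed date
# (all_days is a hypothetical helper name)
all_days <- data.frame(
    Date = seq.Date(min(df$Date), max(df$Date), by = "day")
)

# Step 2: left join; every day is kept, and days absent from df get NA
expanded <- merge(x = all_days, y = df, by = "Date", all.x = TRUE)

# Days that were added by the join (no Group recorded)
expanded$Date[is.na(expanded$Group)]
```

Duplicate dates in `df` need no special handling: each matching row of `df` is carried over, so a date that occurs twice still occurs twice in the result.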


The same idea, using data.table:

dt[dt[,.(Date = seq.Date(min(Date), max(Date), by = "day"))], on = .(Date)]
#           Date Group  Draw
#  1: 2006-05-11    bb  TRUE
#  2: 2006-05-11    bb FALSE
#  3: 2006-05-12    NA    NA
#  4: 2006-05-13    NA    NA
#  5: 2006-05-14    aa  TRUE
#  6: 2006-05-15    NA    NA
#  7: 2006-05-16    aa  TRUE
#  8: 2006-05-17    NA    NA
#  9: 2006-05-18    NA    NA
# 10: 2006-05-19    NA    NA
# 11: 2006-05-20    cc FALSE
# 12: 2006-05-20    bb FALSE
# 13: 2006-05-21    aa  TRUE

Data used in both examples:

zz <- "Date Group Draw
1  2006-05-11    bb     T
2  2006-05-11    bb     F
3  2006-05-14    aa     T
4  2006-05-16    aa     T
5  2006-05-20    cc     F
6  2006-05-20    bb     F
7  2006-05-21    aa     T"

df <- read.table(
    text = zz, 
    header = TRUE
)
df$Date <- as.Date(df$Date) 

library(data.table)
dt <- data.table(read.table(text = zz, header = TRUE))[,Date := as.Date(Date)]