Gop*_*lem 8 datetime r data.table
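I am loading a data.table from a CSV file that contains fields such as date, orders, and amount.
The input file occasionally lacks data for some dates. For example: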
> NADayWiseOrders
         date orders  amount guests
1: 2013-01-01     50 2272.55    149
2: 2013-01-02      3   64.04      4
3: 2013-01-04      1   18.81      0
4: 2013-01-05      2   77.62      0
5: 2013-01-07      2   35.82      2
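In the data above, there are no entries for 03-Jan and 06-Jan.
I would like to fill the missing entries either with default values (say, zero for orders, amount, etc.) or with the last observed value (e.g., 03-Jan would reuse the 02-Jan values, 06-Jan would reuse the 05-Jan values, and so on).
What is the best way to fill such gaps of missing dates with default values like these?
An answer to a related question suggests using allow.cartesian = TRUE and expand.grid to fill in missing weekdays. That may work for weekdays (there are only 7 of them), but I am not sure it is the right approach for dates as well, especially when dealing with multiple years of data.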
Aru*_*run 10
The idiomatic data.table way (using a rolling join) is this:
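# key the table by date (the column must be of class Date for the join below)
setkey(NADayWiseOrders, date)
# every calendar day in the desired range
all_dates <- seq(from = as.Date("2013-01-01"),
                 to = as.Date("2013-01-07"),
                 by = "days")
# rolling join: each missing date takes the values of the last observed row
NADayWiseOrders[J(all_dates), roll=Inf]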
         date orders  amount guests
1: 2013-01-01     50 2272.55    149
2: 2013-01-02      3   64.04      4
3: 2013-01-03      3   64.04      4
4: 2013-01-04      1   18.81      0
5: 2013-01-05      2   77.62      0
6: 2013-01-06      2   77.62      0
7: 2013-01-07      2   35.82      2
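The question also asks about filling gaps with a default value such as zero rather than the last observation. A minimal sketch of that variant, assuming the sample data from the question and assuming zero is the desired default for every value column: run the same join without roll, then overwrite the resulting NAs. The name zero_filled is introduced here for illustration only.

```r
library(data.table)

# Sample data copied from the question
NADayWiseOrders <- data.table(
  date   = as.Date(c("2013-01-01", "2013-01-02", "2013-01-04",
                     "2013-01-05", "2013-01-07")),
  orders = c(50L, 3L, 1L, 2L, 2L),
  amount = c(2272.55, 64.04, 18.81, 77.62, 35.82),
  guests = c(149L, 4L, 0L, 0L, 2L)
)
setkey(NADayWiseOrders, date)

# Every calendar day between the first and last observed date
all_dates <- seq(min(NADayWiseOrders$date), max(NADayWiseOrders$date), by = "day")

# Plain (non-rolling) join: dates absent from the data come back as NA rows
zero_filled <- NADayWiseOrders[J(all_dates)]

# Replace each NA with 0, column by column, by reference.
# 0L is used so the integer columns keep their type; it is coerced
# to double for the amount column.
for (col in c("orders", "amount", "guests")) {
  set(zero_filled, i = which(is.na(zero_filled[[col]])), j = col, value = 0L)
}
zero_filled
```

Note that this also turns any NA already present in the original data into 0, so it is only appropriate when missing and zero mean the same thing.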
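Not sure if it's the fastest, but it will work as long as there are no NAs in the data (any existing NAs would be filled in as well):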
# just in case these aren't Dates.
NADayWiseOrders$date <- as.Date(NADayWiseOrders$date)
# all desired dates.
alldates <- data.table(date=seq.Date(min(NADayWiseOrders$date), max(NADayWiseOrders$date), by="day"))
# merge
dt <- merge(NADayWiseOrders, alldates, by="date", all=TRUE)
# now carry forward last observation (alternatively, set NA's to 0)
require(xts)  # attaches zoo, which provides na.locf()
na.locf(dt)
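The same merge-then-carry-forward idea can probably be done without xts or zoo. The sketch below assumes data.table 1.12.4 or later, where setnafill() is available, and assumes all value columns are numeric (integer or double). The names orders_dt, filled, and value_cols are made up for this example.

```r
library(data.table)

# Sample data copied from the question
orders_dt <- data.table(
  date   = as.Date(c("2013-01-01", "2013-01-02", "2013-01-04",
                     "2013-01-05", "2013-01-07")),
  orders = c(50L, 3L, 1L, 2L, 2L),
  amount = c(2272.55, 64.04, 18.81, 77.62, 35.82),
  guests = c(149L, 4L, 0L, 0L, 2L)
)

# All desired dates, from the first to the last observed day
all_dates <- data.table(date = seq(min(orders_dt$date), max(orders_dt$date), by = "day"))

# Left join onto the full calendar: missing days become NA rows
filled <- merge(all_dates, orders_dt, by = "date", all.x = TRUE)

# Carry the last observation forward, by reference, in the value columns only
value_cols <- c("orders", "amount", "guests")
setnafill(filled, type = "locf", cols = value_cols)
filled
```

Switching to type = "const" with fill = 0 in the setnafill() call would give the zero-default behaviour instead of the last observation.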
Here is how you can fill in the gaps within groups:
# a toy dataset with gaps in the time series
dt <- as.data.table(read.csv(textConnection('"group","date","x"
"a","2017-01-01",1
"a","2017-02-01",2
"a","2017-05-01",3
"b","2017-02-01",4
"b","2017-04-01",5')))
dt[,date := as.Date(date)]
# the desired dates by group
indx <- dt[,.(date=seq(min(date),max(date),"months")),group]
# key the tables and join them using a rolling join
setkey(dt,group,date)
setkey(indx,group,date)
dt[indx,roll=TRUE]
#> group date x
#> 1: a 2017-01-01 1
#> 2: a 2017-02-01 2
#> 3: a 2017-03-01 2
#> 4: a 2017-04-01 2
#> 5: a 2017-05-01 3
#> 6: b 2017-02-01 4
#> 7: b 2017-03-01 4
#> 8: b 2017-04-01 5
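With roll=TRUE a value is carried forward indefinitely, which may not be wanted when a group has a long gap. As a hedged sketch, roll also accepts a finite positive number that limits how far a value is carried. The 31-day limit below is an arbitrary choice for illustration, and bounded is a name invented for this example.

```r
library(data.table)

# Same toy data as above, built directly instead of via read.csv
dt <- data.table(
  group = c("a", "a", "a", "b", "b"),
  date  = as.Date(c("2017-01-01", "2017-02-01", "2017-05-01",
                    "2017-02-01", "2017-04-01")),
  x     = c(1, 2, 3, 4, 5)
)

# The desired monthly dates per group
indx <- dt[, .(date = seq(min(date), max(date), by = "month")), by = group]

setkey(dt, group, date)
setkey(indx, group, date)

# Rolling join that carries a value forward by at most 31 days;
# dates further than that from the last observation stay NA
bounded <- dt[indx, roll = 31]
bounded
```

Here 2017-03-01 in group a is 28 days after the last observation and is filled, while 2017-04-01 is 59 days after it and should remain NA.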