Can a data.table hold function objects as elements?

jf3*_*328 6 r data.table

Short answer: yes, see the accepted answer.


I have the following two data.tables.

stocks = data.table(Ticker = c('xx','xx','yy','yy'), Date = c(as.IDate("2000-01-01"), as.IDate("2000-01-02")), t = c(1.8, 3.5))
   Ticker       Date   t
1:     xx 2000-01-01 1.8
2:     xx 2000-01-02 3.5
3:     yy 2000-01-01 1.8
4:     yy 2000-01-02 3.5
tt = data.table(Date = c(as.IDate("2000-01-01"), as.IDate("2000-01-02")), t0 = c(1,2), t1 = c(2,3), t2 = c(3,4), y0 = c(10, 20), y1 = c(-20, -30), y2 = c(33,44))
         Date t0 t1 t2 y0  y1 y2
1: 2000-01-01  1  2  3 10 -20 33
2: 2000-01-02  2  3  4 20 -30 44

For each row of stocks, I want to find the approximate value of y at the given t, by linear interpolation over the values in tt.

zz = tt[stocks, on = 'Date']
zz[, y.approx := approx(c(t0,t1,t2), c(y0,y1,y2), t)$y, by = 'Date,Ticker']
         Date t0 t1 t2 y0  y1 y2 Ticker   t y.approx
1: 2000-01-01  1  2  3 10 -20 33     xx 1.8      -14
2: 2000-01-02  2  3  4 20 -30 44     xx 3.5        7
3: 2000-01-01  1  2  3 10 -20 33     yy 1.8      -14
4: 2000-01-02  2  3  4 20 -30 44     yy 3.5        7

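The problem is that this repeats a lot of computation. Ideally, I would define one approxfun per date and apply it to every row of stocks. However, data.table does not accept a function object as an element: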

tt[, ff := approxfun(c(t0,t1,t2), c(y0,y1,y2)), by = Date]
Error in `[.data.table`(tt, , `:=`(ff, approxfun(c(t0, t1, t2), c(y0,  : 
  j evaluates to type 'closure'. Must evaluate to atomic vector or list.

My questions are:

  1. Is there a better way than calling approx on every row (which is slow)?
  2. Can a data.table hold function objects as elements?

edd*_*ddi 6

Storing functions in a data.table is straightforward: just put them in a list:

tt[, ff := .(list(approxfun(c(t0,t1,t2), c(y0,y1,y2)))), by = Date]
#         Date t0 t1 t2 y0  y1 y2         ff
#1: 2000-01-01  1  2  3 10 -20 33 <function>
#2: 2000-01-02  2  3  4 20 -30 44 <function>

stocks[tt, y.approx := ff[[1]](t), on = 'Date', by = .EACHI]
stocks
#   Ticker       Date   t y.approx
#1:     xx 2000-01-01 1.8      -14
#2:     xx 2000-01-02 3.5        7
#3:     yy 2000-01-01 1.8      -14
#4:     yy 2000-01-02 3.5        7
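As a small sketch of what the list column holds (assuming the data.table package is installed; the variable name f is only for illustration): each cell of ff is an ordinary R closure, so it can be pulled out with [[ and called like any other function.

```r
library(data.table)

# Same interpolation table as in the question
tt <- data.table(Date = as.IDate(c("2000-01-01", "2000-01-02")),
                 t0 = c(1, 2), t1 = c(2, 3), t2 = c(3, 4),
                 y0 = c(10, 20), y1 = c(-20, -30), y2 = c(33, 44))

# Wrap the closure in list() so that j evaluates to a list, not a bare closure
tt[, ff := .(list(approxfun(c(t0, t1, t2), c(y0, y1, y2)))), by = Date]

is.list(tt$ff)           # ff is a list column
f <- tt$ff[[1]]          # interpolator for 2000-01-01 (f is a hypothetical name)
f(1.8)                   # -14, same value as y.approx above
f(c(1.5, 2.5))           # the stored function is vectorized over t
```

Because the interpolator is built once per Date and is vectorized, the join with by = .EACHI calls it once per date rather than once per row of stocks.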


Mat*_*wle 5

How about:

> zz
         Date t0 t1 t2 y0  y1 y2 Ticker   t
1: 2000-01-01  1  2  3 10 -20 33     xx 1.8
2: 2000-01-02  2  3  4 20 -30 44     xx 3.5
3: 2000-01-01  1  2  3 10 -20 33     yy 1.8
4: 2000-01-02  2  3  4 20 -30 44     yy 3.5

> zz[t0<=t & t<=t1, y.approx:={a=(t-t0)/(t1-t0); y0+a*(y1-y0)}]
> zz
         Date t0 t1 t2 y0  y1 y2 Ticker   t y.approx
1: 2000-01-01  1  2  3 10 -20 33     xx 1.8      -14
2: 2000-01-02  2  3  4 20 -30 44     xx 3.5       NA
3: 2000-01-01  1  2  3 10 -20 33     yy 1.8      -14
4: 2000-01-02  2  3  4 20 -30 44     yy 3.5       NA

> zz[t1<=t & t<=t2, y.approx:={a=(t-t1)/(t2-t1); y1+a*(y2-y1)}]
> zz
         Date t0 t1 t2 y0  y1 y2 Ticker   t y.approx
1: 2000-01-01  1  2  3 10 -20 33     xx 1.8      -14
2: 2000-01-02  2  3  4 20 -30 44     xx 3.5        7
3: 2000-01-01  1  2  3 10 -20 33     yy 1.8      -14
4: 2000-01-02  2  3  4 20 -30 44     yy 3.5        7
> 

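I don't know how general you need this to be (how many columns you really have). But it is worth trying to vectorize like this to save a function call per row. A few iterations of a for loop over the number of time increments (2 in this case) should be faster than looping by row (let us know if you go this way and need to generate the queries dynamically for each time increment).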
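One possible way to generate those per-increment updates dynamically is sketched below. It is not part of the original answer and rests on assumptions: the columns are named t0..tK and y0..yK, and n_seg is a hypothetical variable holding the number of increments. The loop runs once per increment and updates all matching rows at once with set().

```r
library(data.table)

tt <- data.table(Date = as.IDate(c("2000-01-01", "2000-01-02")),
                 t0 = c(1, 2), t1 = c(2, 3), t2 = c(3, 4),
                 y0 = c(10, 20), y1 = c(-20, -30), y2 = c(33, 44))
stocks <- data.table(Ticker = c("xx", "xx", "yy", "yy"),
                     Date = as.IDate(rep(c("2000-01-01", "2000-01-02"), 2)),
                     t = rep(c(1.8, 3.5), 2))
zz <- tt[stocks, on = "Date"]

n_seg <- 2L                      # assumption: increments t0-t1 and t1-t2
zz[, y.approx := NA_real_]       # rows outside every increment stay NA

for (k in seq_len(n_seg)) {
  # Column vectors bounding increment k, looked up by constructed name
  tl <- zz[[paste0("t", k - 1L)]]; th <- zz[[paste0("t", k)]]
  yl <- zz[[paste0("y", k - 1L)]]; yh <- zz[[paste0("y", k)]]

  # Rows whose t falls inside this increment
  idx <- which(tl <= zz$t & zz$t <= th)

  # Linear interpolation for those rows, vectorized
  a <- (zz$t[idx] - tl[idx]) / (th[idx] - tl[idx])
  set(zz, i = idx, j = "y.approx", value = yl[idx] + a * (yh[idx] - yl[idx]))
}

zz[, .(Date, Ticker, t, y.approx)]
```

A value of t that sits exactly on a shared breakpoint matches two increments; both give the same interpolated value, so the second assignment is harmless.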