Subsetting data.table columns by numeric index gives different results depending on the syntax used

mt1*_*022 5 r data.table

Consider this minimal example:

library(data.table)
DT <- data.table(x = 2, y = 3, z = 4)

DT[, c(1:2)]  # first way
#    x y
# 1: 2 3

DT[, (1:2)]  # second way
# [1] 1 2

DT[, 1:2]  # third way
#    x y
# 1: 2 3

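According to this description, subsetting columns by numeric index is now possible. However, I wonder why, in the second way, the indices are evaluated as a plain vector rather than as column indices?

Also, I have just updated data.table: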

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.2

loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4    yaml_2.1.17

Dav*_*urg 5

By looking at the source code, we can simulate how data.table behaves for different inputs:

if (!missing(j)) {
    jsub = replace_dot_alias(substitute(j))
    root = if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
    if (root == ":" ||
        (root %chin% c("-","!") && is.call(jsub[[2L]]) && jsub[[2L]][[1L]]=="(" && is.call(jsub[[2L]][[2L]]) && jsub[[2L]][[2L]][[1L]]==":") ||
        ( (!length(av<-all.vars(jsub)) || all(substring(av,1L,2L)=="..")) &&
          root %chin% c("","c","paste","paste0","-","!") &&
          missing(by) )) {   # test 763. TODO: likely that !missing(by) iff with==TRUE (so, with can be removed)
      # When no variable names (i.e. symbols) occur in j, scope doesn't matter because there are no symbols to find.
      # If variable names do occur, but they are all prefixed with .., then that means look up in calling scope.
      # Automatically set with=FALSE in this case so that DT[,1], DT[,2:3], DT[,"someCol"] and DT[,c("colB","colD")]
      # work as expected.  As before, a vector will never be returned, but a single column data.table
      # for type consistency with >1 cases. To return a single vector use DT[["someCol"]] or DT[[3]].
      # The root==":" is to allow DT[,colC:colH] even though that contains two variable names.
      # root == "-" or "!" is for tests 1504.11 and 1504.13 (a : with a ! or - modifier root)
      # We don't want to evaluate j at all in making this decision because i) evaluating could itself
      # increment some variable and not intended to be evaluated a 2nd time later on and ii) we don't
      # want decisions like this to depend on the data or vector lengths since that can introduce
      # inconistency reminiscent of drop=TRUE in [.data.frame that we seek to avoid.
      with=FALSE
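
Basically, "[.data.table" captures the expression passed to j without evaluating it and decides how to treat it based on some predefined rules. If one of the rules is satisfied, it sets with=FALSE, which basically means that column names (or positions) were passed to j and standard evaluation is used.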
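To make the "captures the expression" step concrete, here is a minimal base R sketch of how the outermost call of an unevaluated argument can be inspected. The helper name `root_of` is hypothetical and is not part of data.table; it only mirrors the `root` line from the source excerpt above.

```r
# Hypothetical helper (not part of data.table): returns the name of the
# outermost function in the unevaluated argument, or "" if it is not a call.
root_of <- function(j) {
  jsub <- substitute(j)  # capture the expression, do not evaluate it
  if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
}

roots <- c(
  colon  = root_of(1:2),     # outermost call is `:`
  c_call = root_of(c(1:2)),  # outermost call is `c`
  paren  = root_of((1:2)),   # outermost call is `(`
  string = root_of("x")      # a constant, not a call
)
print(roots)
```

The parenthesised form differs from the other two only in its outermost call, which is `(` instead of `:` or `c`.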

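The rules are (roughly) as follows:

  1. Set with=FALSE,

    1.1. if the j expression is a call and the call is `:`, or

    1.2. if the call is a combination of one of c("-","!") followed by `(` and `:` (for example -(1:2) or !(1:2)), or

    1.3. if only values (character, integer, numeric, etc.) or symbols prefixed with `..` were passed to j, the call is one of c("","c","paste","paste0","-","!"), and by is not supplied

Otherwise set with=TRUE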
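The pieces these rules rely on can be inspected directly in base R. The following is only an illustrative sketch: the variable names are made up for this example, and it does not call data.table at all.

```r
# Rule 1.2: a `-` (or `!`) call wrapping `(` wrapping `:`
e <- quote(-(1:2))
nesting <- c(
  as.character(e[[1L]]),               # outermost call
  as.character(e[[2L]][[1L]]),         # its first argument is a `(` call
  as.character(e[[2L]][[2L]][[1L]])    # which in turn wraps a `:` call
)
print(nesting)

# Rule 1.3: all.vars() lists the symbols that occur in an expression
no_symbols  <- all.vars(quote(c("x", "y")))  # only constants, no symbols
dot_symbols <- all.vars(quote(..cols))       # a symbol prefixed with ".."
range_syms  <- all.vars(quote(x:z))          # two symbols, hence the separate root == ":" rule
print(list(no_symbols, dot_symbols, range_syms))
```

Note that an expression like x:z contains ordinary symbols, so it would fail rule 1.3; that is why the source treats root == ":" as its own case.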

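So we can turn this into a function and check whether any of the conditions is satisfied (I've skipped converting . to the list function because it is irrelevant here; we will just test with list directly)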

library(data.table)  # provides %chin%

# Mirrors the condition in "[.data.table" that sets with=FALSE
# (the missing(by) check is left out because there is no by argument here)
is_satisfied <- function(j) {
  jsub <- substitute(j)  # capture the expression without evaluating it
  root <- if (is.call(jsub)) as.character(jsub[[1L]])[1L] else ""
  root == ":" ||
    (root %chin% c("-","!") &&
     is.call(jsub[[2L]]) &&
     jsub[[2L]][[1L]]=="(" &&
     is.call(jsub[[2L]][[2L]]) &&
     jsub[[2L]][[2L]][[1L]]==":") ||
    ((!length(av <- all.vars(jsub)) || all(substring(av,1L,2L)=="..")) &&
      root %chin% c("","c","paste","paste0","-","!"))
}

is_satisfied("x")
# [1] TRUE
is_satisfied(c("x", "y"))
# [1] TRUE
is_satisfied(..x)
# [1] TRUE
is_satisfied(1:2)
# [1] TRUE
is_satisfied(c(1:2))
# [1] TRUE
is_satisfied((1:2))
# [1] FALSE
is_satisfied(y)
# [1] FALSE
is_satisfied(list(x, y))
# [1] FALSE
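Since the parenthesised form does not trigger the automatic with=FALSE, one way to still select columns with it is to pass with=FALSE explicitly. This is a small sketch assuming the data.table package is installed; the object name `res` is just for illustration.

```r
library(data.table)

DT <- data.table(x = 2, y = 3, z = 4)

# With with = FALSE set explicitly, j is evaluated and its value is used
# to select columns, regardless of how the expression is written.
res <- DT[, (1:2), with = FALSE]
print(names(res))
```

With with=FALSE supplied by hand, the expression-based rules above no longer decide the outcome.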