data.table gives different group means with mean() and with sum()/.N

Fab*_*rea 3 r data.table

When I compute a mean by group in data.table, I get different results depending on how I write it:

    library(data.table)

    qty  <- c(1:6)
    name <- c("a", "b", "a", "a", "c", "b")
    type <- c("i", "i", "i", "f", "f", "f")

    DT <- data.table(qty, name, type)

    # Same group mean, computed two ways
    DT[, avg_mean  := mean(qty)   , by = .(name, type)]
    DT[, avg_sum_N := sum(qty)/.N , by = .(name, type)]

    > DT
         qty   name   type avg_mean avg_sum_N
       <int> <char> <char>    <num>     <num>
    1:     1      a      i        2         2
    2:     2      b      i        4         2
    3:     3      a      i        2         2
    4:     4      a      f        2         4
    5:     5      c      f        6         5
    6:     6      b      f        5         6

I would expect avg_mean and avg_sum_N to be exactly the same, both equal to the values shown in avg_sum_N. Why do they differ? Thanks.
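As a cross-check that does not involve data.table at all, the group means can be recomputed with base R. This is only a sketch of the expectation stated above; the variable name `expected` is introduced here for illustration and is not part of the original post.

```r
# Same input vectors as in the question
qty  <- as.numeric(1:6)
name <- c("a", "b", "a", "a", "c", "b")
type <- c("i", "i", "i", "f", "f", "f")

# ave() returns, for every row, the mean of qty within that row's (name, type) group
expected <- ave(qty, name, type, FUN = mean)
print(expected)
```

Only rows 1 and 3 share a group (a, i), so their mean is (1 + 3) / 2 = 2. Every other row is alone in its group and keeps its own qty. That agrees with the avg_sum_N column, which suggests the avg_mean column is the one that is off.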


Please find the session info below.

    > packageVersion('data.table')
    [1] ‘1.14.3’
    > sessionInfo()
    R version 4.1.0 (2021-05-18)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 19044)

    Matrix products: default

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252    LC_MONETARY=Portuguese_Brazil.1252
    [4] LC_NUMERIC=C                       LC_TIME=Portuguese_Brazil.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] zoo_1.8-10        lubridate_1.8.0   RPostgres_1.4.3   DBI_1.1.2         stringi_1.7.6     readxl_1.4.0
     [7] gsubfn_0.7        proto_1.0.0       stringr_1.4.0     magrittr_2.0.3    stringdist_0.9.8  fuzzyjoin_0.1.6
    [13] data.table_1.14.3

    loaded via a namespace (and not attached):
     [1] Rcpp_1.0.8.3     pillar_1.7.0     compiler_4.1.0   cellranger_1.1.0 tools_4.1.0      bit_4.0.4
     [7] lattice_0.20-44  lifecycle_1.0.1  tibble_3.1.6     pkgconfig_2.0.3  rlang_1.0.2      cli_3.2.0
    [13] rstudioapi_0.13  writexl_1.4.0    parallel_4.1.0   dplyr_1.0.8      hms_1.1.1        generics_0.1.2
    [19] vctrs_0.4.1      grid_4.1.0       bit64_4.0.5      tidyselect_1.1.2 glue_1.6.2       R6_2.5.1
    [25] fansi_1.0.3      tcltk_4.1.0      blob_1.2.3       purrr_0.3.4      ellipsis_0.3.2   assertthat_0.2.1
    [31] utf8_1.2.2       crayon_1.5.1

Fab*_*rea 5

The problem was a bug in the development version of data.table. Running data.table::update.dev.pkg() fixed it.
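A minimal way to confirm the fix after updating might look like the sketch below. It assumes data.table is installed. The update call is left commented out because it installs a package, and its name is taken from this answer as written for 1.14.3; newer releases may expose it under a different name, so treat that line as an assumption. The variable `same` is introduced here for illustration only.

```r
library(data.table)

# Installed version; the question reports the development build 1.14.3
print(packageVersion("data.table"))

# Update step from the answer (commented out because it installs a package):
# data.table::update.dev.pkg()
# Alternative: install.packages("data.table")  # installs the current CRAN release

# Rebuild the example from the question and compute the group mean both ways
DT <- data.table(
  qty  = 1:6,
  name = c("a", "b", "a", "a", "c", "b"),
  type = c("i", "i", "i", "f", "f", "f")
)
DT[, avg_mean  := mean(qty),     by = .(name, type)]
DT[, avg_sum_N := sum(qty) / .N, by = .(name, type)]

# On a build without the bug, both columns should agree row by row
same <- isTRUE(all.equal(DT$avg_mean, DT$avg_sum_N))
print(same)
```

If `same` is still FALSE after updating, restarting the R session so the new build is actually loaded is worth trying before digging further.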