I have a data.table:
library(data.table)
car <- data.table(no = 1:100, turn = sample(1:5,100,replace = TRUE),
dis = sample(1:10,100,replace = TRUE))
I want to set "dis" to -1 at the nth occurrence of turn == 3, for example the third occurrence of turn == 3.
I can select the third row where turn == 3:
car[turn == 3, .SD[3]]
However, I cannot update "dis" in that row:
car[turn == 3, .SD[3]][, dis := -1]
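A likely reason the chained form has no visible effect on car: the first [...] returns a new data.table, so := modifies that temporary result rather than the original. A minimal sketch on a small hand-made table (the names small and tmp are illustrative only, not part of the question):

```r
library(data.table)

# Small illustrative table (not the random data from the question)
small <- data.table(no = 1:6, turn = c(3, 1, 3, 2, 3, 3), dis = 11:16)

# The first [...] builds a new one-row data.table; := then updates that copy
tmp <- small[turn == 3, .SD[3]][, dis := -1L]

tmp$dis    # the copy was changed
small$dis  # the original table is untouched
```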
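Related Q&A: Conditionally replace column values with data.table.
Some alternatives. Use rowid or cumsum to create a counter of rows within groups. Add the counter to your condition in i.
I use a slightly smaller toy dataset, just to make the changes easier to track: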
d <- data.table(x = 1:3, y = 1:12)
d[rowid(x) == 3 & x == 3, y := -1]
# @mt1022
d[cumsum(x == 3) == 3 & (x == 3), y := -1]
# @docendo discimus
d[(ix <- x == 3) & cumsum(ix) == 3, y := -1]
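To see what the counters in these conditions look like, the following sketch rebuilds the toy table and stores them as columns. The column names rid and cs are only for illustration, and x is written out with rep() rather than relying on recycling.

```r
library(data.table)

d <- data.table(x = rep(1:3, 4), y = 1:12)

# rowid(x) numbers the rows within each value of x;
# cumsum(x == 3) counts how many times x == 3 has occurred so far
d[, `:=`(rid = rowid(x), cs = cumsum(x == 3))]

# Both counters identify the same row: the third one where x == 3
d[rid == 3 & x == 3]
d[cs == 3 & x == 3]
```

Because cumsum(x == 3) keeps its value on the rows that follow a match, the extra x == 3 test is needed to pick out the matching row itself.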
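Although the OP did not mention that speed is a concern, I was still curious to time the different approaches on larger vectors. Unsurprisingly, @Frank's approach is the fastest, especially as the number of unique values being searched grows:
frank << docendo < henrik < mt1022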
library(microbenchmark)
d <- data.table(x = sample(1:3, 1e6, replace = TRUE), y = 1:1e6)
microbenchmark(henrik = d[rowid(x) == 3 & x == 3, y := -1],
               mt1022 = d[cumsum(x == 3) == 3 & (x == 3), y := -1],
               docendo = d[(ix <- x == 3) & cumsum(ix) == 3, y := -1],
               frank = d[d[x == 3, which = TRUE][3], y := -1], unit = "relative")
# Unit: relative
#     expr      min       lq     mean   median       uq      max neval cld
#   henrik 4.417303 4.369407 4.133514 4.319839 4.329658 1.260394   100  b
#   mt1022 5.461961 5.285562 5.174559 5.186404 5.239738 1.608712   100   c
#  docendo 3.572646 3.624369 3.788678 3.589705 3.576637 1.733272   100  b
#    frank 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000   100 a
d <- data.table(x = sample(1:30, 1e6, replace = TRUE), y = 1:1e6)
# Unit: relative
#     expr      min       lq     mean   median       uq      max neval cld
#   henrik 22.64881 19.54375 18.81963 18.91335 19.78559 5.507692   100  bc
#   mt1022 24.58258 21.17535 19.84417 20.96256 22.76020 3.625263   100   c
#  docendo 19.40044 16.75912 16.23321 16.47953 18.06264 4.234100   100  b
#    frank  1.00000  1.00000  1.00000  1.00000  1.00000 1.000000   100 a
d <- data.table(x = sample(1:300, 1e6, replace = TRUE), y = 1:1e6)
# Unit: relative
#     expr      min       lq     mean   median       uq       max neval cld
#   henrik 31.81237 32.51122 28.79490 30.35766 28.63560  8.236282   100  b
#   mt1022 34.71984 35.45341 33.20405 33.57394 31.50914 21.556367   100   c
#  docendo 27.99046 28.15855 26.56954 26.60644 25.20044  7.847163   100  b
#    frank  1.00000  1.00000  1.00000  1.00000  1.00000  1.000000   100 a
# Unit: milliseconds
#     expr       min        lq      mean    median       uq        max neval cld
#   henrik 60.655582 76.455531 83.061266 77.632036 78.57818 203.224042   100   c
#   mt1022 66.701182 84.133034 87.967300 84.937201 85.72464 201.167914   100   c
#  docendo 52.938545 67.214360 71.558130 68.003891 68.51897 184.178346   100  b
#    frank  1.977821  2.494039  2.629852  2.663577  2.76089   3.613905   100 a
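The frank expression in the benchmark may be easier to follow when split into steps. A sketch on the small toy table, where idx and n3 are just illustrative names: which = TRUE makes the inner query return row numbers instead of data, and [3] picks the third of them.

```r
library(data.table)

d <- data.table(x = rep(1:3, 4), y = 1:12)

# Row numbers of all rows where x == 3
idx <- d[x == 3, which = TRUE]

# The third such row number
n3 <- idx[3]

# Update y by reference at that single row
d[n3, y := -1L]
```

This sketch assumes there are at least three matching rows; with fewer matches idx[3] is NA, a case not handled here.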