Saving the result of `mutate()` without reassignment

han*_*101 2 r dplyr data.table

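I am currently trying to get a better understanding of dplyr and the tidyverse as a whole, and I have come across several ways to store the result of a mutate call. I would like to know whether any of these ways of adding an extra column is better or worse than the others.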

library(data.table)
library(dplyr)
dt <- structure(
  list(
    obs = c("1953M04", "1953M05", "1953M06", "1953M07", "1953M08",
            "1953M09", "1953M10", "1953M11", "1953M12", "1954M01"),
    gs1 = c(2.35999989509583, 2.48000001907349, 2.45000004768372,
            2.38000011444092, 2.27999997138977, 2.20000004768372,
            1.78999996185303, 1.66999995708466, 1.6599999666214,
            1.4099999666214)
  ),
  row.names = c(NA, -10L),
  class = c("data.table", "data.frame")
)

# Data.Table approach
dt[, Date.Month := as.Date(paste0(obs,"-01"), format = "%YM%m-%d")]

# dplyr way: assign at the end of the pipe, which follows the reading order
dt %>% mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d")) %>% {. ->> dt }

# Direct reassignment; assigning the output of the right-hand side to the left feels illogical, at least in my head ;-)
dt <- dt %>% mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d"))

Is the reassignment in the last version expensive in terms of computational effort?

akr*_*run 5

One option is the compound assignment operator (%<>%) from magrittr:

library(magrittr)
library(dplyr)
dt %<>% 
    mutate( Date.Month = as.Date(paste0(obs,"-01"), format = "%YM%m-%d"))
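As a rough illustration of what %<>% does, the sketch below compares it with an explicit reassignment. The data frames `df_a` and `df_b` and the column `gs1.pct` are hypothetical names made up for this example; the expectation is that both forms produce the same object, so %<>% is a shorthand rather than a different mechanism.

```r
library(magrittr)
library(dplyr)

# Hypothetical toy data, not the data from the question
df_a <- data.frame(obs = c("1953M04", "1953M05"), gs1 = c(2.36, 2.48))
df_b <- df_a

# Compound assignment: pipe df_a into mutate() and bind the result to df_a
df_a %<>% mutate(gs1.pct = gs1 / 100)

# Explicit reassignment of the same pipeline
df_b <- df_b %>% mutate(gs1.pct = gs1 / 100)

# Compare the two results
same_result <- identical(df_a, df_b)
print(same_result)
```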

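However, the data.table assignment operator (:=) will be faster and more efficient, because it adds the column by reference instead of creating a new object.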
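One way to see the difference is to compare the memory address of the table before and after each operation, using `address()` from data.table. This is only a sketch under assumptions: the toy table is built with `data.table()` rather than `structure()`, and the column `gs1.pct` is a hypothetical addition for illustration.

```r
library(data.table)
library(dplyr)

# Hypothetical toy table, created with data.table() so := can add columns in place
dt <- data.table(obs = c("1953M04", "1953M05", "1953M06"),
                 gs1 = c(2.36, 2.48, 2.45))

addr_start <- address(dt)

# := adds the column by reference, so dt should still be the same object
dt[, Date.Month := as.Date(paste0(obs, "-01"), format = "%YM%m-%d")]
same_after_set <- identical(address(dt), addr_start)

# mutate() returns a new object, which is then bound to the name dt
dt <- dt %>% mutate(gs1.pct = gs1 / 100)
same_after_mutate <- identical(address(dt), addr_start)

print(c(set = same_after_set, mutate = same_after_mutate))
```

Note that the new object returned by mutate() is not necessarily a full deep copy of every column, so how much the reassignment costs in practice is best measured on the real data.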