Efficient row-wise string concatenation that omits NA

JDG*_*JDG 4 string r na data.table

I have a data.table with the following example structure:

library(data.table)

dt = data.table(
  V1 = c('One', 'Two', 'Three'),
  V2 = c('Red', NA, NA),
  V3 = c('Cat', NA, 'Dogs')
)

> dt
      V1   V2   V3
1:   One  Red  Cat
2:   Two <NA> <NA>
3: Three <NA> Dogs

I want to concatenate the elements row-wise into a new column that omits the NAs:

> dt
      V1   V2   V3          V4
1:   One  Red  Cat One Red Cat
2:   Two <NA> <NA>         Two
3: Three <NA> Dogs  Three Dogs

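In this simple example I could of course transpose the object and run lapply(.SD, paste(x[!is.na(x)])), but transposing is computationally too expensive. I also don't want a second step that strips out NAs that have been coerced to character. In short, I'd welcome any high-performance solution.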

jay*_*.sf 5

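Use paste inside do.call after using replace to turn the NAs into empty strings, then remove the resulting double spaces with gsub and the leftover leading or trailing single spaces with trimws.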

dt[, V4 := trimws(gsub('  ', ' ', do.call(paste, replace(.SD, is.na(.SD), '')), fixed=TRUE))]
#       V1   V2   V3          V4
# 1:   One  Red  Cat One Red Cat
# 2:   Two <NA> <NA>         Two
# 3: Three <NA> Dogs  Three Dogs
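One thing to watch with the one-liner above: with fixed=TRUE, gsub replaces each pair of spaces with one space in a single pass, so a run of three or more spaces (two or more consecutive NAs in the middle of a row, which needs at least four columns) may not fully collapse. The following is a minimal sketch of a variant that uses the regular expression ' +' instead. The helper name paste_skip_na and the table dt2 are hypothetical and not part of the original answer, and the sketch assumes a single-space separator and values that do not themselves contain consecutive spaces (those would be collapsed too).

```r
library(data.table)

# Hypothetical helper (not from the original answer): paste columns row-wise
# with a single space, skipping NA values, for any number of columns.
paste_skip_na <- function(cols) {
  # Replace NA with "" column by column; lapply returns a plain list
  cols <- lapply(cols, function(x) {
    x <- as.character(x)
    x[is.na(x)] <- ""
    x
  })
  # Drop names so no column name can be mistaken for a paste() argument
  out <- do.call(paste, unname(cols))
  # Collapse any run of spaces to one, then trim leading/trailing spaces
  trimws(gsub(" +", " ", out))
}

# Illustrative data with two consecutive NAs in the middle of row 1
dt2 <- data.table(
  A = c("a", "b"),
  B = c(NA, "x"),
  C = c(NA, NA),
  D = c("z", NA)
)

dt2[, joined := paste_skip_na(.SD), .SDcols = c("A", "B", "C", "D")]
# dt2$joined is now c("a z", "b x")
```

Like the original one-liner, this stays fully vectorized over rows (one paste call, one gsub call), so it avoids transposing or looping row by row.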