Batch-creating columns in an R data.table with lapply and regular expressions

Ian*_*ang 2 regex r lapply data.table

I want to extract the values that follow certain strings, as demonstrated below:

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

dt[, (x) := lapply(FUN = str_match(string = .SD, 
                                   pattern = paste0("(?<=", x, ")([\\d])"))[, 2], 
                   X = x),
   .SDcols = "col.1"]

The desired result looks like this:

desirable <- data.table(col.1 = c("a1, b2, c3, d4"),
                        a = c("1"),
                        b = c("2"),
                        c = c("3"))

I get the following error message:

*Error in match.fun(FUN):

c("'str_match(string = .SD, pattern = paste0(\"(?<=\", x, \")([\\\\d])\"))[, ' is not a function, character or symbol", "'    2]' is not a function, character or symbol")*

But I don't know how to fix this. Can anyone give me some hints?

akr*_*run 6

Loop over the patterns and extract the values with str_match:

library(data.table)
library(stringr)
dt[, (x) := lapply(paste0("(?<=", x, ")(\\d+)"),
     \(x) str_match(col.1, x)[, 2])]
            col.1 a b c
1: a1, b2, c3, d4 1 2 3
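The error in the question arises because `FUN` was given the result of a `str_match(...)[, 2]` call rather than a function, so `match.fun` has nothing it can call. As a minimal self-contained sketch of the loop-over-patterns approach (the `patterns` variable and the `function(p)` spelling are illustrative choices, not part of the original answer), one could write:

```r
library(data.table)
library(stringr)

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

# One lookbehind pattern per prefix, e.g. "(?<=a)(\\d+)"
patterns <- paste0("(?<=", x, ")(\\d+)")

# FUN must be a function; lapply calls it once per pattern and
# returns one character vector per new column.
dt[, (x) := lapply(patterns, function(p) str_match(col.1, p)[, 2])]

print(dt)
```

Here `function(p)` is the long form of the `\(x)` shorthand, which should also run on R versions older than 4.1. The new columns are character vectors, matching the desired output in the question.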

Or with strcapture:

pat <- paste0(sprintf("%s(\\d+)", x), collapse = ".*")
cbind(dt, dt[, strcapture(pat, col.1, setNames(rep(list(integer()), 3), x))])
            col.1 a b c
1: a1, b2, c3, d4 1 2 3
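A self-contained variant of the `strcapture` approach might look like the sketch below. It assumes the prefixes occur in `col.1` in the same order as in `x`, since the combined pattern matches them sequentially. The `proto` and `res` names are illustrative, and the prototype is built as a data frame sized from `length(x)` instead of a hard-coded 3.

```r
library(data.table)

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

# Combined pattern "a(\\d+).*b(\\d+).*c(\\d+)": one capture group per prefix
pat <- paste0(sprintf("%s(\\d+)", x), collapse = ".*")

# Zero-row prototype with one integer column per prefix
proto <- as.data.frame(setNames(rep(list(integer()), length(x)), x))

# strcapture converts each captured group to the prototype's column type
res <- cbind(dt, strcapture(pat, dt$col.1, proto))

print(res)
```

Note that the captured columns come back as integers here, not characters as in the `str_match` version; a `character()` prototype would keep them as strings.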