Posts by Ian*_*ang

Batch-creating columns in an R data.table with lapply and regular expressions

I want to extract the value that follows each of several strings, as demonstrated below

library(data.table)
library(stringr)

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

dt[, (x) := lapply(FUN = str_match(string = .SD, 
                                   pattern = paste0("(?<=", x, ")([\\d])"))[, 2], 
                   X = x),
   .SDcols = "col.1"]

The desired result looks like this

desirable <- data.table(col.1 = c("a1, b2, c3, d4"),
                        a = c("1"),
                        b = c("2"),
                        c = c("3"))

I get the following error message:

*Error in match.fun(FUN):

c("'str_match(string = .SD, pattern = paste0(\"(?<=\", x, \")([\\\\d])\"))[, ' is not a function, character or symbol", "'    2]' is not a function, character or symbol")* …
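The error arises because `lapply` expects `FUN` to be a function, while the call above passes the already-evaluated result of `str_match(...)[, 2]`. A minimal sketch of one possible fix, assuming the goal is one new column per prefix in `x`: wrap the extraction in an anonymous function and refer to `col.1` directly instead of `.SD`. The argument name `k` is only illustrative.

```r
library(data.table)
library(stringr)

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

# lapply() calls the anonymous function once per prefix in x.
# str_match() returns a matrix: column 1 is the full match,
# column 2 is the first capture group (the digit after the prefix).
dt[, (x) := lapply(x, function(k) {
  str_match(col.1, paste0("(?<=", k, ")(\\d)"))[, 2]
})]
```

Because the anonymous function returns one character vector per prefix, `:=` receives a list of three columns, matching the three names in `x`.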

regex r lapply data.table

2
votes
1
answer
90
views

Creating many columns via lapply and setting their values based on an existing column

I have a table that describes symptoms, as follows:

library(data.table)

DT <- data.table(no = c(1, 2, 3),
                 symptom = c("headache and numbness", "tachycardia, sometimes headahce", "breath difficulty with limb numbness"))

The keywords I care about are these

key.word <- list(
  head = c("head", "headache"),
  chest = c("breath", "tachycardia", "palpitaion")
)

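I want to add two columns indicating whether the symptom text mentions any keyword from each group. The desired result looks like this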

result <- data.table(no = c(1, 2, 3),
                     symptom = c("headache and numbness", "tachycardia, sometimes headahce", "breath difficulty with limb numbness"),
                     head = c(TRUE, TRUE, FALSE),
                     chest = c(FALSE, TRUE, TRUE))

I can get this done as follows

DT[symptom %like% paste0(head, collapse = "|"), head := T]
DT[symptom %like% paste0(chest, …
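The per-keyword assignments can likely be collapsed into a single `lapply` over the keyword list. A minimal sketch, assuming each group in `key.word` should become one logical column named after the group; it uses base `grepl` (which `%like%` wraps), and the argument name `kw` is only illustrative.

```r
library(data.table)

DT <- data.table(no = c(1, 2, 3),
                 symptom = c("headache and numbness",
                             "tachycardia, sometimes headahce",
                             "breath difficulty with limb numbness"))

key.word <- list(
  head  = c("head", "headache"),
  chest = c("breath", "tachycardia", "palpitaion")
)

# For each keyword group, join its keywords into one alternation pattern
# and test every symptom string against it. lapply() keeps the list names,
# and the left-hand side of := names the new columns after the groups.
DT[, (names(key.word)) := lapply(key.word, function(kw) {
  grepl(paste0(kw, collapse = "|"), symptom)
})]
```

Unlike subsetting in `i` and assigning `TRUE`, this fills non-matching rows with `FALSE` rather than leaving them `NA`.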

r lapply data.table

1
vote
1
answer
58
views

Tag statistics

data.table ×2

lapply ×2

r ×2

regex ×1