Subsetting a string column in a data.table using positions from another column

Question

I have a data.table with several columns of the following kind:

   attr1 attr2
1: 01001 01000
2: 11000 10000
3: 00100 00100
4: 01100 01000

library(data.table)
DT = setDT(structure(list(attr1 = c("01001", "11000", "00100", "01100"), 
    attr2 = c("01000", "10000", "00100", "01000")), .Names = c("attr1", 
"attr2"), row.names = c(NA, -4L), class = "data.frame"))

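All columns are strings, not numbers. What I want to achieve is the following:

1) Find the positions at which "1" occurs in the attr1 string

2) Take the characters of attr2 at those positions

The result I expect in this case is: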

[1] "10" "10" "1"  "10"

For example, in the first row attr1 has "1" at positions 2 and 5, so I take the characters of attr2 at positions 2 and 5 in that row and end up with "10".
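
The first-row example can be checked by hand. This is only a minimal sketch in base R, and the names a1, a2, pos and res are illustrative, not part of the original data:

```r
# First row of the example data (illustrative variable names)
a1 <- "01001"
a2 <- "01000"

# Step 1: positions at which "1" occurs in a1
pos <- which(strsplit(a1, "")[[1]] == "1")

# Step 2: characters of a2 at those positions, pasted into one string
res <- paste(substring(a2, pos, pos), collapse = "")

print(pos)
print(res)
```

This handles a single pair of strings; applying it to whole columns would need a loop or a vectorised helper.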

What I have thought of doing is splitting the columns up and working with the pieces, but I really hope there is a better way.

Answer 1

the*_*ail 9

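You can use a variant of @alistaire's regmatches answer, because there is also a replacement function, regmatches<-. So instead of extracting the characters of attr2 where attr1 is 1, you replace the characters of attr2 where attr1 is 0 with "":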

# dt holds the question's data (the question names it DT)
dt[, matches := `regmatches<-`(attr2, gregexpr("0+", attr1), value="")]

#   attr1 attr2 matches
#1: 01001 01000      10
#2: 11000 10000      10
#3: 00100 00100       1
#4: 01100 01000      10
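
To see what the replacement form does, it may help to run it on one pair of strings outside data.table. The sketch below uses base R only; a1, a2, m, removed and out are illustrative names:

```r
a1 <- "01001"
a2 <- "01000"

# Match data for the runs of zeros in a1: one run starting at position 1,
# another starting at position 3
m <- gregexpr("0+", a1)

# The same match positions applied to a2 select the pieces that will be dropped
removed <- regmatches(a2, m)

# Assignment form of `regmatches<-`: blank out those pieces in a copy of a2
out <- a2
regmatches(out, m) <- ""

print(removed)
print(out)
```

The match positions come from attr1 but are applied to attr2, so this relies on the two strings in each row having the same length.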

Your idea of using strsplit and comparing is also workable:

dt[, matches := mapply(function(x, y) paste(y[x == "1"], collapse = ""), strsplit(attr1, ""), strsplit(attr2, ""))]
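
The intermediate values of the strsplit approach can be inspected on plain character vectors. This is a sketch in base R, assuming the same example data; s1, s2, keep, picked and res are illustrative names:

```r
attr1 <- c("01001", "11000", "00100", "01100")
attr2 <- c("01000", "10000", "00100", "01000")

# Split every string into single characters (one character vector per row)
s1 <- strsplit(attr1, "")
s2 <- strsplit(attr2, "")

# For the first row: logical mask of the "1" positions in attr1 ...
keep <- s1[[1]] == "1"
# ... and the attr2 characters selected by that mask
picked <- s2[[1]][keep]

# Same idea applied to all rows in parallel
res <- mapply(function(x, y) paste(y[x == "1"], collapse = ""), s1, s2)

print(keep)
print(picked)
print(res)
```

Comparing against the string "1" rather than the number 1 avoids relying on implicit type coercion.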

Answer 2

ali*_*ire 7

You can use base R's regmatches, giving it one string to extract from (attr2) and match data computed on a different string (attr1):

dt[, matches := sapply(regmatches(attr2, gregexpr('1+', attr1)), paste, collapse = '')][]
#>    attr1 attr2 matches
#> 1: 01001 01000      10
#> 2: 11000 10000      10
#> 3: 00100 00100       1
#> 4: 01100 01000      10
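
The paste step is needed because one row can yield several separate matches. A minimal base R sketch for the first row, with illustrative names a1, a2, m, starts, pieces and res:

```r
a1 <- "01001"
a2 <- "01000"

# Match data for the runs of ones in a1: two runs of length 1
m <- gregexpr("1+", a1)
starts <- as.integer(m[[1]])
lens <- as.integer(attr(m[[1]], "match.length"))

# Extract from a2 at the positions matched in a1: one piece per run
pieces <- regmatches(a2, m)

# Collapse the pieces of each row into a single string
res <- sapply(pieces, paste, collapse = "")

print(starts)
print(pieces)
print(res)
```

As with the replacement variant, this assumes attr1 and attr2 have the same length in every row.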

Data

dt <- structure(list(attr1 = c("01001", "11000", "00100", "01100"), 
        attr2 = c("01000", "10000", "00100", "01000")), .Names = c("attr1", 
    "attr2"), row.names = c(NA, -4L), class = "data.frame")

library(data.table)
setDT(dt)
