Extract vectors from a strsplit list without using a loop

aga*_*tha 15 r

Consider the following vector:

[1] "1-1694429" "2-1546669" "3-928598"  "4-834486"  "5-802353"  "6-659439"  "7-552850"  "8-516804"  "9-364061"
[10] "10-354181" "11-335154" "12-257915" "13-251310" "14-232313" "15-217628" "16-216569"   

I am trying to produce two vectors, each containing the values obtained by splitting every element of the vector on the separator "-".

I used:

f <- function(s) strsplit(s, "-")
cc<-sapply(names.reads, f)

> head(cc)
$`1-1694429`
[1] "1"       "1694429"

$`2-1546669`
[1] "2"       "1546669"

I know I can access them like this:

> cc[[1]][1]
[1] "1"

> cc[[1]][2]
[1] "1694429"

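I would like two vectors, one containing the values stored in cc[[i]][1] and the other those in cc[[i]][2]. Can I do this without a loop? (I have more than 1 million elements.)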

Hen*_*nry 21

Using mathematical.coffee's suggestion, the following code avoids both loops and sapply:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

cc       <- strsplit(names.reads,'-')
part1    <- unlist(cc)[2*(1:length(names.reads))-1]
part2    <- unlist(cc)[2*(1:length(names.reads))  ]

which produces

> part1
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
[16] "16"
> part2
 [1] "1694429" "1546669" "928598"  "834486"  "802353"  "659439"  "552850" 
 [8] "516804"  "364061"  "354181"  "335154"  "257915"  "251310"  "232313" 
[15] "217628"  "216569"

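It does, however, require every original value to be in the expected format: exactly one "-" per element, so that each split yields two pieces.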
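
One way to guard against the format caveat above is to check the piece counts before indexing. The following is only a sketch, assuming a shortened three-element input; the check with lengths() and the matrix reshaping are additions for illustration, not part of the original answer.

```r
# Shortened example input (assumption: same "id-count" format as the question)
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# fixed = TRUE treats "-" as a literal string rather than a regular expression
cc <- strsplit(names.reads, "-", fixed = TRUE)

# Stop early if any element did not split into exactly two pieces
stopifnot(all(lengths(cc) == 2L))

# unlist() interleaves the pieces, so fill a two-column matrix row by row
m <- matrix(unlist(cc), ncol = 2, byrow = TRUE)
part1 <- m[, 1]
part2 <- m[, 2]
```

If the check fails, the malformed elements can be located with which(lengths(cc) != 2L).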


ped*_*sso 7

Using sapply() (for completeness):

y <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353", "6-659439", "7-552850", "8-516804", "9-364061", "10-354181", "11-335154", "12-257915", "13-251310", "14-232313", "15-217628", "16-216569")

As @Bird pointed out in the comments, the USE.NAMES argument can be used to keep names out of the result.

x <- sapply(y, function(x) strsplit(x, "-")[[1]], USE.NAMES=FALSE)

a <- x[1,]

b <- x[2,]
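
A possible variant of the sapply() approach uses vapply(), which declares the expected shape of each result and raises an error if any element does not split into exactly two strings. This is a sketch on a shortened input, not part of the original answer.

```r
# Shortened example input (assumption: same format as above)
y <- c("1-1694429", "2-1546669", "3-928598")

# Split once, then let vapply() verify that every piece is a character(2)
x <- vapply(strsplit(y, "-", fixed = TRUE), function(p) p, character(2))

# x is a 2 x length(y) character matrix: row 1 holds the prefixes, row 2 the counts
a <- x[1, ]
b <- x[2, ]
```

Because the input to vapply() is an unnamed list, the resulting matrix carries no names, so USE.NAMES is not needed here.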


MYa*_*208 6

Another approach:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

library(reshape2)
colsplit(string=names.reads, pattern="-", names=c("Part1", "Part2"))

   Part1   Part2
1      1 1694429
2      2 1546669
3      3  928598
4      4  834486
5      5  802353
6      6  659439
7      7  552850
8      8  516804
9      9  364061
10    10  354181
11    11  335154
12    12  257915
13    13  251310
14    14  232313
15    15  217628
16    16  216569
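
If adding a package dependency is not desirable, a similar two-column data frame can be sketched in base R with read.table(), treating "-" as the field separator. This is an alternative offered under the assumption that every element contains exactly one "-"; it is not what colsplit() does internally.

```r
# Shortened example input (assumption: same format as above)
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Parse the strings as if they were lines of a "-"-separated file
df <- read.table(text = names.reads, sep = "-",
                 col.names = c("Part1", "Part2"))

# The columns are converted to integer, and can be pulled out as plain vectors
part1 <- df$Part1
part2 <- df$Part2
```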


jtr*_*r13 6

Or, using the purrr package:

Part 1:

> map(strsplit(names.reads, "-"), ~.x[1]) %>% unlist()
[1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13"
[14] "14" "15" "16"

Part 2:

> map(strsplit(names.reads, "-"), ~.x[2]) %>% unlist()
[1] "1694429" "1546669" "928598"  "834486"  "802353"  "659439" 
[7] "552850"  "516804"  "364061"  "354181"  "335154"  "257915" 
[13] "251310"  "232313"  "217628"  "216569" 
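
The two calls above each run strsplit() again. A sketch that splits once and extracts by position with purrr's map_chr() might look like the following; the variable name parts and the shortened input are assumptions made for illustration.

```r
library(purrr)

# Shortened example input (assumption: same format as above)
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Split once and reuse the resulting list
parts <- strsplit(names.reads, "-", fixed = TRUE)

# An integer passed as .f extracts the element at that position;
# map_chr() returns a character vector directly, so unlist() is not needed
part1 <- map_chr(parts, 1)
part2 <- map_chr(parts, 2)
```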