Consider the following vector:
[1] "1-1694429" "2-1546669" "3-928598" "4-834486" "5-802353" "6-659439" "7-552850" "8-516804" "9-364061"
[10] "10-354181" "11-335154" "12-257915" "13-251310" "14-232313" "15-217628" "16-216569"
I am trying to generate two vectors, each containing the values obtained by splitting every element of the vector on the delimiter "-".
I used:
f <- function(s) strsplit(s, "-")
cc<-sapply(names.reads, f)
> head(cc)
$`1-1694429`
[1] "1" "1694429"
$`2-1546669`
[1] "2" "1546669"
I know I can access them like this:
> cc[[1]][1]
[1] "1"
> cc[[1]][2]
[1] "1694429"
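I would like two vectors, one containing the values stored in cc[[i]][1] and the other the values stored in cc[[i]][2]. Can I do this without a loop? (I have more than 1 million elements.)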
Hen*_*nry 21
Following mathematical.coffee's suggestion, the code below avoids both loops and sapply:
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
"6-659439", "7-552850", "8-516804", "9-364061", "10-354181",
"11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
"16-216569")
cc <- strsplit(names.reads,'-')
part1 <- unlist(cc)[2*(1:length(names.reads))-1]
part2 <- unlist(cc)[2*(1:length(names.reads)) ]
which produces
> part1
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
[16] "16"
> part2
[1] "1694429" "1546669" "928598" "834486" "802353" "659439" "552850"
[8] "516804" "364061" "354181" "335154" "257915" "251310" "232313"
[15] "217628" "216569"
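It does, however, require every original value to be in the expected format: each element must split into exactly two pieces, otherwise the odd/even positions in unlist(cc) no longer line up.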
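As a rough sketch of how that caveat could be checked before indexing, the snippet below uses lengths() to find elements that do not split into exactly two pieces. The malformed value "bad" is a made-up example, and dropping such elements (rather than stopping with an error) is only one possible policy.

```r
# Hypothetical input: the third element has no "-" and is not in the expected format.
names.reads <- c("1-1694429", "2-1546669", "bad", "4-834486")

# fixed = TRUE treats "-" as a literal string rather than a regular expression.
cc <- strsplit(names.reads, "-", fixed = TRUE)

# lengths() returns the number of pieces per element; anything other than 2
# would shift the odd/even positions in unlist(cc).
ok <- lengths(cc) == 2

flat  <- unlist(cc[ok])
part1 <- flat[c(TRUE, FALSE)]  # odd positions (logical index is recycled)
part2 <- flat[c(FALSE, TRUE)]  # even positions

which(!ok)  # positions of the elements that were skipped
```

Whether to drop, flag, or fail on the malformed elements depends on the data; the check itself is vectorized, so it should not add a loop.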
Using sapply() (for completeness):
y <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353", "6-659439", "7-552850", "8-516804", "9-364061",
"10-354181", "11-335154", "12-257915", "13-251310", "14-232313", "15-217628", "16-216569")
As @Bird pointed out in the comments, the USE.NAMES argument can be used to avoid names in the resulting vectors.
x <- sapply(y, function(x) strsplit(x, "-")[[1]], USE.NAMES=FALSE)
a <- x[1,]
b <- x[2,]
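Note that sapply() returns a 2-row matrix here only because every element splits into two pieces. A possible variant, sketched below with a shortened input, uses vapply() to state that expectation explicitly, so a malformed element raises an error instead of silently changing the shape of the result.

```r
# Shortened example input (a subset of the values above).
y <- c("1-1694429", "2-1546669", "3-928598")

# character(2) declares that each split must yield exactly two strings;
# vapply() stops with an error if any element does not.
x <- vapply(strsplit(y, "-", fixed = TRUE), function(p) p, character(2))

a <- x[1, ]  # first parts
b <- x[2, ]  # second parts
```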
Another approach:
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
"6-659439", "7-552850", "8-516804", "9-364061", "10-354181",
"11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
"16-216569")
library(reshape2)
colsplit(string=names.reads, pattern="-", names=c("Part1", "Part2"))
   Part1   Part2
1      1 1694429
2      2 1546669
3      3  928598
4      4  834486
5      5  802353
6      6  659439
7      7  552850
8      8  516804
9      9  364061
10    10  354181
11    11  335154
12    12  257915
13    13  251310
14    14  232313
15    15  217628
16    16  216569
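If adding a dependency on reshape2 is not desirable, a similar two-column data frame can be sketched in base R. This is not what colsplit() does internally, just one way to arrive at a comparable result; the conversion to integer is an assumption that both parts are always whole numbers.

```r
# Shortened example input (a subset of the values above).
names.reads <- c("1-1694429", "2-1546669", "3-928598")

# Bind the split pieces row-wise into a character matrix with two columns.
m <- do.call(rbind, strsplit(names.reads, "-", fixed = TRUE))

# Assumption: both parts are integers, so convert them explicitly.
df <- data.frame(Part1 = as.integer(m[, 1]),
                 Part2 = as.integer(m[, 2]))
```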
Or using the purrr package:
Part 1:
> map(strsplit(names.reads, "-"), ~.x[1]) %>% unlist()
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13"
[14] "14" "15" "16"
Part 2:
> map(strsplit(names.reads, "-"), ~.x[2]) %>% unlist()
[1] "1694429" "1546669" "928598" "834486" "802353" "659439"
[7] "552850" "516804" "364061" "354181" "335154" "257915"
[13] "251310" "232313" "217628" "216569"
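For the million-element case mentioned in the question, one more loop-free option that needs no extra packages is to skip strsplit() and strip each side with sub(), which is vectorized over its input. This is a sketch that assumes each element contains a single "-"; the regular expressions are illustrative and not taken from any of the answers above.

```r
# Shortened example input (a subset of the values above).
names.reads <- c("1-1694429", "10-354181", "16-216569")

# Remove everything from the first "-" onwards, leaving the first part.
part1 <- sub("-.*$", "", names.reads)

# Remove everything up to and including the first "-", leaving the second part.
part2 <- sub("^[^-]*-", "", names.reads)
```

Both results are character vectors; wrap them in as.integer() if numeric values are needed.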