I have a string like the following:
my_string <- "apple,banana,orange,"
I want to split it on , to produce the output:
list(c('apple', 'banana', 'orange', ""))
I thought strsplit would do this, but it treats the trailing ',' as if it were not there:
my_string <- "apple,banana,orange,"
strsplit(my_string, ",")
#> [[1]]
#> [1] "apple"  "banana" "orange"
Created on 2023-11-15 by the reprex package (v2.0.1)
What is the simplest way to achieve the desired output?
More test cases, with example strings and desired outputs:
string1 = "apple,banana,orange,"
output1 = list(c('apple', 'banana', 'orange', ''))
string2 = "apple,banana,orange,pear"
output2 = list(c('apple', 'banana', 'orange', 'pear'))
string3 = ",apple,banana,orange"
output3 = list(c('', 'apple', 'banana', 'orange'))
## Examples of non-comma separated strings
# '|' separator
string4 = "|apple|banana|orange|"
output4 = list(c('', 'apple', 'banana', 'orange', ''))
# 'x' separator
string5 = "xapplexbananaxorangex"
output5 = list(c('', 'apple', 'banana', 'orange', ''))
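Edit:
An ideal solution should generalise to any split character.
A base-R solution is also preferred (though please still link any packages that provide this functionality, since their source code may be helpful to look at!)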
Tho*_*ing 14
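strsplit doesn't give the desired output? If you type ?strsplit, you will see the following statement:
Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is "", but if there is a match at the end of the string, the output is the same as with the match removed.
That is why you don't see the trailing "" when using strsplit.
Below are some demonstrations: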
> strsplit("apple,banana,orange,", ",")
[[1]]
[1] "apple" "banana" "orange"
> strsplit(",apple,banana,orange,", ",")
[[1]]
[1] "" "apple" "banana" "orange"
> strsplit(",apple,banana,orange", ",")
[[1]]
[1] "" "apple" "banana" "orange"
> strsplit("apple,banana,orange", ",")
[[1]]
[1] "apple" "banana" "orange"
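As a rough illustration of why the trailing "" disappears, the loop described in ?strsplit can be sketched in plain R. The name strsplit_sketch is made up for this example. It handles only a single string and a literal (non-regex) separator, and it is not how strsplit is implemented internally:

```r
# Hypothetical sketch of the loop described in ?strsplit.
# Assumptions: x is one string, sep is a literal separator (not a regex).
strsplit_sketch <- function(x, sep) {
  out <- character(0)
  repeat {
    if (!nzchar(x)) break                        # empty remainder: stop, add nothing
    pos <- as.integer(regexpr(sep, x, fixed = TRUE))
    if (pos == -1L) {                            # no match: keep the remainder, stop
      out <- c(out, x)
      break
    }
    out <- c(out, substr(x, 1, pos - 1))         # text left of the match (may be "")
    x <- substr(x, pos + nchar(sep), nchar(x))   # drop the match and all before it
  }
  out
}

strsplit_sketch("apple,banana,orange,", ",")
strsplit_sketch(",apple,banana,orange", ",")
```

After the last separator is consumed the remainder is empty, so the loop stops before anything is added for it. A leading separator, by contrast, still contributes the empty text to its left.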
If you want to treat it as a coding exercise, one base R option is to define a custom recursive function, as below:
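f <- function(x, sep = ",") {
  # sep is interpreted as a regular expression
  pat <- sprintf("^(.*?)%s.*", sep)
  s1 <- sub(pat, "\\1", x)               # text before the first separator
  s2 <- sub(paste0("^.*?", sep), "", x)  # text after the first separator
  if (s2 == x) {                         # no separator left: stop recursing
    return(x)
  }
  c(s1, Recall(s2, sep))
}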
or a variant using substr + regexpr:
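f <- function(x, sep = ",") {
  # idx + 1 below assumes a single-character separator
  idx <- regexpr(sep, x)               # position of the first separator, -1 if none
  s1 <- substr(x, 1, idx - 1)          # text before the first separator
  s2 <- substr(x, idx + 1, nchar(x))   # text after the first separator
  if (s2 == x) {                       # no separator left: stop recursing
    return(x)
  }
  c(s1, Recall(s2, sep))
}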
such that
> f("apple,banana,orange,")
[1] "apple" "banana" "orange" ""
> f(",apple,banana,orange,")
[1] "" "apple" "banana" "orange" ""
> f(",apple,banana,orange")
[1] "" "apple" "banana" "orange"
> f("apple,banana,orange")
[1] "apple" "banana" "orange"
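Both recursive versions treat sep as a regular expression, and the substr variant steps over exactly one character per match. A non-recursive sketch using gregexpr with fixed = TRUE and substring might look like the following. Here split_all is a hypothetical name, and the function assumes a single input string and a literal separator of one or more characters:

```r
# Hypothetical helper: split one string on a literal separator,
# keeping empty leading and trailing fields.
split_all <- function(x, sep) {
  m <- gregexpr(sep, x, fixed = TRUE)[[1]]   # start positions of all matches
  if (m[1] == -1) return(x)                  # no separator: one field
  len <- attr(m, "match.length")             # length of each match
  starts <- c(1, m + len)                    # each field starts after a match
  ends <- c(m - 1, nchar(x))                 # each field ends before the next match
  substring(x, starts, ends)                 # start > end yields ""
}

split_all("apple,banana,orange,", ",")
split_all("|apple|banana|orange|", "|")
split_all("xapplexbananaxorangex", "x")
```

Because the separator is matched literally, "|" needs no escaping here.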
Gue*_*sBF 12
Using stringr:
library(stringr)
str_split(my_string, ",")
[[1]]
[1] "apple" "banana" "orange" ""
the*_*ail 12
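Pasting another delimiter onto the end should make strsplit behave as expected.
Otherwise, you can fall back to the scan function, which underpins the read.csv/read.table functions: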
strsplit(paste0(string1, ","), ",")
##[[1]]
##[1] "apple" "banana" "orange" ""
Generalised, accounting for separators that need regex escaping:
L <- list(string1, string2, string3, string4, string5)
mapply(
  function(x, s) strsplit(paste0(x, gsub("\\\\", "", s)), split = s),
  L,
  c(",", ",", ",", "\\|", "x")
)
##[[1]]
##[1] "apple" "banana" "orange" ""
##
##[[2]]
##[1] "apple" "banana" "orange" "pear"
##
##[[3]]
##[1] "" "apple" "banana" "orange"
##
##[[4]]
##[1] "" "apple" "banana" "orange" ""
##
##[[5]]
##[1] "" "apple" "banana" "orange" ""
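If the separators are always literal strings rather than patterns, fixed = TRUE sidesteps the backslash stripping. A minimal sketch, where split_keep_trailing is a hypothetical helper name:

```r
# Hypothetical helper: append one extra separator, then split literally.
# Assumption: sep is a literal string, not a regular expression.
split_keep_trailing <- function(x, sep) {
  strsplit(paste0(x, sep), split = sep, fixed = TRUE)
}

# Vectorised over x, like strsplit itself
strings <- c("apple,banana,orange,", ",apple,banana,orange")
split_keep_trailing(strings, ",")

# "|" needs no escaping with fixed = TRUE
split_keep_trailing("|apple|banana|orange|", "|")
```

The appended separator is the one strsplit discards, so the original trailing field survives as "".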
scan option:
scan(text=string1, sep=",", what="")
##Read 4 items
##[1] "apple" "banana" "orange" ""
Generalised:
mapply(
  function(x, s) scan(text = x, sep = s, what = ""),
  L,
  c(",", ",", ",", "|", "x")
)