Jiq*_*ang 9 string split r comma
I am not new to R, but I am relatively new to regular expressions.
A similar question can be found here.
As an example, when I use
> strsplit("UK, USA, Germany", ", ")
[[1]]
[1] "UK" "USA" "Germany"
but I would like to get
[[1]]
[1] "UK, USA" "Germany"
Another example is
> strsplit("London, Washington, D.C., Berlin", ", ")
[[1]]
[1] "London" "Washington" "D.C." "Berlin"
and I would like to get
[[1]]
[1] "London, Washington, D.C." "Berlin"
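Washington, D.C. definitely should not be split in two; the string should be split only at the last comma, not at every comma.
I think one workable approach is to replace the last comma with something else, for example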
$, #, *, ...
and then use
strsplit()
to split the string on the replacement character (making sure it is unique!). However, I would be happier if this could be handled directly with a built-in function.
So how should I do this? Many thanks.
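For reference, the workaround described in the question (replace the last comma with a unique marker, then split on it) might look like the minimal sketch below. The choice of "#" as the marker is an assumption for illustration; it only works if that character never occurs in the data.

```r
x <- c("UK, USA, Germany", "London, Washington, D.C., Berlin")

# Assumption: "#" never occurs in the data, so it is safe as a temporary marker.
marker <- "#"

# Replace only the last comma (plus any whitespace after it) with the marker.
# The group ([^,]*)$ captures the comma-free tail so it can be put back.
marked <- sub(",\\s*([^,]*)$", paste0(marker, "\\1"), x, perl = TRUE)

# Split on the marker, treated as a literal string rather than a regex.
result <- strsplit(marked, marker, fixed = TRUE)
print(result)
```

This takes two passes over each string and depends on the marker being unique, which is why a single-pass regex split is usually preferable.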
Tyl*_*ker 12
Here is one way:
strsplit("UK, USA, Germany", ",(?=[^,]+$)", perl=TRUE)
## [[1]]
## [1] "UK, USA" " Germany"
You probably want this instead, which also consumes the whitespace after the comma:
strsplit("UK, USA, Germany", ",\\s*(?=[^,]+$)", perl=TRUE)
## [[1]]
## [1] "UK, USA" "Germany"
It also matches when there is no space after the comma:
strsplit(c("UK, USA, Germany", "UK, USA,Germany"), ",\\s*(?=[^,]+$)", perl=TRUE)
## [[1]]
## [1] "UK, USA" "Germany"
##
## [[2]]
## [1] "UK, USA" "Germany"
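To see why this works: the lookahead `(?=[^,]+$)` only lets a comma match when nothing but non-comma characters remain up to the end of the string, so only the last comma qualifies. As a quick check, here is the same pattern applied to the second example from the question (the variable names are just for illustration):

```r
# Pattern from the answer: a comma, optional whitespace, then a lookahead
# requiring that only non-comma characters remain up to the end of the string.
pattern <- ",\\s*(?=[^,]+$)"

cities <- "London, Washington, D.C., Berlin"
result <- strsplit(cities, pattern, perl = TRUE)
print(result)

# A string without any comma is returned unsplit.
single <- strsplit("Berlin", pattern, perl = TRUE)
print(single)
```

`perl = TRUE` is required here because lookahead assertions are not supported by R's default regex engine.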
You can use the stri_split functions from the stringi package:
library(stringi)
x <- "USA,UK,Poland"
stri_split_fixed(x, ",") # standard split by comma
[[1]]
[1] "USA" "UK" "Poland"
stri_split_fixed(x, ",", n = 2) # set the max number of elements
[[1]]
[1] "USA" "UK,Poland"
Unfortunately, there is no argument to change where the splitting starts (from the beginning or from the end), but we can handle it another way, using stri_reverse:
stri_split_fixed(stri_reverse(x),",",n = 2) #reverse
[[1]]
[1] "dnaloP" "KU,ASU"
stri_reverse(stri_split_fixed(stri_reverse(x),",",n = 2)[[1]]) #reverse back
[1] "Poland" "USA,UK"
stri_reverse(stri_split_fixed(stri_reverse(x),",",n = 2)[[1]])[2:1] #and again :)
[1] "USA,UK" "Poland"
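The reverse, split, reverse-back steps above could be wrapped in a small helper so they work on a whole character vector. This is only a sketch: `split_at_last` is a hypothetical name, not a stringi function, and it assumes the stringi package is installed.

```r
library(stringi)

# Hypothetical helper (not part of stringi): split each string at the
# last occurrence of `sep` by reversing, splitting once, and reversing back.
split_at_last <- function(x, sep = ",") {
  lapply(x, function(s) {
    # Reverse the separator too, so multi-character separators still match.
    parts <- stri_split_fixed(stri_reverse(s), stri_reverse(sep), n = 2)[[1]]
    # Un-reverse each piece, then restore the original left-to-right order.
    rev(stri_reverse(parts))
  })
}

result <- split_at_last(c("USA,UK,Poland", "Poland"))
print(result)
```

Strings without the separator come back as a single element, matching the behaviour of `stri_split_fixed`.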