Splitting CamelCase in R

kma*_*ace 11 split camelcasing r

Is there a way to split a camel case string in R?

I have tried:

string.to.split = "thisIsSomeCamelCase"
unlist(strsplit(string.to.split, split="[A-Z]") )
# [1] "this" "s"    "ome"  "amel" "ase" 

42-*_*42- 13

string.to.split = "thisIsSomeCamelCase"
gsub("([A-Z])", " \\1", string.to.split)
# [1] "this Is Some Camel Case"

strsplit(gsub("([A-Z])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this"  "Is"    "Some"  "Camel" "Case" 

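Looking at Ramnath's answer and mine, I can say that my initial impression, that this was an underspecified question, has been borne out.

And upvotes to Tommy and Ramnath for pointing out [:upper:]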

strsplit(gsub("([[:upper:]])", " \\1", string.to.split), " ")
# [[1]]
# [1] "this"  "Is"    "Some"  "Camel" "Case" 
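
To make the underspecification concrete, here is a small sketch of how the approach above behaves on two kinds of input the question does not mention. The helper name `split_before_upper` and the sample strings are illustrative assumptions, not part of the original answers.

```r
# Hypothetical helper wrapping the gsub + strsplit idea shown above.
split_before_upper <- function(x) {
  # Put a space in front of every uppercase letter, then split on spaces.
  strsplit(gsub("([[:upper:]])", " \\1", x), " ")
}

# A leading capital produces an empty first element.
split_before_upper("CamelCase")[[1]]
# [1] ""      "Camel" "Case"

# A run of capitals is split into single letters.
split_before_upper("parseHTTPResponse")[[1]]
# [1] "parse"    "H"        "T"        "T"        "P"        "Response"
```

Whether the empty leading element or the letter-by-letter acronym is acceptable depends on what the caller needs, which the question leaves open.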


Ram*_*ath 11

Here is one way to do it

split_camelcase <- function(...){
  strings <- unlist(list(...))
  strings <- gsub("^[^[:alnum:]]+|[^[:alnum:]]+$", "", strings)
  strings <- gsub("(?!^)(?=[[:upper:]])", " ", strings, perl = TRUE)
  return(strsplit(tolower(strings), " ")[[1]])
}

split_camelcase("thisIsSomeGood")
# [1] "this" "is"   "some" "good"

  • +1 because this works with international uppercase letters (not just A-Z), e.g. "enÖIHavet" (5 upvotes)
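
Because of the trailing `[[1]]`, `split_camelcase` returns only the pieces of its first input, even when several strings are passed. A minimal sketch of a variant that keeps one result per input follows; the name `split_camelcase_all` and the second sample string are hypothetical, and the regular expressions are copied unchanged from the answer.

```r
# Hypothetical variant of split_camelcase() that returns a list
# with one character vector per input string.
split_camelcase_all <- function(...) {
  strings <- unlist(list(...))
  # Trim leading and trailing non-alphanumeric characters.
  strings <- gsub("^[^[:alnum:]]+|[^[:alnum:]]+$", "", strings)
  # Insert a space before every uppercase letter except at the start.
  strings <- gsub("(?!^)(?=[[:upper:]])", " ", strings, perl = TRUE)
  # Keep the whole list instead of taking only the first element.
  strsplit(tolower(strings), " ")
}

split_camelcase_all("thisIsSomeGood", "anotherOne")
# [[1]]
# [1] "this" "is"   "some" "good"
#
# [[2]]
# [1] "another" "one"
```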

Tyl*_*ker 6

Here is an approach using a single regex (a lookahead and a lookbehind):

strsplit(string.to.split, "(?<=[a-z])(?=[A-Z])", perl = TRUE)

## [[1]]
## [1] "this"  "Is"    "Some"  "Camel" "Case" 
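
If runs of capitals (acronyms) should stay together, one possible extension adds a second lookaround alternative to the pattern. This is only a sketch, assuming ASCII letters; `split_camel`, `boundary`, and the sample strings are hypothetical names. It inserts the separator with `gsub` first and then splits on a fixed space, so it works on a whole character vector at once.

```r
# Hypothetical acronym-aware splitter built on lookarounds (ASCII only).
split_camel <- function(x) {
  # Boundary 1: a lowercase letter followed by an uppercase letter,
  #             e.g. between "e" and "H" in "parseHTTP".
  # Boundary 2: an uppercase letter followed by uppercase + lowercase,
  #             e.g. between "P" and "Re" in "HTTPResponse".
  boundary <- "(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"
  spaced <- gsub(boundary, " ", x, perl = TRUE)
  strsplit(spaced, " ", fixed = TRUE)
}

split_camel(c("thisIsSomeCamelCase", "parseHTTPResponse"))
# [[1]]
# [1] "this"  "Is"    "Some"  "Camel" "Case"
#
# [[2]]
# [1] "parse"    "HTTP"     "Response"
```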