Nat*_*a P 5 r separator strsplit
I have a column containing different names:
X <- c("Ashley, Tremond WILLIAMS, Carla", "Claire, Daron", "Luw, Douglas CANSLER, Stephan")
The second person's name starts after the second space. For example, "Ashley, Tremond" is one person and "WILLIAMS, Carla" is another.
I tried:
strsplit(X, "\\,\\s|\\,|\\s")
But that splits on every space (and every comma), so I get:
strsplit(X, "\\,\\s|\\,|\\s")
[[1]]
[1] "Ashley" "Tremond" "WILLIAMS" "Carla"
[[2]]
[1] "Claire" "Daron"
[[3]]
[1] "Luw" "Douglas" "CANSLER" "Stephan"
How can I split only at the second space, so that I get this instead?
[[1]]
[1] "Ashley, Tremond" "WILLIAMS, Carla"
[[2]]
[1] "Claire, Daron"
[[3]]
[1] "Luw, Douglas" "CANSLER, Stephan"
Thanks in advance for all your help.
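Of course @ytk's comment works, but if you want to avoid writing a regular expression, you can be sneaky and split on every space, then glue the pieces back together in pairs: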
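library(tidyr)  # separate(), unite()
library(dplyr)  # %>%

# split X on every space into four pieces, then re-join them pairwise
df2 <- df %>%
  separate(col = X, into = c("person1a", "person1b", "person2a", "person2b"), sep = " ") %>%
  unite(col = "person1", person1a, person1b, sep = " ") %>%
  unite(col = "person2", person2a, person2b, sep = " ")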
which returns:
> df2
          person1          person2
1 Ashley, Tremond  WILLIAMS, Carla
2   Claire, Daron            NA NA
3    Luw, Douglas CANSLER, Stephan
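Note that row 2 ends up with the literal string "NA NA" in person2 rather than a real missing value, because unite() pastes the two NA pieces together. If that matters, one possible cleanup is sketched below in base R. The data frame is rebuilt by hand from the output shown here so the snippet runs on its own; that rebuild is only for illustration.

```r
# df2 as printed above, rebuilt by hand so this snippet is self-contained
df2 <- data.frame(
  person1 = c("Ashley, Tremond", "Claire, Daron", "Luw, Douglas"),
  person2 = c("WILLIAMS, Carla", "NA NA", "CANSLER, Stephan"),
  stringsAsFactors = FALSE
)

# replace the pasted-together "NA NA" string with a real missing value
df2$person2[df2$person2 == "NA NA"] <- NA
```

After this, is.na(df2$person2) is TRUE only for the row that has a single person.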
P.S. I used df <- data.frame(X = c("Ashley, Tremond WILLIAMS, Carla", "Claire, Daron", "Luw, Douglas CANSLER, Stephan")) to get the input into a data frame.
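For comparison, if a regular expression is acceptable, here is a minimal sketch of one way to do it with strsplit directly. This is not necessarily what @ytk's comment suggested. It assumes every name has the form "Last, First", with exactly one space after the comma and no spaces inside either part, so the only space not preceded by a comma is the one between two people.

```r
X <- c("Ashley, Tremond WILLIAMS, Carla", "Claire, Daron", "Luw, Douglas CANSLER, Stephan")

# split on whitespace that is NOT directly preceded by a comma;
# the negative lookbehind (?<!,) requires perl = TRUE
people <- strsplit(X, "(?<!,)\\s", perl = TRUE)
people
```

Under those assumptions this gives "Ashley, Tremond" and "WILLIAMS, Carla" for the first element, and leaves "Claire, Daron" as a single string.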