Splitting a data frame column on a comma

Fre*_*ons 2 r dplyr tidyr

I have a data frame named "final_proj_data" with the following structure:

ID          County              Population     Year  
<dbl>       <chr>               <dbl>          <dbl>    
1003    Baldwin County, Alabama 169162         2006     
1015    Calhoun County, Alabama 112903         2006     
1043    Cullman County, Alabama 80187          2006     
1049    DeKalb County, Alabama  68014          2006 

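I am trying to split the "County" column into two separate columns, "County" and "State", and to remove the comma.

I have tried many permutations of the split() function, but I keep getting this error:

Error: var must evaluate to a single number or a column name, not a character vector

I have tried (among other things):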

# Attempt 1
final_proj_data %>% 
  separate(final_proj_data$County, c("State", "County"), sep = ",", remove = TRUE)

# Attempt 2
final_proj_data %>% 
  separate(data = final_proj_data, col = County,
           into = c("State", "County"), sep = ",")

I am not sure what I am doing wrong, or why "col =" keeps throwing this error. Any help would be greatly appreciated!

Nel*_*Gon 5

Using dplyr and base R:

library(dplyr)
final_proj_data %>% 
  mutate(State = unlist(lapply(strsplit(County, ", "), function(x) x[2])),
         County = gsub(",.*", "", County))
    ID         County Population Year   State
1 1003 Baldwin County     169162 2006 Alabama
2 1015 Calhoun County     112903 2006 Alabama
3 1043 Cullman County      80187 2006 Alabama
4 1049  DeKalb County      68014 2006 Alabama

Original answer, using dplyr and tidyr (I just saw that @Ronak Shah made the same suggestion in a comment above):

library(dplyr)
library(tidyr)
# Split on ", " (comma plus space) so State has no leading space
final_proj_data %>% 
  separate(County, c("County", "State"), sep = ", ")
    ID         County    State Population Year
1 1003 Baldwin County  Alabama     169162 2006
2 1015 Calhoun County  Alabama     112903 2006
3 1043 Cullman County  Alabama      80187 2006
4 1049  DeKalb County  Alabama      68014 2006
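As for why the attempts in the question fail: the likely cause, judging from the error text, is that separate() expects its column argument as a bare column name (or a position), while final_proj_data$County hands it the whole character vector. The second attempt also supplies the data frame twice, once through the pipe and once through data =. In both attempts the new names are listed as c("State", "County"), although the county comes before the comma. Here is a minimal, self-contained sketch that rebuilds the four sample rows from the question (the data.frame() construction is an assumption about how the data is stored) and applies the corrected call:

```r
library(tidyr)

# Sample rows copied from the question; the construction itself is illustrative
final_proj_data <- data.frame(
  ID = c(1003, 1015, 1043, 1049),
  County = c("Baldwin County, Alabama", "Calhoun County, Alabama",
             "Cullman County, Alabama", "DeKalb County, Alabama"),
  Population = c(169162, 112903, 80187, 68014),
  Year = c(2006, 2006, 2006, 2006),
  stringsAsFactors = FALSE
)

# Pass the column as a bare name (not final_proj_data$County), list the new
# names in the order the pieces appear, and split on ", " so that State
# does not keep a leading space.
result <- separate(final_proj_data, County,
                   into = c("County", "State"), sep = ", ")

print(result)
```

Because remove = TRUE is the default, the original combined column is dropped and replaced in place by the two new columns.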