Ran*_*hem 3 join r left-join dplyr
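I have been trying to use left_join to bring the columns of a new dataset onto my main dataset (called employee).
I have double-checked the column names and the cleaning I did, and nothing seems to work. Here is my code. Any help is much appreciated.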
job_codes <- read_csv("Quest_UMMS_JobCodes.csv")
job_codes <- job_codes %>%
  clean_names() %>%
  select(job_code, pos_desc = pos_des_desc)
job_codes$is_nurse <- str_detect(tolower(job_codes$pos_desc), "nurse")
employee <- employee %>%
  left_join(job_codes, by = "job_code")
The error I keep getting: Error in eval(substitute(expr), envir, enclos) : 'job_code' column not found in rhs, cannot join
Here is the output of names(job_codes):
> names(job_codes)
[1] "job_code" "pos_desc" "is_nurse"
And the output of names(employee):
> names(employee)
[1] "REC_NUM" "ZIP" "STATE"
[4] "SEX" "EEO_CLASS" "BIRTH_YEAR"
[7] "EMP_STATUS" "PROCESS_LEVEL" "DEPARTMENT"
[10] "JOB_CODE" "UNION_CODE" "SUPERVISOR"
[13] "DATE_HIRED" "R_SHIFT" "SALARY_CLASS"
[16] "EXEMPT_EMP" "PAY_RATE" "ADJ_HIRE_DATE"
[19] "ANNIVERS_DATE" "TERM_DATE" "NBR_FTE"
[22] "PENSION_PLAN" "PAY_GRADE" "SCHEDULE"
[25] "OT_PLAN_CODE" "DECEASED" "POSITION"
[28] "WORK_SCHED" "SUPERVISOR_IND" "FTE_TOTAL"
[31] "PRO_RATE_TOTAL" "PRO_RATE_A_SAL" "NEW_HIRE_DATE"
[34] "COUNTY" "FST_DAY_WORKED" "date_hired"
[37] "date_hired_adj" "term_date" "employment_duration"
[40] "current" "age" "emp_duration_years"
[43] "DESCRIPTION.x" "PAY_STATUS.x" "DESCRIPTION.y"
[46] "PAY_STATUS.y"
Uwe*_*Uwe 11
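Now that the OP has added the column names of both tables to the question, it is clear that the join columns are spelled differently (uppercase vs. lowercase).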
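A quick way to spot this kind of mismatch is to compare the two sets of names directly. The following is only a minimal sketch in base R: the name vectors are shortened, hypothetical stand-ins for names(employee) and names(job_codes), not the full tables from the question.

```r
# Shortened, hypothetical stand-ins for names(employee) and names(job_codes)
employee_names  <- c("REC_NUM", "JOB_CODE", "PAY_RATE")
job_codes_names <- c("job_code", "pos_desc", "is_nurse")

# Exact comparison: the tables share no column name,
# so by = "job_code" has nothing to match on in employee
exact_matches <- intersect(employee_names, job_codes_names)

# Case-insensitive comparison: reveals the column that differs only in case
case_insensitive_matches <- intersect(tolower(employee_names),
                                      tolower(job_codes_names))

print(exact_matches)
print(case_insensitive_matches)
```

An empty exact match together with a non-empty case-insensitive match suggests the key exists in both tables but is capitalised differently.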
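For the case where the column names differ, help("left_join") suggests:
To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match x.a to y.b.
So, in this case it should read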
employee <- employee %>% left_join(job_codes, by = c("JOB_CODE" = "job_code"))
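To see the named-vector form of by in action, here is a small sketch on toy data. The two tibbles and their values are invented for illustration and only mimic the column names from the question; the rename() variant at the end is an alternative I am assuming is acceptable when a lowercase key in the result is preferred.

```r
library(dplyr)

# Hypothetical miniature versions of the two tables (values are made up)
employee <- tibble(
  REC_NUM  = 1:3,
  JOB_CODE = c("A1", "B2", "C3")
)
job_codes <- tibble(
  job_code = c("A1", "B2"),
  pos_desc = c("Staff Nurse", "Clerk"),
  is_nurse = c(TRUE, FALSE)
)

# Named vector: left-hand name is the column in x (employee),
# right-hand name is the column in y (job_codes)
joined <- employee %>%
  left_join(job_codes, by = c("JOB_CODE" = "job_code"))

# Alternative: rename the key first so both tables share the same name
joined_renamed <- employee %>%
  rename(job_code = JOB_CODE) %>%
  left_join(job_codes, by = "job_code")

print(joined)
print(joined_renamed)
```

With the named vector, the result keeps the key under the name used in x (JOB_CODE here), and employees without a matching job code get NA in the columns coming from job_codes.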