Paste multiple columns into a single column, but drop any NA, blank, or duplicate values

SCD*_*DCE 3 r dplyr

My data looks like this:

dat <- data.frame(SOURCES1 = c("123 Name, 123 Rd, City, State", 
                               "354 Name, 354 Rd, City, State",
                               NA,"",""),
                  SOURCES2 = c("","",
                               "321 Name, 321 Rd, City, State", 
                               "678 Name, 678 Rd, City, State",
                               ""),
                  SOURCES3 = c("","",NA,
                               "678 Name, 678 Rd, City, State", 
                               NA),
                  SOURCES4 = c("","","",NA,NA),
                  SOURCES5 = c("","","",NA,NA))

I am looking for a single column that looks like this:

"123 Name, 123 Rd, City, State"
"354 Name, 354 Rd, City, State"
"321 Name, 321 Rd, City, State"
"678 Name, 678 Rd, City, State"
NA

akr*_*run 5

We can use `coalesce` after converting the blanks ("") to NA:

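library(tidyverse)
dat %>% 
   # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
   mutate_all(~ na_if(as.character(.), '')) %>% 
   # coalesce() picks the first non-NA value across the columns, row by row
   transmute(SOURCE = coalesce(!!! rlang::syms(names(.))))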
#                         SOURCE
#1 123 Name, 123 Rd, City, State
#2 354 Name, 354 Rd, City, State
#3 321 Name, 321 Rd, City, State
#4 678 Name, 678 Rd, City, State
#5                          <NA>   
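To see why this works, here is a minimal sketch on two toy vectors (`a` and `b` are made up for illustration and are not part of the question's data). It assumes `dplyr` is installed:

```r
library(dplyr)

# Hypothetical toy vectors standing in for two SOURCES columns
a <- c("x", "", NA)
b <- c("", "y", "")

# na_if() turns the empty strings into NA and leaves other values untouched
a_na <- na_if(a, "")
b_na <- na_if(b, "")

# coalesce() takes, position by position, the first non-NA value
result <- coalesce(a_na, b_na)
result
```

Because `coalesce` only ever keeps one value per row, a row that repeats the same address in two columns (like row 4 in the question) still yields a single value.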

Or use `invoke` from `purrr`:

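dat %>% 
   # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
   mutate_all(~ na_if(as.character(.), '')) %>% 
   # invoke() is deprecated in recent purrr releases; do.call(coalesce, .) is a base R equivalent
   transmute(SOURCE = invoke(coalesce, .))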
#                         SOURCE
#1 123 Name, 123 Rd, City, State
#2 354 Name, 354 Rd, City, State
#3 321 Name, 321 Rd, City, State
#4 678 Name, 678 Rd, City, State
#5                          <NA>

Or `pmax` from base R:

do.call(pmax, c(lapply(dat, function(x) replace(as.character(x), 
          x=="", NA)), na.rm = TRUE))             
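One caveat with the `pmax` route: it returns the largest string in each row, not the first non-missing one. For the data in the question that makes no difference, since each row holds at most one distinct address. The sketch below uses made-up vectors (`first` and `second` are hypothetical) to show where the two approaches would diverge, using only base R:

```r
# Hypothetical toy vectors where row 1 holds two different values
first  <- c("a", NA, NA)
second <- c("b", "c", NA)

# pmax() keeps the largest string per position, ignoring NA
by_max <- pmax(first, second, na.rm = TRUE)

# A base R stand-in for coalesce(): keep x unless it is NA, then fall back to y
by_first <- Reduce(function(x, y) ifelse(is.na(x), y, x),
                   list(first, second))

by_max
by_first
```

If rows can contain several different non-empty values and the leftmost one should win, the `coalesce`-style version is likely the safer choice.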