Jos*_*e R 6 r paste dataframe na
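I have a data frame with city, state, and country columns. I want to create a concatenated string: "City, State, Country". However, one of my cities has no state (its state is NA). For that city I want the string to be "City, Country". Here is the code that creates the wrong string: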
# define City, State, Country
city <- c("Austin", "Knoxville", "Salk Lake City", "Prague")
state <- c("Texas", "Tennessee", "Utah", NA)
country <- c("United States", "United States", "United States", "Czech Rep")
# create data frame
dff <- data.frame(city, state, country)
# create full string
dff["string"] <- paste(city, state, country, sep=", ")
When I display dff$string, I get the following. Note that the last string contains an unwanted "NA,":
> dff["string"]
string
1 Austin, Texas, United States
2 Knoxville, Tennessee, United States
3 Salk Lake City, Utah, United States
4 Prague, NA, Czech Rep
How can I skip that "NA," along with its sep = ", " separator?
One option is to fix it up after the fact:
gsub("NA, ","",dff$string)
#[1] "Austin, Texas, United States"
#[2] "Knoxville, Tennessee, United States"
#[3] "Salk Lake City, Utah, United States"
#[4] "Prague, Czech Rep"
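One caveat with the gsub fix: the pattern "NA, " matches anywhere in the string, so a value that happens to end in the capital letters "NA" would be clipped as well, and an NA in the last position (which produces ", NA") is not caught. A minimal sketch of a variant that never writes the NA in the first place, assuming state is the only column that can be missing (state_part is a name introduced here for illustration):

```r
city <- c("Austin", "Prague")
state <- c("Texas", NA)
country <- c("United States", "Czech Rep")

# Build the optional "State, " piece first; it is empty where state is missing
state_part <- ifelse(is.na(state), "", paste0(state, ", "))

# Join the pieces; the separator after the state is already inside state_part
string <- paste0(city, ", ", state_part, country)
string
```

Because the missing value is handled before pasting, no later text substitution is needed.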
Alternative #2 is to use apply once you have your data.frame, called dff:
apply(dff, 1, function(x) paste(na.omit(x),collapse=", ") )
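Note that apply works on every column of dff, so if the string column from the question has already been added, it would be pasted in too. A small sketch that names the columns explicitly and stores the result, assuming the question's data (cols is a name introduced here for illustration):

```r
city <- c("Austin", "Knoxville", "Salk Lake City", "Prague")
state <- c("Texas", "Tennessee", "Utah", NA)
country <- c("United States", "United States", "United States", "Czech Rep")
dff <- data.frame(city, state, country, stringsAsFactors = FALSE)

# Columns to combine, named explicitly so any other column is ignored
cols <- c("city", "state", "country")

# For each row, drop the NA entries and join what is left
dff$string <- apply(dff[cols], 1, function(x) paste(na.omit(x), collapse = ", "))
dff$string
```

apply converts the selected columns to a character matrix before iterating, which is harmless here because all three columns are text.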
Late to the party, but unite (from the tidyr package) offers a one-step approach:
library(tidyr)  # provides unite() and re-exports %>%; na.rm needs tidyr >= 1.0.0
dff %>% unite("string", c(city, state, country), sep = ", ", remove = FALSE, na.rm = TRUE)
                               string           city     state       country
1        Austin, Texas, United States         Austin     Texas United States
2 Knoxville, Tennessee, United States      Knoxville Tennessee United States
3 Salk Lake City, Utah, United States Salk Lake City      Utah United States
4                   Prague, Czech Rep         Prague      <NA>     Czech Rep