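(A related question does not cover sorting. paste is easy to use when you don't need to sort.)
I have a less-than-ideally structured table whose character columns have generic names: "item1", "item2", and so on. I would like to create a new character variable that is the alphabetized, comma-separated concatenation of these columns. For example, in row 5, if item1 = "milk", item2 = "eggs", and item3 = "butter", the new variable in row 5 would be "butter, eggs, milk".
I wrote a function f() that works on two character variables. However, I am having trouble applying it with mapply or other "vectorization" (which I know is really just a for loop). Any help is much appreciated.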
df <- data.frame(a =c("foo","bar"),
b= c("baz","qux"))
paste(df$a,df$b, sep=", ")
# returns [1] "foo, baz" "bar, qux" ... but I want [1] "baz, foo" "bar, qux"
f <- function(a,b) paste(c(a,b)[order(c(a,b))],collapse=", ")
f("foo","baz")
# returns [1] "baz, foo" ... which is what I want ... how to vectorize?
df$new_var <- mapply(f, df$a, df$b)
df
#     a   b new_var   <- new_var is not what I want
# 1 foo baz    1, 2
# 2 bar qux    1, 2
# Interestingly, data.table is smart enough to fix my bad mapply
library(data.table)
dt <- data.table(a =c("foo","bar"),
b= c("baz","qux"))
dt[,new_var:=mapply(f, a, b)]
dt
#      a   b  new_var   <- new_var IS what I want
# 1: foo baz baz, foo
# 2: bar qux bar, qux
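One likely explanation for the "1, 2" result, assuming an R version older than 4.0: there, data.frame() converts strings to factors by default, so f() ends up ordering and pasting the integer level codes, while data.table() keeps the columns as character. A minimal sketch under that assumption, which keeps the columns as character with stringsAsFactors = FALSE:

```r
# Keep the columns as character vectors instead of factors
# (this is already the default from R 4.0 onward).
df <- data.frame(a = c("foo", "bar"),
                 b = c("baz", "qux"),
                 stringsAsFactors = FALSE)

# Same idea as f() in the question: sort the two values, then paste them.
f <- function(a, b) paste(sort(c(a, b)), collapse = ", ")

# USE.NAMES = FALSE stops mapply() from naming the result after df$a.
df$new_var <- mapply(f, df$a, df$b, USE.NAMES = FALSE)
df$new_var
```

With character input, mapply() hands f() one string from each column per row, so the sorting happens on the text itself.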
Just apply down the rows:
apply(df,1,function(x){
paste(sort(x),collapse = ",")
})
Wrap it in a function if you like. You have to either specify which columns to pass or assume all of them, i.e. apply(df[,2:3],1,f()...
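The wrapping suggested above could look roughly like this. The helper name sorted_concat, its arguments, and the sample items table are made up for illustration; only apply(), sort(), and paste() come from the answer.

```r
# Hypothetical helper: sort the values of the chosen columns within each row
# and paste them into one string per row.
sorted_concat <- function(data, cols, sep = ", ") {
  # apply() coerces the selected columns to a matrix and calls the
  # anonymous function once per row (MARGIN = 1).
  out <- apply(data[, cols, drop = FALSE], 1,
               function(x) paste(sort(x), collapse = sep))
  unname(out)  # drop any row names carried over by apply()
}

# Made-up example table with generic item columns plus an id column.
items <- data.frame(id    = 1:2,
                    item1 = c("milk", "tea"),
                    item2 = c("eggs", "jam"),
                    item3 = c("butter", "bread"),
                    stringsAsFactors = FALSE)

# Pass only the item columns, so the numeric id is not coerced to text.
item_cols <- grep("^item", names(items), value = TRUE)
items$all_items <- sorted_concat(items, item_cols)
items$all_items
```

Selecting the columns explicitly also keeps a previously added result column from being pasted into itself if the call is run twice.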
sort(x) is the same as x[order(x)]
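A quick check of that equivalence, with one caveat worth knowing: with default arguments the two differ when the vector contains NA, because sort() drops missing values while order() places them last. The vectors x and y are made-up examples.

```r
x <- c("milk", "eggs", "butter")
same_without_na <- identical(sort(x), x[order(x)])  # TRUE for NA-free input

y <- c("milk", NA, "butter")
sorted_y  <- sort(y)       # NA is removed (default na.last = NA)
ordered_y <- y[order(y)]   # NA is kept and placed last (default na.last = TRUE)
```

For rows with missing items, sort() is usually the more convenient choice here, since it avoids pasting a literal "NA" into the result.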
My first thought would have been to do this:
dt[, new_var := paste(sort(.SD), collapse = ", "), by = 1:nrow(dt)]
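A hedged variant of the same by-row idea, assuming the data.table package is installed. Here .SD is flattened with unlist() before sorting, so sort() receives a plain character vector, and .SDcols restricts the work to the named columns. Both are additions for this sketch, not part of the line above.

```r
library(data.table)

dt <- data.table(a = c("foo", "bar"),
                 b = c("baz", "qux"))

# Group by row number so .SD holds exactly one row at a time;
# unlist(.SD) turns that row into a character vector that sort() can handle.
dt[, new_var := paste(sort(unlist(.SD)), collapse = ", "),
   by = seq_len(nrow(dt)),
   .SDcols = c("a", "b")]
dt$new_var
```

Grouping by row is simple to read but runs one group per row, so it may be slow on large tables.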
But you can make your function work with a couple of simple modifications:
f = function(...) paste(c(...)[order(c(...))],collapse=", ")
dt[, new_var := do.call(function(...) mapply(f, ...), .SD)]
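For comparison, a base R sketch of the same variadic approach, with no data.table dependency. The sample data frame and the name call_args are made up for illustration; the idea is to pass every column to mapply() as a separate argument via do.call().

```r
# Made-up example with three generic item columns.
df <- data.frame(item1 = c("milk", "tea"),
                 item2 = c("eggs", "jam"),
                 item3 = c("butter", "bread"),
                 stringsAsFactors = FALSE)

# Variadic version of f(): accepts any number of values for one row.
f <- function(...) paste(sort(c(...)), collapse = ", ")

# Build the argument list for mapply(): the function, one unnamed
# argument per column, and USE.NAMES = FALSE for an unnamed result.
call_args <- c(list(FUN = f), unname(as.list(df)), list(USE.NAMES = FALSE))
df$new_var <- do.call(mapply, call_args)
df$new_var
```

Because f() takes ..., the same call works unchanged for two columns or twenty.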