The*_*Man 10 merge r data.table
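The `suffixes` argument of `merge` only applies to column names that both tables share. Is there any way to extend it to the remaining columns, without manually renaming the columns before merging?
That is: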
df1 <- data.table(
  a = c(1,2,3,4,5,6),
  b = c('a','b','f','e','r','h'),
  d = c('q','l','o','n','q','z')
)
df2 <- data.table(
  a = c(1,2,3,4,5,6),
  d = c('q','l','o','n','q','z')
)
colnames(merge(df1,df2, by = 'a', suffixes = c("1","2")))
#[1] "a" "b" "d1" "d2" what it does
#[1] "a" "b1" "d1" "d2" what I'd like it to do
The way I am handling this at the moment is similar to @mrip's answer.
df1 <- data.table(
  a = c(1,2,3,4,5,6),
  b = c('a','b','f','e','r','h'),
  r = c('a','b','f','e','r','h'),
  d = c('q','l','o','n','q','z')
)
df2 <- data.table(
  a = c(1,2,3,4,5,6),
  c = c('a','b','f','e','r','h'),
  q = c('a','b','f','e','r','h'),
  d = c('q','l','o','n','q','z')
)
dfmerge <- merge(df1, df2, by = c("a"), suffixes = c("1","2"))
setnames(
  dfmerge,
  setdiff(names(df1), names(df2)),
  paste0(setdiff(names(df1), names(df2)), "1")
)
setnames(
  dfmerge,
  setdiff(names(df2), names(df1)),
  paste0(setdiff(names(df2), names(df1)), "2")
)
colnames(dfmerge)
#[1] "a" "b1" "r1" "d1" "c2" "q2" "d2"
mri*_*rip 11
A simple solution:
mrg <- merge(df1, df2, by = 'a', suffixes = c("1","2"))
# Append "1" to the columns that exist only in df1
setnames(mrg, paste0(names(mrg), ifelse(names(mrg) %in% setdiff(names(df1), names(df2)), "1", "")))
# Append "2" to the columns that exist only in df2
setnames(mrg, paste0(names(mrg), ifelse(names(mrg) %in% setdiff(names(df2), names(df1)), "2", "")))
names(mrg)
#[1] "a" "b1" "d1" "d2"
Edit: thanks to Ricardo Saporta for the comments that cleaned this up considerably and taught me a few new tricks!
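The two `setnames` calls above can also be wrapped in a small helper. This is only a sketch under assumptions: `suffix_all` is a hypothetical name (it is not part of data.table), the tables are shortened versions of the question's `df1` and `df2`, and it assumes the suffixed names do not collide with names that already exist.

```r
library(data.table)

df1 <- data.table(a = 1:3, b = c("a", "b", "f"), d = c("q", "l", "o"))
df2 <- data.table(a = 1:3, d = c("q", "l", "o"))

# Hypothetical helper: merge, then suffix the columns that merge() left alone
suffix_all <- function(x, y, by, suffixes = c("1", "2")) {
  mrg <- merge(x, y, by = by, suffixes = suffixes)
  only_x <- setdiff(names(x), names(y))  # columns found only in x
  only_y <- setdiff(names(y), names(x))  # columns found only in y
  # setnames() renames the merged result in place; x and y are never touched
  if (length(only_x)) setnames(mrg, only_x, paste0(only_x, suffixes[1]))
  if (length(only_y)) setnames(mrg, only_y, paste0(only_y, suffixes[2]))
  mrg
}

res <- suffix_all(df1, df2, by = "a")
names(res)
#[1] "a"  "b1" "d1" "d2"
```

Because only the merged result is renamed, the inputs keep their original column names throughout.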
Try the following:
colnames(
mergeWithSuffix(df1,df2, by = 'a', suffixes = c("1","2"))
)
[1] "a" "b.1" "d.1" "d.2"
Note that the original data.frames are left unchanged:
colnames(df1)
[1] "a" "b" "d"
colnames(df2)
[1] "a" "d"
The functions are as follows:
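library(data.table)

mergeWithSuffix <- function(x, y, by, suffixes = NULL, ...) {
  # Add suffixes to every non-key column (by reference)
  mkSuffix(x, suffixes[[1]], merge.col = by)
  mkSuffix(y, suffixes[[2]], merge.col = by)
  # Merge; the non-key names are already distinct, so merge() needs no suffixes
  ret <- merge(x, y, by = by, ...)
  # Remove suffixes so the inputs get their original names back
  undoSuffix(x, suffixes[[1]], merge.col = by)
  undoSuffix(y, suffixes[[2]], merge.col = by)
  return(ret)
}

mkSuffix <- function(x, sfx, sep = ".", merge.col = NULL) {
  nms <- setdiff(names(x), merge.col)
  # Use the sep argument rather than a hard-coded "."
  setnames(x, nms, paste(nms, sfx, sep = sep))
}

undoSuffix <- function(x, sfx, sep = ".", merge.col = NULL) {
  nms <- setdiff(names(x), merge.col)
  ending <- paste0(sep, sfx)
  # Strip the literal ending; inside a regex the "." would match any character
  hit <- endsWith(nms, ending)
  stripped <- nms
  stripped[hit] <- substr(nms[hit], 1L, nchar(nms[hit]) - nchar(ending))
  setnames(x, nms, stripped)
}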
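One caveat of renaming the inputs in place: if `merge()` stops with an error, the suffix removal never runs and the caller's tables keep the suffixed names. Below is a minimal sketch of a variant that restores the names through `on.exit()`. The name `merge_suffix_safe` and the example tables are assumptions made for illustration; they come neither from the answer nor from data.table.

```r
library(data.table)

# Hypothetical variant: same idea, but the original names are always restored
merge_suffix_safe <- function(x, y, by, suffixes = c("1", "2"), sep = ".", ...) {
  # copy() so the saved names cannot be altered by the in-place renames below
  x_old <- copy(names(x))
  y_old <- copy(names(y))
  # Runs on normal return and on error alike
  on.exit({ setnames(x, x_old); setnames(y, y_old) }, add = TRUE)
  # Suffix everything except the merge columns
  setnames(x, ifelse(x_old %in% by, x_old, paste(x_old, suffixes[1], sep = sep)))
  setnames(y, ifelse(y_old %in% by, y_old, paste(y_old, suffixes[2], sep = sep)))
  merge(x, y, by = by, ...)
}

df1 <- data.table(a = 1:3, b = c("a", "b", "f"), d = c("q", "l", "o"))
df2 <- data.table(a = 1:3, d = c("q", "l", "o"))

res <- merge_suffix_safe(df1, df2, by = "a")
names(res)
#[1] "a"   "b.1" "d.1" "d.2"

# A failing merge (no such key column) still leaves the inputs untouched
failed <- tryCatch(merge_suffix_safe(df1, df2, by = "no_such_col"),
                   error = function(e) NULL)
names(df1)
#[1] "a" "b" "d"
```

The merged result is a new object, so restoring the input names afterwards does not affect it.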
Note that `setnames` works by reference, so the overhead is almost negligible. And, as discussed elsewhere, this works equally well on data.frames and data.tables.
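To make the by-reference point concrete, here is a small sketch that compares the object's address before and after renaming, and then applies `setnames` to a plain data.frame. The objects `dt` and `df` are made up for this illustration.

```r
library(data.table)

dt <- data.table(a = 1:3, b = letters[1:3])
before <- address(dt)          # memory address of the table
setnames(dt, "b", "b.1")       # rename in place, no copy of the data
after <- address(dt)
same_object <- identical(before, after)
same_object
#[1] TRUE

# setnames() also accepts a plain data.frame
df <- data.frame(a = 1:3, b = letters[1:3])
setnames(df, "b", "b.1")
names(df)
#[1] "a"   "b.1"
```

By contrast, `names(df) <- ...` may copy the object, which is why the answer relies on `setnames` for the temporary renames.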