How can I merge two similar data frames when one of them takes priority?
For example:
Data frame 1
Date Col1 Col2
jan 2 1
feb 4 2
march 6 3
april 8 NA
Data frame 2
Date Col2 Col3
jan 9 10
feb 8 20
march 7 30
april 6 40
Merge these by Date, with data frame 1 taking priority and data frame 2 filling in the blanks:
DataframeMerge
Date Col1 Col2 Col3
jan 2 1 10
feb 4 2 20
march 6 3 30
april 8 6 40
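To make the example reproducible, the two tables above could be built roughly like this. This is only a sketch: the names df1 and df2 are assumed (the answers below use several different names), and Date is kept as a plain character column.

```r
# Data frame 1: takes priority; note the NA in Col2 for april
df1 <- data.frame(
  Date = c("jan", "feb", "march", "april"),
  Col1 = c(2, 4, 6, 8),
  Col2 = c(1, 2, 3, NA),
  stringsAsFactors = FALSE
)

# Data frame 2: only used to fill gaps and to supply Col3
df2 <- data.frame(
  Date = c("jan", "feb", "march", "april"),
  Col2 = c(9, 8, 7, 6),
  Col3 = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

print(df1)
print(df2)
```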
Edit - solution
# Columns present in both data frames, excluding the join column
# ("key" here; it would be "Date" for the example above)
commonNames <- setdiff(intersect(names(df1), names(df2)), "key")
dfmerge <- merge(df1, df2, by = "key", all = TRUE)
for (i in commonNames) {
left <- paste0(i, ".x")   # column that came from df1
right <- paste0(i, ".y")  # column that came from df2
# fill the gaps in df1's column with df2's values
missing <- is.na(dfmerge[[left]])
dfmerge[missing, left] <- dfmerge[missing, right]
dfmerge[right] <- NULL
colnames(dfmerge)[colnames(dfmerge) == left] <- i
}
42-*_*42- 13
merdat <- merge(dfrm1, dfrm2, by="Date") # seems self-documenting
# explanation for next line in text below.
merdat$Col2.x[ is.na(merdat$Col2.x) ] <- merdat$Col2.y[ is.na(merdat$Col2.x) ]
Then just rename 'merdat$Col2.x' to 'merdat$Col2' and drop 'merdat$Col2.y'. Col2.x is the column that came from data frame 1, so its values are kept and data frame 2 is only used where data frame 1 has NA.
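In reply to a request for more comments: one way to update only parts of a vector is to construct a logical vector for indexing and apply it with "[" on both sides of the assignment. Another way is to use a logical vector only on the LHS of the assignment, and then build the RHS with rep() so that it has length sum(logical.vector). The goal is for both sides to have the same length (and order) as the items being replaced.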
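A small sketch of those two approaches, using toy vectors x and y (made-up values standing in for two overlapping columns, not taken from the merged data):

```r
# Toy vectors (hypothetical values)
x <- c(1, 2, 3, NA)
y <- c(9, 8, 7, 6)

# Logical vector marking the positions to replace
missing <- is.na(x)

# Approach 1: the same logical index on both sides of the assignment
x1 <- x
x1[missing] <- y[missing]

# Approach 2: logical index on the LHS only; the RHS is built with rep()
# so that it has exactly sum(missing) elements
x2 <- x
x2[missing] <- rep(0, sum(missing))

print(x1)
print(x2)
```

In both cases the right-hand side has exactly as many elements as there are TRUE values in the index, so nothing gets recycled by accident.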
Aru*_*run 10
Update using the on= argument available from data.table v1.9.6 (which allows ad hoc joins):
setDT(df1)[df2, `:=`(Col2 = ifelse(is.na(Col2), i.Col2, Col2),
Col3 = i.Col3), on="Date"][]
Here is a data.table solution. Make sure the Date columns of df1 and df2 are factors with the expected levels (ordering).
require(data.table)
dt1 <- data.table(df1, key="Date")
dt2 <- data.table(df2, key="Date")
# Col2 refers to dt1's Col2; i.Col2 and i.Col3 refer to dt2's columns
dt1[dt2, `:=`(Col3 = i.Col3,
Col2 = ifelse(is.na(Col2), i.Col2, Col2))]
# the result is stored in dt1
> dt1
# Date Col1 Col2 Col3
# 1: jan 2 1 10
# 2: feb 4 2 20
# 3: march 6 3 30
# 4: april 8 6 40
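On the point about factor levels: a plain character Date column sorts alphabetically, while a factor with explicit levels sorts in level order. A minimal base R sketch, assuming the month labels from the question and a made-up data frame df whose rows are deliberately out of order:

```r
# Calendar order we want the months to follow
month_levels <- c("jan", "feb", "march", "april")

# Hypothetical data with rows out of calendar order
df <- data.frame(
  Date = c("march", "jan", "april", "feb"),
  Col1 = c(6, 2, 8, 4),
  stringsAsFactors = FALSE
)

# As character, sorting is alphabetical
alphabetical <- sort(df$Date)

# As a factor with explicit levels, sorting follows the level order
df$Date <- factor(df$Date, levels = month_levels)
calendar <- as.character(sort(df$Date))
sorted_df <- df[order(df$Date), ]

print(alphabetical)
print(calendar)
print(sorted_df)
```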
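Here is a dplyr solution, thanks to @docendo discimus. Note that in this example the values from df2 take priority and df1 only fills the gaps; swap x1.x and x1.y inside ifelse() to give df1 priority instead.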
df1 <- data.frame(y = c("A", "B", "C", "D"), x1 = c(1,2,NA, 4))
y x1
1 A 1
2 B 2
3 C NA
4 D 4
df2 <- data.frame(y = c("A", "B", "C"), x1 = c(5, 6, 7))
y x1
1 A 5
2 B 6
3 C 7
dplyr
left_join(df1, df2, by="y") %>%
transmute(y, x1 = ifelse(is.na(x1.y), x1.x, x1.y))
y x1
1 A 5
2 B 6
3 C 7
4 D 4
Consider this example:
> d1 <- data.frame(x=1:4, a=2:5, b=c(3,4,5,NA))
> d1
x a b
1 1 2 3
2 2 3 4
3 3 4 5
4 4 5 NA
> d2 <- data.frame(x=1:4, b=c(6,7,8,9), c=11:14)
> d2
x b c
1 1 6 11
2 2 7 12
3 3 8 13
4 4 9 14
Now use merge and within, together with ifelse:
> within(merge(d1, d2, by="x"), {b <- ifelse(is.na(b.x),b.y,b.x); b.x <- NULL; b.y <- NULL})
x a c b
1 1 2 11 3
2 2 3 12 4
3 3 4 13 5
4 4 5 14 9
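If several columns overlap, the same merge plus ifelse idea can be wrapped in a loop. The helper below is only a sketch; merge_with_priority and its argument names are made up for illustration and are not part of base R or any package. It relies on merge() adding its default ".x" and ".y" suffixes to the shared columns.

```r
# Hypothetical helper: merge on `by`, preferring values from `primary`
# and using `secondary` only where `primary` is NA.
merge_with_priority <- function(primary, secondary, by) {
  # Columns that exist in both inputs, excluding the join column(s)
  shared <- setdiff(intersect(names(primary), names(secondary)), by)
  merged <- merge(primary, secondary, by = by, all = TRUE)
  for (col in shared) {
    from_primary <- merged[[paste0(col, ".x")]]
    from_secondary <- merged[[paste0(col, ".y")]]
    # Keep the primary value unless it is missing
    merged[[col]] <- ifelse(is.na(from_primary), from_secondary, from_primary)
    # Drop the suffixed copies created by merge()
    merged[[paste0(col, ".x")]] <- NULL
    merged[[paste0(col, ".y")]] <- NULL
  }
  merged
}

d1 <- data.frame(x = 1:4, a = 2:5, b = c(3, 4, 5, NA))
d2 <- data.frame(x = 1:4, b = c(6, 7, 8, 9), c = 11:14)
result <- merge_with_priority(d1, d2, by = "x")
print(result)
```

One caveat: ifelse() drops attributes such as factor levels or the Date class, so this sketch is best suited to plain numeric or character columns.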