gol*_*ine 4 merge r compound-key dataframe
Suppose I have the following data frames:
DF1 <- data.frame("A" = rep(c("A","B"), 18),
"B" = rep(c("C","D","E"), 12),
"NUM"= rep(rnorm(36,10,1)),
"TEST" = rep(NA,36))
DF2 <- data.frame("A" = rep("A",6),
"B" = rep(c("C","D"),6),
"VAL" = rep(c(1,3),3))
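*Note: each unique combination of the variables A and B in DF2 should have a unique VAL.
For each row of DF1, I would like to replace the NA in TEST with the corresponding VAL from DF2 if the values in column A of both data frames match and the values in column B of both data frames match for that row. Otherwise, I would leave TEST as NA. How would I do this without looping over each combination using match?
Ideally, an answer would scale to two data frames with many columns to match on.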
RHA*_*RHA 10
# this is your DF1
DF1 <- data.frame("A" = rep(c("A","B"), 18),
                  "B" = rep(c("C","D","E"), 12),
                  "NUM" = rnorm(36,10,1),
                  "TEST" = rep(NA,36))
# this is a DF2 I created, with unique A, B, VAL
DF2 <- data.frame("A" = rep(c("A","B"),3),
                  "B" = rep(c("C","D","E"),2),
                  "VAL" = 1:6)
# and this is the answer of what I assume you want
DF1$id <- seq_len(nrow(DF1))   # merge() reorders rows, so remember the original order
tmp <- merge(DF1, DF2, by=c("A","B"), all.x=TRUE, all.y=FALSE)
tmp <- tmp[order(tmp$id), ]    # restore DF1's row order before copying VAL across
DF1$TEST <- tmp$VAL
DF1$id <- NULL
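For the "many columns to match on" case raised in the question, the same merge can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: `fill_from_lookup`, its argument names, and the temporary `.row_id` column are hypothetical names introduced here, and it assumes the lookup table has at most one row per key combination.

```r
# Hypothetical helper: copy `value_col` from `lookup` into `target_col` of `df`,
# matching on every column named in `key_cols`.
# Assumes `lookup` has at most one row per key combination.
fill_from_lookup <- function(df, lookup, key_cols, value_col, target_col) {
  df$.row_id <- seq_len(nrow(df))              # remember the original row order
  left   <- df[c(".row_id", key_cols)]
  right  <- lookup[c(key_cols, value_col)]
  merged <- merge(left, right, by = key_cols, all.x = TRUE)
  merged <- merged[order(merged$.row_id), ]    # merge() does not keep row order
  df[[target_col]] <- merged[[value_col]]
  df$.row_id <- NULL
  df
}

set.seed(1)
DF1 <- data.frame(A = rep(c("A", "B"), 18),
                  B = rep(c("C", "D", "E"), 12),
                  NUM = rnorm(36, 10, 1),
                  TEST = NA)
DF2 <- data.frame(A = rep(c("A", "B"), 3),
                  B = rep(c("C", "D", "E"), 2),
                  VAL = 1:6)

DF1 <- fill_from_lookup(DF1, DF2, key_cols = c("A", "B"),
                        value_col = "VAL", target_col = "TEST")
head(DF1)
```

With more key columns, only the `key_cols` vector should need to change. If the lookup table contains duplicate key combinations, the merge returns more rows than `df` has and the assignment fails, so the lookup should be reduced to unique keys first.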
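As Akrun mentioned in the comments, your lookup table (DF2) needs to be reduced to its unique A/B combinations. For your current data frame this is not a problem, but if the same combination can have several possible values, you will need an additional rule to choose between them. From there, the solution is simple: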
DF2.u <- unique(DF2)
DF3 <- merge(DF1, DF2.u, all.x = TRUE)  # left join: keep every row of DF1
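Whether `unique()` alone is enough depends on whether any A/B combination carries more than one VAL. A quick check along the following lines can tell. This is a sketch using the question's DF2; the name `conflicts` is introduced here for illustration.

```r
# DF2 as defined in the question (data.frame() recycles the shorter columns to 12 rows)
DF2 <- data.frame(A = rep("A", 6),
                  B = rep(c("C", "D"), 6),
                  VAL = rep(c(1, 3), 3))

# Distinct (A, B, VAL) rows
DF2.u <- unique(DF2[c("A", "B", "VAL")])

# Key combinations that still appear more than once map to several VALs
dup_keys <- duplicated(DF2.u[c("A", "B")]) |
  duplicated(DF2.u[c("A", "B")], fromLast = TRUE)
conflicts <- DF2.u[dup_keys, ]

if (nrow(conflicts) > 0) {
  print(conflicts)   # these keys need an extra rule before merging
} else {
  message("Each A/B combination has exactly one VAL")
}
```

If `conflicts` is non-empty, a rule such as keeping the first value or averaging would have to be chosen before the merge; which rule is appropriate depends on the data.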
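Note that this produces a new data frame with an empty TEST column (all values NA) and a VAL column taken from DF2. To do what you actually want (replace TEST with VAL wherever possible), here is some slightly clunky code: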
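DF1 <- merge(DF1[c("A", "B", "NUM")], DF2.u, all.x = TRUE)  # rows come back sorted by A, B
names(DF1)[names(DF1) == "VAL"] <- "TEST"                    # so keep the merged frame and rename VAL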
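An alternative that leaves DF1's row order untouched is a single vectorised `match()` on a pasted key, with no loop involved. This is a sketch rather than the answerer's method: the names `keys`, `key1`, and `key2` and the `"\r"` separator are choices made here, and it assumes the separator never occurs in the key columns.

```r
set.seed(1)
DF1 <- data.frame(A = rep(c("A", "B"), 18),
                  B = rep(c("C", "D", "E"), 12),
                  NUM = rnorm(36, 10, 1),
                  TEST = NA)
# Lookup table already reduced to unique A/B combinations
DF2.u <- data.frame(A = c("A", "A"), B = c("C", "D"), VAL = c(1, 3))

keys <- c("A", "B")                                  # columns to match on
key1 <- do.call(paste, c(DF1[keys], sep = "\r"))     # one string per DF1 row
key2 <- do.call(paste, c(DF2.u[keys], sep = "\r"))   # one string per lookup row

# match() returns NA where a key is absent, so unmatched rows stay NA
DF1$TEST <- DF2.u$VAL[match(key1, key2)]
head(DF1)
```

Extending this to more key columns only requires adding their names to `keys`.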
EDIT: To answer your question, you can boil DF2 down where necessary quite simply:
DF2$C <- 1:12  # with this extra column, unique(DF2) no longer collapses the rows
DF2.u <- unique(DF2[1:3])
DF2.u
#   A B VAL
# 1 A C   1
# 2 A D   3