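I have two long lists, A and B, which have the same length but contain different numbers of equivalent elements:
List A can contain many elements, and these can also recur within the same field.
List B contains either exactly one element or an empty field, i.e. character(0).
A also contains some empty fields, but for those records B always contains an element, so there is no record where both A and B are empty.
I want to combine the elements of A and B into a new list C of the same length, such that every element of A is kept (including repeats within a field) and the element of B is added only if it is not already present in the corresponding field of A.
Here is an example of how these lists start: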
> A
[1] "JAMES" "JAMES"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[4] character(0)
...
> B
[1] "RICHARD"
[2] "JOHN"
[3] character(0)
[4] "CHARLES"
...
This is the output I am looking for:
> C
[1] "JAMES" "JAMES" "RICHARD"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[4] "CHARLES"
...
I tried, for example:
C <- sapply(mapply(union, A,B), setdiff, character(0))
Unfortunately, this drops the repeated elements of A:
> C
[1] "JAMES" "RICHARD"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "DAVID"
[4] "CHARLES"
...
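The loss of repeats follows from union() treating its arguments as sets, so duplicates inside a single element of A are removed too. A minimal sketch on the first pair of elements (the names a1 and b1 are used only for illustration):

```r
a1 <- c("JAMES", "JAMES")
b1 <- "RICHARD"

u <- union(a1, b1)  # set semantics: repeated "JAMES" is collapsed
k <- c(a1, b1)      # plain concatenation: repeats are preserved

print(u)
print(k)
```

Plain concatenation keeps the repeats, but on its own it would also add B's element when it is already present in A.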
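Can someone tell me how to combine these two lists, keeping the repeats in A, so that I get my desired output?
Many thanks in advance!
Update: machine-readable data: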
A <- list(c("JAMES","JAMES"),
c("JOHN","ROBERT"),
c("WILLIAM","MICHAEL","WILLIAM","DAVID","WILLIAM"),
character(0))
B <- list("RICHARD","JOHN",character(0),"CHARLES")
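You were close with mapply(). I got the desired output by using c() to concatenate the elements of lists A and B, but I had to manipulate the elements of the supplied vectors, so I came up with this: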
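foo <- function(...) {
    l1 <- length(..1)   # length of the current element of A
    l2 <- length(..2)   # length of the current element of B (0 or 1)
    out <- character(0)
    if (l1 > 0) {
        if (l2 > 0) {
            # append B's element only if it is not already in A
            out <- if (..2 %in% ..1)
                ..1
            else
                c(..1, ..2)
        } else {
            out <- ..1
        }
    } else {
        # A is empty: fall back to B (which may itself be character(0))
        out <- ..2
    }
    out
}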
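We can refer to the individual elements of ... using the ..n placeholders; ..1 is A and ..2 is B. Of course, foo() only works with two lists, but for simplicity it neither enforces this nor does any checking. foo() also needs to handle the cases where A or B, or both, are character(0), which I think foo() now does.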
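As a small illustration of the ..n placeholders, here is a sketch with a hypothetical helper, show_dots(), which is not part of the solution and exists only to show what ..1 and ..2 refer to:

```r
# Hypothetical helper, for illustration only
show_dots <- function(...) {
    # ..1 is the first argument passed via ..., ..2 is the second
    list(first = ..1, second = ..2, n = length(list(...)))
}

res <- show_dots(c("JAMES", "JAMES"), "RICHARD")
str(res)
```

When such a function is passed to mapply(), ..1 and ..2 are the current elements of the first and second list respectively.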
When we use this in a mapply() call, I get:
> mapply(foo, A, B)
[[1]]
[1] "JAMES" "JAMES" "RICHARD"
[[2]]
[1] "JOHN" "ROBERT"
[[3]]
[1] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[[4]]
[1] "CHARLES"
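One caveat worth keeping in mind: mapply() simplifies its result when it can, so the list shown above comes back as a list only because the combined elements have different lengths. A minimal sketch with made-up inputs (A2 and B2 are assumptions for illustration) shows the difference that SIMPLIFY = FALSE makes:

```r
A2 <- list("JAMES", "JOHN")
B2 <- list("RICHARD", "ROBERT")

# Every result has length 2, so mapply() simplifies to a character matrix
simplified <- mapply(c, A2, B2)

# SIMPLIFY = FALSE always returns a list, one element per input pair
as_list <- mapply(c, A2, B2, SIMPLIFY = FALSE)

print(is.matrix(simplified))
print(is.list(as_list))
```

If a list is required regardless of the data, passing SIMPLIFY = FALSE (or using Map()) is the safer choice.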
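An lapply() version may make more sense than the abstract ..n, but it uses essentially the same code. Here is a new function that works with A and B directly; we iterate over the indices of the elements of A (1, 2, 3, ..., length(A)) as generated by seq_along():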
foo2 <- function(ind, A, B) {
    l1 <- length(A[[ind]])
    l2 <- length(B[[ind]])
    out <- character(0)
    if (l1 > 0) {
        if (l2 > 0) {
            out <- if (B[[ind]] %in% A[[ind]]) {
                A[[ind]]
            } else {
                c(A[[ind]], B[[ind]])
            }
        } else {
            out <- A[[ind]]
        }
    } else {
        out <- B[[ind]]
    }
    out
}
This is called as:
> lapply(seq_along(A), foo2, A = A, B = B)
[[1]]
[1] "JAMES" "JAMES" "RICHARD"
[[2]]
[1] "JOHN" "ROBERT"
[[3]]
[1] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[[4]]
[1] "CHARLES"
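For comparison, the same rule can probably be written more compactly. This is an alternative sketch rather than the approach above: it assumes each element of B has length 0 or 1, and it uses setdiff() to drop B's element when A already contains it.

```r
A <- list(c("JAMES", "JAMES"),
          c("JOHN", "ROBERT"),
          c("WILLIAM", "MICHAEL", "WILLIAM", "DAVID", "WILLIAM"),
          character(0))
B <- list("RICHARD", "JOHN", character(0), "CHARLES")

# Keep all of a (repeats included); append b only if it is not already in a.
# setdiff(b, a) is character(0) when b is empty or already present in a.
C <- Map(function(a, b) c(a, setdiff(b, a)), A, B)

expected <- list(c("JAMES", "JAMES", "RICHARD"),
                 c("JOHN", "ROBERT"),
                 c("WILLIAM", "MICHAEL", "WILLIAM", "DAVID", "WILLIAM"),
                 "CHARLES")
print(identical(unname(C), expected))
```

Map() always returns a list, so there is no simplification to worry about, and the empty cases need no special branches because concatenating with character(0) leaves a vector unchanged.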