How to summarize a list of combinations

kab*_*cha 8 r cluster-analysis list

I have a list of two-element combinations, as shown below.

cbnl <- list(
  c("A", "B"), c("B", "A"), c("C", "D"), c("E", "D"), c("F", "G"), c("H", "I"),
  c("J", "K"), c("I", "H"), c("K", "J"), c("G", "F"), c("D", "C"), c("E", "C"),
  c("D", "E"), c("C", "E")
)

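I want to summarize the list above. The expected result is shown below. The order of the elements within each vector does not matter here.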

[[1]]
[1] "A" "B"

[[2]]
[1] "C" "D" "E"

[[3]]
[1] "F" "G"

[[4]]
[1] "H" "I"

[[5]]
[1] "J" "K"

(Rule 1) {A, B} is equivalent to {B, A}. To handle this, I think I can do the following:

cbnl <- unique(lapply(cbnl, function(i) { sort(i) }))

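(Rule 2) If two sets such as {A, B} and {B, C} share an element, take their union. The result is {A, B, C}. I don't have a clear idea of how to do this.

Is there an efficient way to do this?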

may*_*din 7

I know this answer is more traditional programming than "R-like", but it solves the problem.

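# Rule 1: sort each pair so that c("B", "A") and c("A", "B") collapse into one entry
cbnl <- unique(lapply(cbnl, sort))

i <- 1       # index into cbnl
count <- 1   # index into out
out <- list()

# Compare each entry with the one right after it (only adjacent entries are checked;
# the loop stops before the last entry, which is kept only if it was merged)
while (i <= length(cbnl) - 1) {
  if (sum(cbnl[[i]] %in% cbnl[[i + 1]]) == 0) {
    # no shared element: keep the entry as-is
    out[[count]] <- cbnl[[i]]
  } else {
    # shared element: store the union and skip the entry that was just merged
    out[[count]] <- sort(unique(c(cbnl[[i]], cbnl[[i + 1]])))
    i <- i + 1
  }
  count <- count + 1
  i <- i + 1
}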

out

This gives:

[[1]]
[1] "A" "B"

[[2]]
[1] "C" "D" "E"

[[3]]
[1] "F" "G"

[[4]]
[1] "H" "I"

[[5]]
[1] "J" "K"
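
The loop above compares only neighbouring entries, which happens to be enough for this input. As a hedged sketch for inputs where overlapping pairs are not adjacent, each vector can instead be merged with every group collected so far that shares an element with it. The helper name merge_overlapping is an assumption made for illustration and is not part of the original answer.

```r
cbnl <- list(
  c("A", "B"), c("B", "A"), c("C", "D"), c("E", "D"), c("F", "G"), c("H", "I"),
  c("J", "K"), c("I", "H"), c("K", "J"), c("G", "F"), c("D", "C"), c("E", "C"),
  c("D", "E"), c("C", "E")
)

# Hypothetical helper (not from the original answer).
# Invariant: the groups stored in `out` are always pairwise disjoint.
merge_overlapping <- function(sets) {
  out <- list()
  for (s in sets) {
    # which existing groups share at least one element with `s`?
    hit <- vapply(out, function(o) any(s %in% o), logical(1))
    # union `s` with all of them, then replace those groups by the merged one
    s <- sort(unique(c(s, unlist(out[hit]))))
    out <- c(out[!hit], list(s))
  }
  # order the groups by their first element for a stable result
  out[order(vapply(out, `[`, character(1), 1))]
}

result <- merge_overlapping(cbnl)
result

# A case with a non-adjacent overlap: c("B", "C") links the first two pairs
chained <- merge_overlapping(list(c("A", "B"), c("C", "D"), c("B", "C")))
chained
```

Because every new vector is checked against all groups collected so far, the result does not depend on overlapping pairs sitting next to each other in the list.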


Tho*_*ing 6

You can try the following igraph option:

library(igraph)

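graph_from_data_frame(do.call(rbind, cbnl)) %>%  # each pair becomes an edge
  components() %>%                               # find the connected components
  membership() %>%                               # component id for each vertex
  stack() %>%                                    # data frame with columns values and ind
  with(., split(as.character(ind), values))      # vertex names grouped by component id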

This gives:

$`1`
[1] "A" "B"

$`2`
[1] "C" "E" "D"

$`3`
[1] "F" "G"

$`4`
[1] "H" "I"

$`5`
[1] "J" "K"

A shorter one:

graph_from_data_frame(do.call(rbind, cbnl)) %>%
  decompose() %>%
  Map(function(x) names(V(x)), .)

This gives:

[[1]]
[1] "A" "B"

[[2]]
[1] "C" "E" "D"

[[3]]
[1] "F" "G"

[[4]]
[1] "H" "I"

[[5]]
[1] "J" "K"
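
Both igraph snippets treat each pair as an edge and return the connected components of the resulting graph. Where igraph is not available, the same idea can be sketched in base R with a small union-find structure. The function name connected_groups is hypothetical, and the sketch assumes every element of the list is a character vector of length 2.

```r
cbnl <- list(
  c("A", "B"), c("B", "A"), c("C", "D"), c("E", "D"), c("F", "G"), c("H", "I"),
  c("J", "K"), c("I", "H"), c("K", "J"), c("G", "F"), c("D", "C"), c("E", "C"),
  c("D", "E"), c("C", "E")
)

# Hypothetical helper: connected components via union-find (assumes length-2 pairs).
connected_groups <- function(pairs) {
  labels <- sort(unique(unlist(pairs)))
  parent <- setNames(labels, labels)  # every label starts as its own root

  # follow parent links until a label that is its own parent is reached
  find_root <- function(x) {
    while (parent[[x]] != x) x <- parent[[x]]
    x
  }

  for (p in pairs) {
    root_a <- find_root(p[1])
    root_b <- find_root(p[2])
    # different roots mean different components: link one tree under the other
    if (root_a != root_b) parent[[root_b]] <- root_a
  }

  # labels with the same root belong to the same group
  roots <- vapply(labels, find_root, character(1))
  unname(split(labels, roots))
}

groups <- connected_groups(cbnl)
groups
```

This sketch omits path compression and union by rank, which keeps it short but makes it slower than a tuned implementation on long chains.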


jay*_*.sf 5

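Base R: do the union-ing in the FUN= of combn, then partially fill a matrix according to the unique elements u, drop the columns that consist entirely of duplicated entries, and finally coerce with as.list.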

u <- Reduce(union, cbnl)  ## get unique elements

res <- combn(cbnl, 2, \(x) {
  if (length(intersect(x[[1]], x[[2]])) > 0) {
    union(x[[1]], x[[2]])
  } else {
    el(x)
  }
}, simplify=FALSE) |>
  unique() |>
  (\(x) sapply(x, \(i) replace(rep(NA, length(u)), match(i, u), i)))() |>
  (\(x) x[, !colSums(duplicated(x, MARGIN=1:2)) == nrow(x)])() |>
  (\(x) unname(lapply(as.list(as.data.frame(x)), \(x) x[!is.na(x)])))()

res
# [[1]]
# [1] "A" "B"
# 
# [[2]]
# [1] "C" "D" "E"
# 
# [[3]]
# [1] "F" "G"
# 
# [[4]]
# [1] "H" "I"
# 
# [[5]]
# [1] "J" "K"

Note:

> R.version.string
[1] "R version 4.1.2 (2021-11-01)"
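
Since the question says the order of the elements does not matter, the outputs shown in the answers can be compared after normalizing them. The following is a minimal sketch that assumes the groups are non-empty and disjoint. normalize_groups is a hypothetical helper, and igraph_result is typed in by hand from the igraph output shown earlier rather than computed.

```r
# Hypothetical helper: drop list names, sort within each group,
# then order the groups by their first element.
normalize_groups <- function(groups) {
  groups <- lapply(unname(groups), function(g) sort(as.character(g)))
  groups[order(vapply(groups, `[`, character(1), 1))]
}

expected <- list(
  c("A", "B"), c("C", "D", "E"), c("F", "G"), c("H", "I"), c("J", "K")
)

# Copied by hand from the igraph output above (named list, "C" "E" "D" order)
igraph_result <- list(
  `1` = c("A", "B"), `2` = c("C", "E", "D"), `3` = c("F", "G"),
  `4` = c("H", "I"), `5` = c("J", "K")
)

same <- identical(normalize_groups(igraph_result), normalize_groups(expected))
same
```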