如何使用不会重叠的 ggplot2 和 R 将一系列坐标绘制为矩形?

jak*_*115 4 r ggplot2

我最近问了一个关于如何从一系列坐标制作矩形的问题,链接在这里

答案是完美的,让我可以很好地生成矩形:

# Sample data
plot.data <- data.frame(start.points=c(5, 32),
                        end.points=c(15, 51), 
                        text.label=c("Sample A", "Sample B"))
plot.data$text.position <- (plot.data$start.points + plot.data$end.points)/2

# Plot using ggplot
library(ggplot2)
p <- ggplot(plot.data)
p + geom_rect(aes(xmin=start.points, xmax=end.points, ymin=0, ymax=3), 
              fill="yellow") + 
  theme_bw() + geom_text(aes(x=text.position, y=1.5, label=text.label)) + 
  labs(x=NULL, y=NULL)
Run Code Online (Sandbox Code Playgroud)

然而,我意识到我的数据经常有重叠的坐标,我希望能够在不清除重叠的跨度的情况下可视化每个单独的跨度。所以,让我们以此作为示例数据集:2-3, 5-10, 7-10

当前代码将给出如下内容:

    ----    -----------------             
----|  |----|               |-------------
    ----    -----------------             
Run Code Online (Sandbox Code Playgroud)

但是,我想以某种方式更改代码,以便在新轨道上显示重叠数据:

    ----    -----------------             
----|  |----|               |-------------
    ----    -----------------             

                -------------             
----------------|           |---------
                -------------             
Run Code Online (Sandbox Code Playgroud)

抱歉愚蠢的 ASCII 艺术!

有人有建议吗?如果最简单的话,我不会反对独立生成多个图像然后堆叠它们。谢谢!

Rob*_*ski 5

您可以手动计算非重叠间隔的序列,并相应地将矩形隔开。这是intervals包裹:(请注意,我们假设您的积分是按顺序排列的start.points——这很容易做到)

library(intervals)
plot.data <- data.frame(start.points = c(1,2,4,6,8,11), end.points = c(3,5,9,10,12,13),
                        text.label = paste0('Sample ', LETTERS[1:6]))
plot.data$text.position <- (plot.data$start.points + plot.data$end.points)/2

overlap <- interval_overlap(tmp <- Intervals(c(plot.data$start.points, plot.data$end.points)), tmp)
# Find the next non-overlapping interval
nexts <- lapply(overlap, function(x) max(x) + 1)
non_overlaps <- list()
while(sum(sapply(nexts, Negate(is.na))) > 0) {
  consec <- c()
  i <- which(sapply(nexts, Negate(is.na)))[1]

  # Find a stretch of consecutive non-overlapping intervals
  while(!is.na(i) && i <= length(nexts) && !any(sapply(non_overlaps, function(y) i %in% y))) {
    consec <- c(consec, i); i <- nexts[[i]]
  }

  non_overlaps <- append(non_overlaps, list(consec))
  # Wipe out that stretch since we're no longer looking at it
  nexts[consec] <- NA
}

# Squash remaining non-overlapping intervals -- the packing is not yet compact
i <- 1
while (i < length(non_overlaps)) {
  ints1 <- non_overlaps[[i]]
  ints1 <- Intervals(c(plot.data$start.points[ints1], plot.data$end.points[ints1]))
  j <- i + 1
  while(j <= length(non_overlaps)) {
    ints2 <-  Intervals(c(plot.data$start.points[non_overlaps[[j]]],
                  plot.data$end.points[non_overlaps[[j]]]))
    iv <- interval_overlap(ints1, ints2)
    if (length(c(iv, recursive = TRUE)) == 0) break;
    j <- j + 1
  }

  if (j <= length(non_overlaps)) {
    # we can merge non_overlaps[[i]] and non_overlaps[[j]]
    non_overlaps[[i]] <- c(non_overlaps[[i]], non_overlaps[[j]])
    non_overlaps[[j]] <- NULL
  } else {
    # we are done non_overlaps[[i]] -- nothing else can be squashed!
    i <- i + 1
  }
}
Run Code Online (Sandbox Code Playgroud)

我们现在有

 print(non_overlaps)
 # [[1]]
 # [1] 1 3 6
 #
 # [[2]]
 # [1] 2 4 6
 #
 # [[3]]
 # [1] 5
Run Code Online (Sandbox Code Playgroud)

我们可以在不同的高度上绘制这些不重叠的间隔。

 ymin <- length(non_overlaps) - 1 - (sapply(seq_len(nrow(plot.data)),
    function(ix) which(sapply(non_overlaps, function(y) ix %in% y))) - 1)
 ymax <- ymin + 0.9
 text.position.y <- ymin + 0.45
 ymin <- ymin / length(non_overlaps) * 3 # rescale for display
 ymax <- ymax / length(non_overlaps) * 3 # rescale for display
 text.position.y <- text.position.y / length(non_overlaps) * 3

 library(ggplot2)
 p <- ggplot(plot.data)
 p + geom_rect(aes(xmin=start.points, xmax=end.points, ymin=ymin, ymax=ymax),
               fill="yellow") +
   theme_bw() + geom_text(aes(x=text.position, y=text.position.y, label=text.label)) +
   labs(x=NULL, y=NULL)
Run Code Online (Sandbox Code Playgroud)

最终结果:

在此处输入图片说明

还有一些例子:

plot.data <- data.frame(start.points = c(1,3,5,7,9,11,13), end.points = c(4,6,8,10,12,14, 16), text.label = paste0('Sample ', LETTERS[1:7]))
Run Code Online (Sandbox Code Playgroud)

在此处输入图片说明

plot.data <- data.frame(start.points = seq(1, 13, by = 4), end.points = seq(4, 16, by = 4), text.label = paste0('Sample ', LETTERS[1:4]))
Run Code Online (Sandbox Code Playgroud)

在此处输入图片说明

set.seed(100); plot.data <- data.frame(start.points = tmp <- sort(runif(26, 1, 15)), end.points = tmp + runif(26, 1, 3), text.label = paste0('Sample ', LETTERS))
Run Code Online (Sandbox Code Playgroud)

在此处输入图片说明

PS 我为鸡的划伤道歉,但我这样做相当仓促——我相信其中一些操作可以更巧妙地执行!