Making a specific quantile plot in R

Dre*_*rey 1 visualization r data-visualization ggplot2

I am very interested in the following visualization (in terms of deciles):

(image)

I would like to know how to do this in R.

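Of course there are histograms and density plots, but they do not give such a nice visualization. In particular, I would like to know whether this can be done with ggplot/tidyverse.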

Edit, in response to comments:

library(dplyr)
library(ggplot2)
someData <- data_frame(x = rnorm(1000))
ggplot(someData, aes(x = x)) +
  geom_histogram()

This produces a histogram (see http://www.r-fiddle.org/#/fiddle?id=LQXazwMY&version=1).

But how can I get the colorful bars? And how can I get the small rectangles? (The arrows are less relevant.)

Axe*_*man 6

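You have to define a number of breaks, and use approximate deciles that coincide with those histogram breaks. Otherwise, two deciles will end up in one bar.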

library(ggplot2)

d <- data.frame(x = rnorm(1000))  # data_frame() is deprecated; a plain data.frame works here

breaks <- seq(min(d$x), max(d$x), length.out = 50)
quantiles <- quantile(d$x, seq(0, 1, 0.1))
quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])

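# include.lowest = TRUE keeps the minimum of x, which otherwise falls outside the first interval and becomes NA
d$bar <- as.numeric(as.character(cut(d$x, breaks, na.omit((breaks + dplyr::lag(breaks)) / 2), include.lowest = TRUE)))
d$fill <- cut(d$x, quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2), include.lowest = TRUE)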

ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1])

Or with more distinct colors:

ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1]) +
  scale_fill_brewer(type = 'qual', palette = 3) # qualitative palette 3 ('Paired') has enough colors for 10 deciles

Add some styling and increase the number of breaks to 100:
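
The plotting call that follows relies on `breaks`, `d$bar` and `d$fill` having been rebuilt with 100 breaks, a step the answer does not spell out. A minimal sketch of that step, assuming the same kind of data as before; the seed and the `midpoints` helper are illustrative additions, not part of the original answer:

```r
set.seed(1)  # assumption: arbitrary seed, only so the sketch is reproducible
d <- data.frame(x = rnorm(1000))

# Same construction as in the first block, but with 100 breaks instead of 50.
breaks <- seq(min(d$x), max(d$x), length.out = 100)
quantiles <- quantile(d$x, seq(0, 1, 0.1))
quantiles2 <- sapply(quantiles, function(q) breaks[which.min(abs(q - breaks))])

# Hypothetical helper: midpoint of each pair of consecutive values.
midpoints <- function(v) (head(v, -1) + tail(v, -1)) / 2

# Bar position (interval midpoint) and decile group for every observation.
d$bar <- as.numeric(as.character(
  cut(d$x, breaks, labels = midpoints(breaks), include.lowest = TRUE)
))
d$fill <- cut(d$x, quantiles2, labels = midpoints(quantiles2), include.lowest = TRUE)
```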

ggplot(d, aes(bar, y = 1, fill = fill)) +
  geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
  scale_fill_brewer(type = 'qual', palette = 3) +
  theme_classic() +
  coord_fixed(diff(breaks)[1], expand = FALSE) + # makes square blocks
  labs(x = 'x', y = 'count')

And here is a final function:

decile_histogram <- function(data, var, n_breaks = 100) {
  breaks <- seq(min(data[[var]]), max(data[[var]]), length.out = n_breaks)
  quantiles <- quantile(data[[var]], seq(0, 1, 0.1))
  quantiles2 <- sapply(quantiles, function(x) breaks[which.min(abs(x - breaks))])

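  # include.lowest = TRUE keeps the minimum value, which would otherwise become NA
  data$bar <- as.numeric(as.character(
    cut(data[[var]], breaks, na.omit((breaks + dplyr::lag(breaks)) / 2), include.lowest = TRUE)
  ))
  data$fill <- cut(data[[var]], quantiles2, na.omit((quantiles2 + dplyr::lag(quantiles2)) / 2),
                   include.lowest = TRUE)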

  ggplot2::ggplot(data, ggplot2::aes(bar, y = 1, fill = fill)) +
    ggplot2::geom_col(position = 'stack', col = 1, show.legend = FALSE, width = diff(breaks)[1], size = 0.3) +
    ggplot2::scale_fill_brewer(type = 'qual', palette = 3) +
    ggplot2::theme_classic() +
    ggplot2::coord_fixed(diff(breaks)[1], expand = FALSE) +
    ggplot2::labs(x = 'x', y = 'count')
}

Use it like this:

d <- data.frame(x = rnorm(1000))
decile_histogram(d, 'x')
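
As a sanity check on the claim that unadjusted deciles would put two colors in one bar, the following base R sketch counts how many decile groups occur within each histogram bar, once with the raw deciles and once with the deciles snapped to the nearest break. The seed and the `groups_per_bar` helper are assumptions made for illustration:

```r
set.seed(42)  # assumption: arbitrary seed, only for reproducibility
x <- rnorm(1000)

breaks  <- seq(min(x), max(x), length.out = 50)
deciles <- quantile(x, seq(0, 1, 0.1))

# Snap each decile to its nearest histogram break, as in the answer.
snapped <- sapply(deciles, function(q) breaks[which.min(abs(q - breaks))])

# Assign every observation to a bar and to a decile group (raw and snapped).
bar           <- cut(x, breaks,  include.lowest = TRUE)
raw_group     <- cut(x, deciles, include.lowest = TRUE)
snapped_group <- cut(x, snapped, include.lowest = TRUE)

# Hypothetical helper: number of distinct groups found inside each bar
# (empty bars yield NA and are ignored below).
groups_per_bar <- function(group) {
  tapply(group, bar, function(g) length(unique(g)))
}

max_raw     <- max(groups_per_bar(raw_group), na.rm = TRUE)
max_snapped <- max(groups_per_bar(snapped_group), na.rm = TRUE)

max_raw      # typically 2: some bars straddle a decile boundary
max_snapped  # 1: every bar belongs to a single group
```

With the snapped deciles every group boundary is also a bar boundary, which is why each bar can be filled with one color.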