Sankey/alluvial diagram with percentages and partial fill in R

cap*_*oma 4 r dataflow ggplot2 sankey-diagram

I would like to modify an existing Sankey diagram, built with ggplot2 and ggalluvial, to make it more visually appealing.

My example is from https://corybrunson.github.io/ggalluvial/articles/ggalluvial.html

library(ggplot2)
library(ggalluvial)

data(vaccinations)
levels(vaccinations$response) <- rev(levels(vaccinations$response))
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           y = freq,
           fill = response, label = response)) +
  scale_x_discrete(expand = c(.1, .1)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  theme(legend.position = "none") +
  ggtitle("vaccination survey responses at three points in time")

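Created on 2020-10-01 by the reprex package (v0.3.0)

Now I would like to change this plot so that it looks similar to the one at https://sciolisticramblings.wordpress.com/2018/11/23/sankey-charts-the-new-pie-chart/, i.e. 1. change the absolute values to relative values (percentages), 2. add percentage labels, and 3. apply a partial fill (e.g. only for "Missing" and "Never").

My approach: I think I could change the axis to percentages with something like scale_y_continuous(label = scales::percent_format(scale = 100)). However, I am not sure about steps 2 and 3.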

ste*_*fan 5

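This could be achieved like so:

  1. Switching to percentages can be done by adding a new column to the df containing each row's percentage share within its survey, which can then be mapped on y instead of freq.

  2. To get nice percentage labels on the axis, you can use scale_y_continuous(labels = scales::percent_format()).

  3. For the partial fill, you can map e.g. response %in% c("Missing", "Never") on fill (which gives TRUE for "Missing" and "Never") and set the fill colors via scale_fill_manual.

  4. The percentage of each stratum can be added to the label via label = paste0(..stratum.., "\n", scales::percent(..count.., accuracy = .1)) in geom_text, where I make use of the variables ..stratum.. and ..count.. computed by stat_stratum.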
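Steps 1 and 3 can be checked in isolation before plotting. The following is only a minimal sketch in base R on a made-up data frame: `toy`, its values, and the `highlight` column are hypothetical and not part of the vaccinations data. It shows that the per-survey shares sum to 1 and that the `%in%` test yields the logical flag that is later mapped on fill.

```r
# Hypothetical toy data: two surveys with three responses each
toy <- data.frame(
  survey   = rep(c("s1", "s2"), each = 3),
  response = rep(c("Always", "Never", "Missing"), times = 2),
  freq     = c(50, 30, 20, 10, 10, 20)
)

# Step 1: share of each row within its survey
# (ave() returns the group sum aligned with the original rows)
toy$pct <- toy$freq / ave(toy$freq, toy$survey, FUN = sum)

# Step 3: logical flag, TRUE only for the responses to be highlighted
toy$highlight <- toy$response %in% c("Missing", "Never")

# Sanity check: the shares of every survey should add up to 1
pct_by_survey <- tapply(toy$pct, toy$survey, sum)
print(toy)
print(pct_by_survey)
```

Because the shares are computed per survey, every column of the alluvial plot has the same total height, which is what makes the percent axis meaningful.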

library(ggplot2)
library(ggalluvial)
library(dplyr)

data(vaccinations)
levels(vaccinations$response) <- rev(levels(vaccinations$response))

# Share of each row within its survey, so that every survey sums to 100%
vaccinations <- vaccinations %>% 
  group_by(survey) %>% 
  mutate(pct = freq / sum(freq)) %>% 
  ungroup()

ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           y = pct,
           fill = response %in% c("Missing", "Never"), 
           label = response)) +
  scale_x_discrete(expand = c(.1, .1)) +
  scale_y_continuous(labels = scales::percent_format()) +
  scale_fill_manual(values = c(`TRUE` = "cadetblue1", `FALSE` = "grey50")) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(aes(label = paste0(..stratum.., "\n",
                               scales::percent(..count.., accuracy = .1))),
            stat = "stratum", size = 3) +
  theme(legend.position = "none") +
  ggtitle("vaccination survey responses at three points in time")
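As a side note, newer ggplot2 releases (3.3.0 and later, as far as I am aware) provide after_stat() as the replacement for the `..var..` notation, which recent versions flag as deprecated. The sketch below is a variant of the approach above under that assumption, not a different method: it uses base R's ave() instead of dplyr for the percentage column and stores the plot in a variable `p`, both of which are my own choices for illustration.

```r
library(ggplot2)
library(ggalluvial)

data(vaccinations)
levels(vaccinations$response) <- rev(levels(vaccinations$response))

# Share of each row within its survey (base R equivalent of the dplyr step)
vaccinations$pct <- vaccinations$freq /
  ave(vaccinations$freq, vaccinations$survey, FUN = sum)

p <- ggplot(vaccinations,
            aes(x = survey, stratum = response, alluvium = subject,
                y = pct,
                fill = response %in% c("Missing", "Never"))) +
  scale_x_discrete(expand = c(.1, .1)) +
  scale_y_continuous(labels = scales::percent_format()) +
  scale_fill_manual(values = c(`TRUE` = "cadetblue1", `FALSE` = "grey50")) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  # after_stat() refers to the variables computed by stat_stratum
  geom_text(aes(label = after_stat(paste0(stratum, "\n",
                                          scales::percent(count, accuracy = .1)))),
            stat = "stratum", size = 3) +
  theme(legend.position = "none") +
  ggtitle("vaccination survey responses at three points in time")

p
```

The resulting plot should match the one produced with `..stratum..` and `..count..`; only the way the computed variables are referenced differs.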