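I have a dataset from which I want to select the top n values of a category by count, and then plot using the other variables in the dataset. Basically, the top n is determined at an aggregated level, but I need to get back to the full data to plot it with ggplot.
In the example below, I want the two most common examNames, and then a facet_wrap plot of their counts by year.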
library(tidyverse)  # loads dplyr, ggplot2, and tibble (for tribble())

ap <-
tribble(
~year, ~examName,
2014, "Statistics",
2015, "Statistics",
2016, "Statistics",
2016, "Statistics",
2016, "Statistics",
2016, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2017, "Statistics",
2013, "Macroeconomics",
2013, "Macroeconomics",
2014, "Macroeconomics",
2015, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2016, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2017, "Macroeconomics",
2013, "Calculus",
2014, "Calculus",
2015, "Calculus",
2016, "Calculus",
2017, "Calculus",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2017, "Psychology",
2018, "Psychology",
2018, "Psychology")
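
# Find the two most frequent exams, then join back to the full data
# so that only rows for those exams remain.
ap_top <- ap %>%
  count(examName, sort = TRUE) %>%
  head(2) %>%
  inner_join(ap, by = "examName") %>%
  select(-n)

# Count per exam and year, then draw one line panel per exam.
ap_top %>%
  count(examName, year) %>%
  ggplot(aes(x = year, y = n, group = examName)) +
  geom_line() +
  facet_wrap(~ examName)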
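My idea was to get my top n, then inner_join back to the original dataset and plot from that; essentially using the inner join as a filter.
I know there must be a better way to do this, and I'm hoping for a more elegant solution. I'm all ears! The example dataset is provided (sorry it's so long).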
You don't need inner_join(). I would just determine the top two exams in a separate statement and then filter on those.
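# Names of the two most frequent exams (top_n() keeps ties, if any).
top_exams <- count(ap, examName) %>%
  top_n(2, n) %>%
  pull(examName)

# Keep only those exams, count per year, and plot one panel per exam.
ap %>%
  filter(examName %in% top_exams) %>%
  count(year, examName) %>%
  ggplot(aes(x = year, y = n, group = examName)) +
  geom_line() +
  facet_wrap(~ examName)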
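If you do like the "join as a filter" idea from the question, a filtering join may be closer to what you meant than inner_join(). The sketch below is only one possible variant, not the approach above: it assumes dplyr 1.0.0 or later (for slice_max()), and the small `exams` data frame is a made-up stand-in with the same two columns as `ap`.

```r
library(dplyr)

# Hypothetical stand-in data with the same columns as `ap`.
exams <- data.frame(
  year     = c(2016, 2016, 2017, 2017, 2017, 2018),
  examName = c("Statistics", "Statistics", "Statistics",
               "Calculus", "Psychology", "Psychology"),
  stringsAsFactors = FALSE
)

# slice_max() keeps the rows with the largest counts;
# with_ties = FALSE returns exactly two rows even when counts tie.
top_exams <- exams %>%
  count(examName) %>%
  slice_max(n, n = 2, with_ties = FALSE)

# semi_join() keeps the rows of `exams` that have a match in `top_exams`
# without adding any columns, so there is no `n` to drop afterwards.
exams_top <- exams %>%
  semi_join(top_exams, by = "examName")

# Per-exam, per-year counts, ready for the same ggplot() call as above.
counts_by_year <- exams_top %>%
  count(examName, year)
```

`counts_by_year` can then be piped into the same ggplot() + geom_line() + facet_wrap() call. Note that with_ties = FALSE silently drops tied exams, whereas top_n() keeps them, so pick whichever behavior fits your data.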