use*_*655 5 plot visualization r
我想在R 中创建人口金字塔。我知道 StackOverflow 上有很多例子,但我想创建一个还包括人口预测的例子,即每个年龄组的条形按性别和按性别和年龄组划分的线条用于预测。
你可以在这里看到一个例子:http : //geographyblog.eu/wp/the-worlds-population-pyramid-is-changed-shape/

如果有一些关于如何更好地说明这一点的建议(例如使用平滑线),也欢迎提出,但我想指出当前的情况和预测。示例数据可在联合国网站上找到:http : //esa.un.org/wpp/population-pyramids/population-pyramids_absolute.htm
任何帮助将不胜感激。
也许少一点特别的方法使用ggplot2andgeom_bar和geom_step。
可以从wpp2015包中提取数据(或者wpp2012,wpp2010或者wpp2008如果您更喜欢旧版本)。
library("dplyr")
library("tidyr")
library("wpp2015")
#load data in wpp2015
data(popF)
data(popM)
data(popFprojMed)
data(popMprojMed)
#combine past and future female population
df0 <- popF %>% 
  left_join(popFprojMed) %>%
  mutate(gender = "female")
#combine past and future male population, add on female population
df1 <- popM %>% 
  left_join(popMprojMed) %>%
  mutate(gender = "male") %>%
  bind_rows(df0) %>%
  mutate(age = factor(age, levels = unique(age)))
#stack data for ggplot, filter World population and required years
df2 <- df1 %>%
  gather(key = year, value = pop, -country, -country_code, -age, -gender) %>%
  mutate(pop = pop/1e03) %>%
  filter(country == "World", year %in% c(1950, 2000, 2050, 2100))
#add extra rows and numeric age variable for geom_step used for 2100
df2 <- df2 %>% 
  mutate(ageno = as.numeric(age) - 0.5)
df2 <- df2 %>%
  bind_rows(df2 %>% filter(year==2100, age=="100+") %>% mutate(ageno = ageno + 1)) 
df2 
# Source: local data frame [170 x 7]
# 
#    country country_code    age gender  year       pop ageno
#     (fctr)        (int) (fctr)  (chr) (chr)     (dbl) (dbl)
# 1    World          900    0-4   male  1950 171.85124   0.5
# 2    World          900    5-9   male  1950 137.99242   1.5
# 3    World          900  10-14   male  1950 133.27428   2.5
# 4    World          900  15-19   male  1950 121.69274   3.5
# 5    World          900  20-24   male  1950 112.39438   4.5
# 6    World          900  25-29   male  1950  96.59408   5.5
# 7    World          900  30-34   male  1950  83.38595   6.5
# 8    World          900  35-39   male  1950  80.59671   7.5
# 9    World          900  40-44   male  1950  73.08263   8.5
# 10   World          900  45-49   male  1950  63.13648   9.5
# ..     ...          ...    ...    ...   ...       ...   ...
Run Code Online (Sandbox Code Playgroud)
使用标准ggplot功能,您可以获得类似的东西,从这里的答案中进行调整:
library("ggplot2")
ggplot(data = df2, aes(x = age, y = pop, fill = year)) +
  #bars for all but 2100
  geom_bar(data = df2 %>% filter(gender == "female", year != 2100) %>% arrange(rev(year)),
           stat = "identity",
           position = "identity") +
  geom_bar(data = df2 %>% filter(gender == "male", year != 2100) %>% arrange(rev(year)),
           stat = "identity",
           position = "identity",
           mapping = aes(y = -pop)) +
  #steps for 2100
  geom_step(data =  df2 %>% filter(gender == "female", year == 2100), 
            aes(x = ageno)) +
  geom_step(data =  df2 %>% filter(gender == "male", year == 2100), 
            aes(x = ageno, y = -pop)) +
  coord_flip() +
  scale_y_continuous(labels = abs)
Run Code Online (Sandbox Code Playgroud)
注意:您需要这样做,arrange(rev(year))因为条形图是叠加层。
使用该ggthemes软件包,您可以非常接近原始的经济学人图。
library("ggthemes") 
ggplot(data = df2, aes(x = age, y = pop, fill = year)) +
  #bars for all but 2100
  geom_bar(data = df2 %>% filter(gender == "female", year != 2100) %>% arrange(rev(year)),
           stat = "identity",
           position = "identity") +
  geom_bar(data = df2 %>% filter(gender == "male", year != 2100) %>% arrange(rev(year)),
           stat = "identity",
           position = "identity",
           mapping = aes(y = -pop)) +
  #steps for 2100
  geom_step(data =  df2 %>% filter(gender == "female", year == 2100), 
        aes(x = ageno), size = 1) +
  geom_step(data =  df2 %>% filter(gender == "male", year == 2100), 
        aes(x = ageno, y = -pop), size = 1) +
  coord_flip() +
  #extra style shazzaz
  scale_y_continuous(labels = abs, limits = c(-400, 400), breaks = seq(-400, 400, 100)) +
  geom_hline(yintercept = 0) +
  theme_economist(horizontal = FALSE) +
  scale_fill_economist() +
  labs(fill = "", x = "", y = "")
Run Code Online (Sandbox Code Playgroud)
(我相信你可以走得更近,但我现在已经停在这里了)。