use*_*655 5 plot visualization r
我想在R 中创建人口金字塔。我知道 StackOverflow 上有很多例子,但我想创建一个还包括人口预测的例子,即每个年龄组的条形按性别和按性别和年龄组划分的线条用于预测。
你可以在这里看到一个例子:http : //geographyblog.eu/wp/the-worlds-population-pyramid-is-changed-shape/

如果有一些关于如何更好地说明这一点的建议(例如使用平滑线),也欢迎提出,但我想指出当前的情况和预测。示例数据可在联合国网站上找到:http : //esa.un.org/wpp/population-pyramids/population-pyramids_absolute.htm
任何帮助将不胜感激。
也许少一点特别的方法使用ggplot2andgeom_bar和geom_step。
可以从wpp2015包中提取数据(或者wpp2012,wpp2010或者wpp2008如果您更喜欢旧版本)。
library("dplyr")
library("tidyr")
library("wpp2015")
#load data in wpp2015
data(popF)
data(popM)
data(popFprojMed)
data(popMprojMed)
#combine past and future female population
df0 <- popF %>%
left_join(popFprojMed) %>%
mutate(gender = "female")
#combine past and future male population, add on female population
df1 <- popM %>%
left_join(popMprojMed) %>%
mutate(gender = "male") %>%
bind_rows(df0) %>%
mutate(age = factor(age, levels = unique(age)))
#stack data for ggplot, filter World population and required years
df2 <- df1 %>%
gather(key = year, value = pop, -country, -country_code, -age, -gender) %>%
mutate(pop = pop/1e03) %>%
filter(country == "World", year %in% c(1950, 2000, 2050, 2100))
#add extra rows and numeric age variable for geom_step used for 2100
df2 <- df2 %>%
mutate(ageno = as.numeric(age) - 0.5)
df2 <- df2 %>%
bind_rows(df2 %>% filter(year==2100, age=="100+") %>% mutate(ageno = ageno + 1))
df2
# Source: local data frame [170 x 7]
#
# country country_code age gender year pop ageno
# (fctr) (int) (fctr) (chr) (chr) (dbl) (dbl)
# 1 World 900 0-4 male 1950 171.85124 0.5
# 2 World 900 5-9 male 1950 137.99242 1.5
# 3 World 900 10-14 male 1950 133.27428 2.5
# 4 World 900 15-19 male 1950 121.69274 3.5
# 5 World 900 20-24 male 1950 112.39438 4.5
# 6 World 900 25-29 male 1950 96.59408 5.5
# 7 World 900 30-34 male 1950 83.38595 6.5
# 8 World 900 35-39 male 1950 80.59671 7.5
# 9 World 900 40-44 male 1950 73.08263 8.5
# 10 World 900 45-49 male 1950 63.13648 9.5
# .. ... ... ... ... ... ... ...
Run Code Online (Sandbox Code Playgroud)
使用标准ggplot功能,您可以获得类似的东西,从这里的答案中进行调整:
library("ggplot2")
ggplot(data = df2, aes(x = age, y = pop, fill = year)) +
#bars for all but 2100
geom_bar(data = df2 %>% filter(gender == "female", year != 2100) %>% arrange(rev(year)),
stat = "identity",
position = "identity") +
geom_bar(data = df2 %>% filter(gender == "male", year != 2100) %>% arrange(rev(year)),
stat = "identity",
position = "identity",
mapping = aes(y = -pop)) +
#steps for 2100
geom_step(data = df2 %>% filter(gender == "female", year == 2100),
aes(x = ageno)) +
geom_step(data = df2 %>% filter(gender == "male", year == 2100),
aes(x = ageno, y = -pop)) +
coord_flip() +
scale_y_continuous(labels = abs)
Run Code Online (Sandbox Code Playgroud)
注意:您需要这样做,arrange(rev(year))因为条形图是叠加层。
使用该ggthemes软件包,您可以非常接近原始的经济学人图。
library("ggthemes")
ggplot(data = df2, aes(x = age, y = pop, fill = year)) +
#bars for all but 2100
geom_bar(data = df2 %>% filter(gender == "female", year != 2100) %>% arrange(rev(year)),
stat = "identity",
position = "identity") +
geom_bar(data = df2 %>% filter(gender == "male", year != 2100) %>% arrange(rev(year)),
stat = "identity",
position = "identity",
mapping = aes(y = -pop)) +
#steps for 2100
geom_step(data = df2 %>% filter(gender == "female", year == 2100),
aes(x = ageno), size = 1) +
geom_step(data = df2 %>% filter(gender == "male", year == 2100),
aes(x = ageno, y = -pop), size = 1) +
coord_flip() +
#extra style shazzaz
scale_y_continuous(labels = abs, limits = c(-400, 400), breaks = seq(-400, 400, 100)) +
geom_hline(yintercept = 0) +
theme_economist(horizontal = FALSE) +
scale_fill_economist() +
labs(fill = "", x = "", y = "")
Run Code Online (Sandbox Code Playgroud)
(我相信你可以走得更近,但我现在已经停在这里了)。