I have the following data frame:
gender age population
H 0-4 5
H 5-9 5
H 10-14 10
H 15-19 15
H 20-24 15
H 25-29 10
M 0-4 0
M 5-9 5
M 10-14 5
M 15-19 15
M 20-24 10
M 25-29 15
I need to regroup the age categories to get the following data frame:
gender age population
H 0-14 20
H 15-19 15
H 20-29 25
M 0-14 10
M 15-19 15
M 20-29 25
I prefer dplyr, so if there is a way to do this with that package, I would appreciate it.
Using string splitting, with tidyr::separate() and cut():
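library(dplyr)
library(tidyr)

df1 %>%
  # split e.g. "0-4" into integer columns age1 = 0 and age2 = 4
  separate(age, into = c("age1", "age2"), sep = "-", convert = TRUE) %>%
  # bin the lower bound of each band into [0,14], (14,19], (19,29]
  mutate(age = cut(age1,
                   breaks = c(0, 14, 19, 29),
                   labels = c("0-14", "15-19", "20-29"),
                   include.lowest = TRUE)) %>%
  group_by(gender, age) %>%
  # total population per gender and regrouped age band
  summarise(population = sum(population))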
# output
# gender age population
# (fctr) (fctr) (int)
# 1 H 0-14 20
# 2 H 15-19 15
# 3 H 20-29 25
# 4 M 0-14 10
# 5 M 15-19 15
# 6 M 20-29 25
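The binning step is the part that is easy to get wrong, so here is a small base R sketch of what cut() does with these breaks. The vector age1 below is typed in by hand as an assumption: it stands for the lower bounds that separate() would produce from the question's six age bands. Leaving out the labels argument makes cut() print the intervals themselves, which shows why include.lowest = TRUE is needed for the 0-4 band.

```r
# Lower bounds of the six age bands (hand-typed here for illustration)
age1 <- c(0L, 5L, 10L, 15L, 20L, 25L)
breaks <- c(0, 14, 19, 29)

# Intervals are open on the left and closed on the right by default;
# include.lowest = TRUE closes the first interval on the left as well
with_lowest <- cut(age1, breaks = breaks, include.lowest = TRUE)
as.character(with_lowest)
# "[0,14]" "[0,14]" "[0,14]" "(14,19]" "(19,29]" "(19,29]"

# Without include.lowest, the value 0 falls outside (0,14] and becomes NA
without_lowest <- cut(age1, breaks = breaks)
is.na(without_lowest)
# TRUE FALSE FALSE FALSE FALSE FALSE
```

If include.lowest were dropped from the answer's code, the rows for ages 0-4 would end up in an NA group instead of "0-14".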