Don*_*beo 4 plot r ggplot2 stata
I have this dataset:
> head(xc)
wheeze3 SmokingGroup_Kai TG2000 TG2012 PA_Score asthma3 tres3 age3 bmi bmi3
1 0 1 2 2 2 0 0 47 20.861 21.88708
2 0 5 2 3 3 0 0 57 20.449 23.05175
3 0 1 2 3 2 0 0 45 25.728 26.06168
4 0 2 1 1 3 0 0 48 22.039 23.50780
5 1 4 2 2 1 0 1 61 25.391 25.63692
6 0 4 2 2 2 0 0 54 21.633 23.66144
education3 group_change
1 2 0
2 2 3
3 3 3
4 3 0
5 1 0
6 2 0
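Here,
asthma3 is a variable that takes the values 0 and 1;
group_change takes the values 0, 1, 2, 3, 4, 5, 6;
age3 is the age.
I would like to plot the percentage of people with asthma3==1 as a function of age3. I would like to draw 6 lines on the same plot, splitting the sample by group_change.
I think it should be possible with ggplot2.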
Here's a ggplot2 approach:
library(ggplot2)
library(dplyr)
# Create fake data
set.seed(10)
xc = data.frame(age3=sample(40:50, 500, replace=TRUE),
                asthma3=sample(0:1, 500, replace=TRUE),
                group_change=sample(0:6, 500, replace=TRUE))
# Summarize asthma percent by group_change and age3 (using dplyr).
# The mean of a 0/1 variable is the proportion of 1s.
# Note: the old %.% chaining operator was removed from dplyr; use %>% instead.
xc1 = xc %>%
  group_by(group_change, age3) %>%
  summarize(asthma.pct=mean(asthma3)*100)
# Plot using ggplot2: one line per group_change level
ggplot(xc1, aes(x=age3, y=asthma.pct, colour=as.factor(group_change))) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks=40:50) +
  xlab("Age") + ylab("Asthma Percent") +
  scale_colour_discrete(name="Group Change")
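If dplyr is not available, the same summary table can probably be built with base R's aggregate(). This is only a sketch on the same fake data; the name xc1_base is made up for illustration and is not part of the answer above.

```r
# Fake data with the same shape as in the answer above
set.seed(10)
xc <- data.frame(age3 = sample(40:50, 500, replace = TRUE),
                 asthma3 = sample(0:1, 500, replace = TRUE),
                 group_change = sample(0:6, 500, replace = TRUE))

# The mean of a 0/1 variable is the proportion of 1s, so mean * 100 is the
# percentage of people with asthma3 == 1 in each group_change/age3 cell
xc1_base <- aggregate(asthma3 ~ group_change + age3, data = xc,
                      FUN = function(v) mean(v) * 100)

# Rename the summarized column (xc1_base and asthma.pct are illustrative names)
names(xc1_base)[names(xc1_base) == "asthma3"] <- "asthma.pct"
```

The resulting data frame has one row per observed group_change/age3 combination and can be passed to the same ggplot() call in place of xc1.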
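Here's another ggplot2 approach that works directly with the original data frame and calculates the percentages on the fly. I've also formatted the y-axis as percentages.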
library(scales) # Need this for "percent_format()"
# stat_summary computes the mean of asthma3 (a proportion) at each age.
# Note: in ggplot2 >= 3.3.0 the argument is "fun"; older versions use "fun.y".
ggplot(xc, aes(x=age3, y=asthma3, colour=as.factor(group_change))) +
  stat_summary(fun=mean, geom='line') +
  stat_summary(fun=mean, geom='point') +
  scale_x_continuous(breaks=40:50) +
  scale_y_continuous(labels=percent_format()) +
  xlab("Age") + ylab("Asthma Percent") +
  scale_colour_discrete(name="Group Change")