I am trying to use ggplot to plot a data frame so that it looks like the figure at the bottom of http://www.ats.ucla.edu/stat/r/dae/logit.htm.
a<-data.frame(Year=c("2012","2012","2012","2013","2013","2013","2014","2014","2014"),
Engagement=rep(c("low","med","high"),3),
cost=c(4464.88,4690.14,4342.72,5326.63,5000.03,3967.02,4646.27,4282.38,3607.79),
lower=c(4151.4,5027.51,4095.73,4366.82,4682.85,3715.86,3775.25,3642.41,3235.43),
upper=c(4778.35,5625.75,5196.81,5013.45,5317.2,4848.89,4910.19,4291.64,3980.14))
I tried:
k<-ggplot(a,aes(x=Year,y=cost))
k+geom_ribbon(aes(ymin=lower,ymax=upper,fill=Engagement),alpha=0.2)+
geom_pointrange(aes(x=Year,y=cost,ymin=lower,ymax=lower),size=1,width=0.2,color="blue")
Any help is appreciated.
I have also just tried:
pd <- position_dodge(0.1)
k<-ggplot(a,aes(x=Year,y=cost))
k+geom_ribbon(aes(ymin=lower,ymax=upper,fill=Engagement),alpha=0.2)+
geom_line(position=pd,aes(color=Engagement))
Error message:
ymax not defined: adjusting position using y instead
geom_path: Each group consist of only one observation.
Do you need to adjust the group aesthetic?
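
One possible way around these messages, as a sketch rather than a definitive answer: with a discrete `Year` on the x axis, ggplot puts each x value into its own group, so the ribbon and line have nothing to connect. The version below assumes it is acceptable to convert `Year` to a number and to group by `Engagement`; it also drops `position_dodge`, which appears to be the source of the `ymax not defined` message.

```r
library(ggplot2)

a <- data.frame(
  Year = c("2012","2012","2012","2013","2013","2013","2014","2014","2014"),
  Engagement = rep(c("low","med","high"), 3),
  cost  = c(4464.88,4690.14,4342.72,5326.63,5000.03,3967.02,4646.27,4282.38,3607.79),
  lower = c(4151.4,5027.51,4095.73,4366.82,4682.85,3715.86,3775.25,3642.41,3235.43),
  upper = c(4778.35,5625.75,5196.81,5013.45,5317.2,4848.89,4910.19,4291.64,3980.14)
)

# Assumption: a continuous x axis is wanted, so Year becomes numeric.
a$Year <- as.numeric(as.character(a$Year))

# group = Engagement tells ggplot which points belong to the same series.
p <- ggplot(a, aes(x = Year, y = cost, group = Engagement)) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = Engagement), alpha = 0.2) +
  geom_line(aes(colour = Engagement)) +
  scale_x_continuous(breaks = unique(a$Year))

print(p)
```

If `Year` has to stay a factor, keeping `group = Engagement` in `aes()` should be enough for the lines and ribbons to be drawn.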
I split the dataset into training and test sets as follows:
splitdata<-split(sb[1:nrow(sb),], sample(rep(1:2, as.integer(nrow(sb)/2))))
test<-splitdata[[1]]
train<-rbind(splitdata[[2]])
sb is the name of the original dataset, so this is a 50/50 train/test split.
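
For comparison, the same 50/50 split can be sketched with a sampled row index, which does not depend on `nrow(sb)` being even. The `sb` below is a hypothetical stand-in with made-up columns, and the seed is only there for reproducibility.

```r
set.seed(1)  # hypothetical seed, only so the split is reproducible

# Hypothetical stand-in for sb; the real dataset is not shown in the post.
sb <- data.frame(id = 1:11, x = rnorm(11))

n <- nrow(sb)
train_idx <- sample(n, size = floor(n / 2))  # row numbers for the training half

train <- sb[train_idx, ]
test  <- sb[-train_idx, ]  # every row not selected for training
```

With an odd number of rows, the extra row ends up in `test`.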
Then I fitted a glm on the training set.
fitglm<- glm(num_claims~year+vt+va+public+pri_bil+persist+penalty_pts+num_veh+num_drivers+married+gender+driver_age+credit+col_ded+car_den, family=poisson, train)
Now I want to use this glm to predict, say, the next 10 observations.
I cannot work out how to specify newdata in predict().
I tried:
pred<-predict(fitglm,newdata=data.frame(train),type="response", se.fit=T)
This returns as many predictions as there are observations in the training set.
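
That is expected, since `newdata` here is the training set itself. A minimal sketch of predicting for 10 unseen rows instead, assuming the "next 10 observations" can be taken from the test set; the data below are simulated, and only `num_claims`, `driver_age` and `num_veh` are borrowed from the model formula as column names.

```r
set.seed(42)  # hypothetical seed for reproducibility

# Simulated stand-in for sb, using three of the column names from the formula.
n <- 200
sim <- data.frame(
  driver_age = runif(n, 18, 80),
  num_veh    = sample(1:3, n, replace = TRUE)
)
sim$num_claims <- rpois(n, lambda = exp(-1 + 0.01 * sim$driver_age + 0.2 * sim$num_veh))

train_idx <- sample(n, n / 2)
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

fit <- glm(num_claims ~ driver_age + num_veh, family = poisson, data = train)

# newdata needs the predictor columns used in the formula; here, 10 test rows.
new_obs <- head(test, 10)
pred <- predict(fit, newdata = new_obs, type = "response", se.fit = TRUE)

length(pred$fit)  # one prediction per row of new_obs
```

The number of predictions follows the number of rows in `newdata`, so passing `head(test, 10)` gives 10.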
Finally, how can I plot these predictions with confidence intervals?
Thank you for your help.
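
One common way to get such intervals, sketched under assumptions: predict on the link scale, add and subtract 1.96 standard errors, then exponentiate back to the count scale. The data are simulated again, the 1.96 multiplier assumes an approximate 95% normal interval, and the result is an interval for the expected number of claims, not a prediction interval for individual counts.

```r
library(ggplot2)

set.seed(42)  # hypothetical seed for reproducibility

# Simulated stand-in data; column names borrowed from the model formula.
n <- 200
sim <- data.frame(
  driver_age = runif(n, 18, 80),
  num_veh    = sample(1:3, n, replace = TRUE)
)
sim$num_claims <- rpois(n, lambda = exp(-1 + 0.01 * sim$driver_age + 0.2 * sim$num_veh))

train_idx <- sample(n, n / 2)
fit <- glm(num_claims ~ driver_age + num_veh, family = poisson, data = sim[train_idx, ])
new_obs <- head(sim[-train_idx, ], 10)

# Predict on the log (link) scale so the interval is symmetric there.
pred <- predict(fit, newdata = new_obs, type = "link", se.fit = TRUE)

ci <- data.frame(
  obs   = seq_len(nrow(new_obs)),
  fit   = exp(pred$fit),
  lower = exp(pred$fit - 1.96 * pred$se.fit),
  upper = exp(pred$fit + 1.96 * pred$se.fit)
)

p <- ggplot(ci, aes(x = obs, y = fit)) +
  geom_pointrange(aes(ymin = lower, ymax = upper)) +
  labs(x = "New observation", y = "Predicted number of claims")

print(p)
```

Building the interval on the link scale keeps the lower bound positive after back-transforming, which a symmetric interval on the response scale would not guarantee.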