ggplot2 stat_function with computed arguments for different data subsets inside facet_grid

Lyn*_*nda 5 r curve-fitting facet ggplot2

I have a follow-up question about how to pass args computed by fitdistr to stat_function (see here for context).

My data frame looks like this (see below for the full dataset):

> str(small_data)
 'data.frame':  1032 obs. of  3 variables:
 $ Exp: Factor w/ 6 levels "1L","2L","3L",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ t  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ int: num  75.7 86.1 76.3 82.3 98.3 ...

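I want to plot a facet_grid, grouped by Exp and t, showing a density histogram of int together with the fitted lognormal distribution (lognormal lines coloured by t). I tried the following: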

library(MASS)
library(ggplot2)

meanlog <- function(x) { fitdistr(x,"lognormal")$estimate[[1]] }
sdlog <- function(x) { fitdistr(x,"lognormal")$estimate[[2]] }
p_chip<- ggplot(small_data,(aes(x=int)))+
  facet_grid(Exp~t)+
  stat_function(fun=dlnorm,
                args = with(small_data,
                  c(meanlog = meanlog(int),
                    sdlog = sdlog(int))),
                aes(colour=t))+
  scale_colour_gradient2(low='red',mid='blue',high='green',midpoint=5)+
  geom_histogram(aes(x=int,y = ..density..),binwidth =150)

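However, because of with, meanlog and sdlog are computed from the whole dataset, as shown below (the curve is identical in every facet). How can I make the fit use only the matching Exp, t subset?

Resulting plot

Edit: because the large dataset caused copy/paste errors in some environments, here is a smaller set that should be easier to copy and paste. It does not correspond directly to the image above, however.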

small_data<-data.frame(
  Exp=c('1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L',
        '1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L',
        '1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L',
        '1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L','1L',
        '2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L',
        '2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L',
        '2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L','2L'),
  t=c(0,0,0,0.33,0.33,0.33,0.67,0.67,0.67,0.67,0.67,0.67,0.67,0.67,0.67,1,1,1,1,
      1.33,1.33,1.33,1.33,1.33,1.33,1.33,1.33,1.33,1.33,1.33,1.67,1.67,1.67,1.67,1.67,
      2,2,2,2,4,4,4,4,6,6,6,6,8,8,10,10,10,10,10,10,10,
      0,0,0,0,0.33,0.33,0.67,0.67,0.67,0.67,0.67,0.67,1,1,1,1,
      1.33,1.33,1.33,1.33,1.67,1.67,1.67,1.67,1.67,
      2,2,4,4,4,4,4,6,6,6,8,10,10,10,10,10,10),
  int=c(123.059145129225,122.520943007553,119.229495472186,163.349124924562,
        157.235229958189,101.456442831216,111.474216664325,99.982866933181,
        274.938909090909,147.40293040293,310.134596211366,116.476923076923,
        182.25272382757,332.75885911841,186.54737080689,479.628657282935,
        477.898496240602,283.311517925248,567.147534189805,494.208102667338,
        388.615060940221,624.508012820513,795.2320925868,549.957142857143,
        923.04146100691,621.26579261025,717.577954847278,511.907210538479,
        443.562731447193,391.730061349693,495.384824667473,430.430866037423,
        157.39336711193,621.531297709924,415.420508401551,440.780570409982,
        414.551266085513,446.503836734694,255.059685999741,355.922701246211,
        308.996825396825,200.726012503398,297.958043579045,166.873177083333,
        184.450355103746,558.391405073555,182.63632183908,320.197666318356,
        151.874083846379,314.008287813147,125.941419000172,151.284729448491,
        605.400970873786,143.730810479547,240.779288537549,139.011736015851,
        498.179183673469,498.899700037495,923.604765506808,1302.60915123996,
        471.794167269222,239.522509225092,534.769484464503,566.458609271523,
        337.121275121275,343.216533124878,250.47206095791,585.740563784042,
        873.775097783572,758.63260265514,561.869607843137,817.746869756034,
        461.11271165024,406.232050773503,897.39966367713,756.734451942367,
        605.242334066503,637.310763256886,721.862398822664,898.142725315288,
        670.916794425087,922.623940368313,1088.8436714166,969.805583375062,
        986.695448585877,645.589644637402,981.861218195836,541.388875932836,
        1309.12344123945,925.446478133674,629.419699499165,1589.24284959626,
        814.736442884637,904.710338680927,947.911413969336,1481.51339495535,
        1007.30852694893,563.355241171884))

jlh*_*ard 5

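This is not possible with stat_function(...) - see this link, especially Hadley Wickham's comments.

You have to do it the hard way, that is, compute the function values outside ggplot. Fortunately, that is not very difficult.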

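library(MASS)
library(ggplot2)
# Fit a lognormal to int within each Exp/t combination
df <- aggregate(int~Exp+t,small_data,
                function(z)with(fitdistr(z,"lognormal"),c(estimate[1],estimate[2])))
# Flatten the matrix column returned by aggregate into meanlog and sdlog columns
df <- data.frame(df[,1:2],df[,3])
# 100 x values spanning the range of int, repeated for every Exp/t combination
x  <- with(small_data,seq(min(int),max(int),length.out=100))
gg <- data.frame(x=rep(x,each=nrow(df)),df)
# Density of each group's fitted lognormal at each x
gg$y <- with(gg,dlnorm(x,meanlog,sdlog))
# Note: newer ggplot2 versions prefer after_stat(density) over ..density..
ggplot(small_data,(aes(x=int)))+
  geom_histogram(aes(x=int,y = ..density..),binwidth =150,
                 color="grey50",fill="lightgreen")+
  geom_line(data=gg, aes(x,y,color=t))+
  facet_grid(Exp~t)+
  scale_colour_gradient2(low='red',mid='blue',high='green',midpoint=5)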

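So, this code creates a data frame df containing meanlog and sdlog for every combination of Exp and t. We then create an "auxiliary data frame", gg, which holds a set of x values covering the range of int in 100 steps, replicated for every combination of Exp and t, and add a column of y values computed with dlnorm(x,meanlog,sdlog). Finally, we add a geom_line layer to the plot, using gg as its dataset.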
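The pairing of each x value with the parameters of its own group can also be made explicit, rather than relying on data.frame recycling. The following is only a minimal sketch on made-up toy data with a single grouping column; the names toy, fits, params, x_grid and curves are illustrative and not part of the original code.

```r
library(MASS)

# Toy data, made up for illustration: two groups on clearly different scales.
toy <- data.frame(g   = rep(c("a", "b"), each = 4),
                  int = c(100, 120, 150, 180, 900, 1100, 1400, 1700))

# Fit one lognormal per group and collect one row of parameters per group.
fits <- lapply(split(toy$int, toy$g), fitdistr, densfun = "lognormal")
params <- data.frame(g       = names(fits),
                     meanlog = sapply(fits, function(f) f$estimate[["meanlog"]]),
                     sdlog   = sapply(fits, function(f) f$estimate[["sdlog"]]))

# Cartesian product: every group is crossed with every x value
# (merge with by = NULL returns all row combinations).
x_grid <- seq(min(toy$int), max(toy$int), length.out = 100)
curves <- merge(params, data.frame(x = x_grid), by = NULL)

# Evaluate each row's density with the parameters of its own group.
curves$y <- dlnorm(curves$x, curves$meanlog, curves$sdlog)
```

A data frame like curves can then be passed to geom_line(data = curves, aes(x, y)) together with a facet on g, in the same way gg is used above.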

Note that fitdistr(...) does not always converge, so you should check for NAs in df.
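One way to make that check routine is to wrap the fit so that a failing or too-small group yields NA parameters, and then drop those rows before building the curves. This is a sketch under assumptions: safe_lnorm_fit and its min_n argument are hypothetical helpers introduced here, and the toy data is made up.

```r
library(MASS)

# Hypothetical helper (not part of the original answer): return NA for both
# parameters when a group is too small, has non-positive values, or fitdistr errors.
safe_lnorm_fit <- function(z, min_n = 2) {
  na_result <- c(meanlog = NA_real_, sdlog = NA_real_)
  if (length(z) < min_n || any(z <= 0)) return(na_result)
  tryCatch(fitdistr(z, "lognormal")$estimate,
           error = function(e) na_result)
}

# Toy data, made up for illustration: group "b" has a single observation.
toy <- data.frame(g   = c("a", "a", "a", "a", "b"),
                  int = c(120, 150, 310, 480, 200))

# Same aggregate pattern as above, but with the guarded fit.
fits <- aggregate(int ~ g, toy, safe_lnorm_fit)
fits <- data.frame(g = fits$g, fits$int)

# Keep only groups whose parameters were actually estimated.
fits_ok <- fits[complete.cases(fits), ]
```

Facets whose group was dropped would then show only the histogram, with no fitted curve drawn on top.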