R - Levenberg-Marquardt nonlinear least-squares fitting of Heligman-Pollard model parameters

Art*_*ose 3 r nonlinear-functions nonlinear-optimization model-fitting levenberg-marquardt

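I am trying to reproduce the solution from Kostakis's paper. In that paper, an abridged mortality table is expanded into a complete life table using the Heligman-Pollard model. The model has 8 parameters that must be fitted. The author used a modified Gauss-Newton algorithm; this algorithm (E04FDF) is part of the NAG library of computer programs. Shouldn't Levenberg-Marquardt yield the same set of parameters? What is wrong with my code, or with my application of the LM algorithm?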

library(minpack.lm)


## Heligman-Pollard is used to expand an abridged table.
## A nonlinear least-squares algorithm is used to fit the parameters on nqx observed over 5-year intervals (5qx)
AGE <- c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70)
MORTALITY <- c(0.010384069, 0.001469140, 0.001309318, 0.003814265,
               0.005378395, 0.005985625, 0.006741766, 0.009325056,
               0.014149626, 0.021601755, 0.034271934, 0.053836246,
               0.085287751, 0.136549522, 0.215953304)

## The start parameters for de Heligman-Pollard Formula (Converged set a=0.0005893,b=0.0043836,c=0.0828424,d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)
## I perturbed one parameter ("a") to obtain the start values. The converged set is listed above.
parStart <- list(a=0.0008893,b=0.0043836,c=0.0828424,d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)

## The Heligman-Pollard Formula (HP8) = qx/px = ...8 parameter equation
HP8 <-function(parS,x)
ifelse(x==0, parS$a^((x+parS$b)^parS$c) + parS$g*parS$h^x, 
             parS$a^((x+parS$b)^parS$c) + parS$d*exp(-parS$e*(log(x/parS$f))^2) +
                 parS$g*parS$h^x)

## Define qx = HP8/(1+HP8)
qxPred <- function(parS,x) HP8(parS,x)/(1+HP8(parS,x))

## Calculate nqx predicted by HP8 model (nqxPred(parStart,x))
nqxPred <- function(parS,x)
(1 -(1-qxPred(parS,x)) * (1-qxPred(parS,x+1)) *
    (1-qxPred(parS,x+2)) * (1-qxPred(parS,x+3)) *
    (1-qxPred(parS,x+4))) 

##Define Residual Function, the relative squared distance is minimized  
ResidFun <- function(parS, Observed,x) (nqxPred(parS,x)/Observed-1)^2

## Applying the nls.lm algo. 
nls.out <- nls.lm(par=parStart, fn = ResidFun, Observed = MORTALITY, x = AGE,
                  control = nls.lm.control(nprint=1,
                                           ftol = .Machine$double.eps,
                                           ptol = .Machine$double.eps,
                                           maxfev=10000, maxiter = 500))

summary(nls.out)


## The author used a modified Gauss-Newton algorithm; this algorithm (E04FDF) is part of the NAG library of computer programs
## Shouldn't Levenberg-Marquardt yield the same set of parameters?

Ben*_*ker 12

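The bottom line here is that @Roland is absolutely right: this is a very ill-posed problem, and you shouldn't necessarily expect to get reliable answers. Below I have

  • cleaned up the code in a few small ways (this is purely aesthetic)
  • changed ResidFun to return residuals rather than squared residuals. (The former is correct, but this doesn't make very much difference.)
  • explored the results from several different optimizers. It actually looks like the answer you are getting is better than the "converged parameters" you list above, which I assume are the parameters from the original study (can you please provide a reference?).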

Load the package:

library(minpack.lm)

Data, as a data frame:

d <- data.frame(
   AGE = seq(0,70,by=5),
   MORTALITY=c(0.010384069, 0.001469140, 0.001309318, 0.003814265,
               0.005378395, 0.005985625, 0.006741766, 0.009325056,
               0.014149626, 0.021601755, 0.034271934, 0.053836246,
               0.085287751, 0.136549522, 0.215953304))

A first look at the data:

library(ggplot2)
(g1 <- ggplot(d,aes(AGE,MORTALITY))+geom_point())
g1+geom_smooth()  ## with loess fit

Parameter choices:

Presumably these are the parameters from the original paper...

parConv <- c(a=0.0005893,b=0.0043836,c=0.0828424,
             d=0.000706,e=9.927863,f=22.197312,g=0.00004948,h=1.10003)

Perturbed parameters:

parStart <- parConv
parStart["a"] <- parStart["a"]+3e-4

The formulae:

HP8 <-function(parS,x)
    with(as.list(parS),
         ifelse(x==0, a^((x+b)^c) + g*h^x, 
                a^((x+b)^c) + d*exp(-e*(log(x/f))^2) + g*h^x))
## Define qx = HP8/(1+HP8)
qxPred <- function(parS,x) {
    h <- HP8(parS,x)
    h/(1+h)
}
## Calculate nqx predicted by HP8 model (nqxPred(parStart,x))
nqxPred <- function(parS,x)
    (1 -(1-qxPred(parS,x)) * (1-qxPred(parS,x+1)) *
     (1-qxPred(parS,x+2)) * (1-qxPred(parS,x+3)) *
     (1-qxPred(parS,x+4))) 
## Define residual function: relative residuals (the optimizer squares and sums them)
ResidFun <- function(parS, Observed,x) (nqxPred(parS,x)/Observed-1)

This is slightly changed from the OP's version; nls.lm wants residuals, not squared residuals.
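To illustrate why the distinction matters, here is a minimal base-R sketch on made-up numbers (not the mortality data). A least-squares routine squares and sums whatever the residual function returns, so handing it squared residuals makes it minimise a sum of fourth powers instead. The names `y`, `ss2`, and `ss4` are hypothetical, and `optimize()` is only a stand-in for the real optimizer.

```r
## Toy data (hypothetical, not from the question): one large value
y <- c(1, 2, 3, 10)

## Objective when the residual function returns residuals:
## the optimizer minimises sum(r^2)
ss2 <- function(m) sum((y - m)^2)

## Objective when the residual function returns squared residuals:
## squaring again means the optimizer minimises sum(r^4)
ss4 <- function(m) sum(((y - m)^2)^2)

m2 <- optimize(ss2, interval = range(y), tol = 1e-8)$minimum
m4 <- optimize(ss4, interval = range(y), tol = 1e-8)$minimum

c(residuals = m2, squared_residuals = m4)
```

The second estimate is pulled towards the large value, so the two objectives generally have different minimisers, even if the effect is small for the problem at hand.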

Sum-of-squares function for use with other optimizers:

ssqfun <- function(parS, Observed, x) {
   sum(ResidFun(parS, Observed, x)^2)
}

Apply nls.lm. (I'm not sure why ftol and ptol were lowered from sqrt(.Machine$double.eps) to .Machine$double.eps - the former is generally a practical limit to precision...)

nls.out <- nls.lm(par=parStart, fn = ResidFun,
                  Observed = d$MORTALITY, x = d$AGE,
                  control = nls.lm.control(nprint=0,
                                           ftol = .Machine$double.eps,
                                           ptol = .Machine$double.eps,
                                           maxfev=10000, maxiter = 1000))

parNLS <- coef(nls.out)

pred0 <- nqxPred(as.list(parConv),d$AGE)
pred1 <- nqxPred(as.list(parNLS),d$AGE)

dPred <- with(d,rbind(data.frame(AGE,MORTALITY=pred0,w="conv"),
               data.frame(AGE,MORTALITY=pred1,w="nls")))

g1 + geom_line(data=dPred,aes(colour=w))

The lines are indistinguishable, but the parameters have some big differences:

round(cbind(parNLS,parConv),5)
##     parNLS  parConv
## a  1.00000  0.00059
## b 50.46708  0.00438
## c  3.56799  0.08284
## d  0.00072  0.00071
## e  6.05200  9.92786
## f 21.82347 22.19731
## g  0.00005  0.00005
## h  1.10026  1.10003

d, f, g, h are close, but a, b, c differ by orders of magnitude, and e differs by about 50%.

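Looking at the original equations, what is happening here is that a^((x+b)^c) is being set to a constant, because a is approaching 1: once a is approximately 1, b and c are essentially irrelevant.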

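Let's check the correlations (we need a generalized inverse because the parameters are so strongly correlated that the Hessian is close to singular):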

obj <- nls.out
vcov  <- with(obj,deviance/(length(fvec) - length(par)) * 
              MASS::ginv(hessian))

cmat <- round(cov2cor(vcov),1)
dimnames(cmat) <- list(letters[1:8],letters[1:8])
cmat

##      a    b    c    d    e    f    g    h
## a  1.0  0.0  0.0  0.0  0.0  0.0 -0.1  0.0
## b  0.0  1.0 -1.0  1.0 -1.0 -1.0 -0.4 -1.0
## c  0.0 -1.0  1.0 -1.0  1.0  1.0  0.4  1.0
## d  0.0  1.0 -1.0  1.0 -1.0 -1.0 -0.4 -1.0
## e  0.0 -1.0  1.0 -1.0  1.0  1.0  0.4  1.0
## f  0.0 -1.0  1.0 -1.0  1.0  1.0  0.4  1.0
## g -0.1 -0.4  0.4 -0.4  0.4  0.4  1.0  0.4
## h  0.0 -1.0  1.0 -1.0  1.0  1.0  0.4  1.0

This is not actually all that useful - it really just confirms that many of the parameters are strongly correlated...
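As a minimal sketch of why a generalized inverse is needed, the toy example below builds a Hessian from a hypothetical Jacobian in which two parameters are perfectly confounded. The names `J`, `pinv`, and `Hplus` are made up for illustration, and the SVD-based `pinv` is a base-R stand-in for the same idea as MASS::ginv. The scalar variance factor used above is omitted because it cancels in cov2cor.

```r
## Hypothetical Jacobian: columns 1 and 2 are identical, so the two
## corresponding parameters cannot be told apart by the data
x <- seq(0.1, 1, length.out = 10)
J <- cbind(p1 = x, p2 = x, p3 = x^2)

## Gauss-Newton approximation to the Hessian; rank 2 instead of 3
H <- crossprod(J)
qr(H)$rank

## Moore-Penrose pseudo-inverse via SVD, dropping near-zero singular values
pinv <- function(A, tol = sqrt(.Machine$double.eps)) {
    s <- svd(A)
    keep <- s$d > tol * s$d[1]
    s$v[, keep, drop = FALSE] %*%
        (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

Hplus <- pinv(H)
cmat <- cov2cor(Hplus)
dimnames(cmat) <- list(colnames(J), colnames(J))
round(cmat, 2)
```

The confounded pair shows up with a correlation of essentially 1, which is the same pattern as the blocks of ±1.0 in the matrix above.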

library(optimx)
mvec <- c('Nelder-Mead','BFGS','CG','L-BFGS-B',
          'nlm','nlminb','spg','ucminf')
opt1 <- optimx(par=parStart, fn = ssqfun,
         Observed = d$MORTALITY, x = d$AGE,
               itnmax=5000,
               method=mvec,control=list(kkt=TRUE))
               ## control=list(all.methods=TRUE,kkt=TRUE)) ## Boom!

##         fvalues      method fns  grs itns conv KKT1 KKT2 xtimes
## 2 8.988466e+307        BFGS  NA NULL NULL 9999   NA   NA      0
## 3 8.988466e+307          CG  NA NULL NULL 9999   NA   NA      0
## 4 8.988466e+307    L-BFGS-B  NA NULL NULL 9999   NA   NA      0
## 5 8.988466e+307         nlm  NA   NA   NA 9999   NA   NA      0
## 7     0.3400858         spg   1   NA    1    3   NA   NA  0.064
## 8     0.3400858      ucminf   1    1 NULL    0   NA   NA  0.032
## 1    0.06099295 Nelder-Mead 501   NA NULL    1   NA   NA  0.252
## 6   0.009275733      nlminb 200 1204  145    1   NA   NA  0.708

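This warns about bad scaling, and also finds a variety of different answers: only ucminf claims to have converged, but nlminb gets a better answer - and the itnmax parameter seems to be ignored...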

opt2 <- nlminb(start=parStart, objective = ssqfun,
         Observed = d$MORTALITY, x = d$AGE,                   
               control= list(eval.max=5000,iter.max=5000))

parNLM <- opt2$par

This finishes, but with a false-convergence warning...
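For reference, a short sketch of how the convergence status of an nlminb fit can be inspected. The objective `quadfun` is a hypothetical, well-behaved function rather than the ssqfun above, so this fit is expected to converge cleanly; for a fit like opt2 the same fields would be the place to look for the false-convergence message.

```r
## Hypothetical, well-behaved objective with its minimum at (1, 2)
quadfun <- function(p) sum((p - c(1, 2))^2)

fit <- nlminb(start = c(0, 0), objective = quadfun,
              control = list(eval.max = 5000, iter.max = 5000))

fit$convergence  ## 0 indicates successful convergence
fit$message      ## text reported by the underlying optimization routine
fit$par
```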

round(cbind(parNLS,parConv,parNLM),5)

##     parNLS  parConv   parNLM
## a  1.00000  0.00059  1.00000
## b 50.46708  0.00438 55.37270
## c  3.56799  0.08284  3.89162
## d  0.00072  0.00071  0.00072
## e  6.05200  9.92786  6.04416
## f 21.82347 22.19731 21.82292
## g  0.00005  0.00005  0.00005
## h  1.10026  1.10003  1.10026

sapply(list(parNLS,parConv,parNLM),
       ssqfun,Observed=d$MORTALITY,x=d$AGE)
## [1] 0.006346250 0.049972367 0.006315034

It looks like nlminb and minpack.lm are getting similar answers, and are actually doing better than the originally stated parameters (by quite a bit):

pred2 <- nqxPred(as.list(parNLM),d$AGE)

dPred <- with(d,rbind(dPred,
               data.frame(AGE,MORTALITY=pred2,w="nlminb")))

g1 + geom_line(data=dPred,aes(colour=w))
ggsave("cmpplot.png")

[Figure cmpplot.png: observed mortality points with the fitted conv, nls, and nlminb curves]

ggplot(data=dPred,aes(x=AGE,y=MORTALITY-d$MORTALITY,colour=w))+
   geom_line()+geom_point(aes(shape=w),alpha=0.3)
ggsave("residplot.png")

[Figure residplot.png: fitted minus observed mortality by age for conv, nls, and nlminb]

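Other things one could try:

  • appropriate scaling - although a quick test of this didn't seem to help that much
  • providing analytical gradients
  • using AD Model Builder
  • using the slice function from bbmle to explore whether the old and new parameters represent distinct minima, or whether the old parameters are just a false convergence...
  • getting the KKT (Karush-Kuhn-Tucker) criterion calculators from optimx or related packages to work, for similar checks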
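As a sketch of what "appropriate scaling" can look like, the example below uses the `parscale` control option of base R's optim() on a hypothetical, badly scaled objective (`badfun` is invented for illustration and is not the Heligman-Pollard model). Whether this helps for the real problem is a separate question; as noted in the list, a quick test suggested it did not help much.

```r
## Hypothetical badly scaled objective: the two parameters differ by
## about seven orders of magnitude; the minimum is at (3e-4, 2000)
badfun <- function(p) (1e4 * p[1] - 3)^2 + (1e-3 * p[2] - 2)^2

start <- c(1e-4, 1e3)

## parscale gives optim() the typical magnitude of each parameter, so the
## search is carried out on par/parscale
fit <- optim(par = start, fn = badfun, method = "BFGS",
             control = list(parscale = c(1e-4, 1e3)))
fit$par
fit$convergence
```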

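PS: the largest deviations (by far) are for the oldest age classes, which probably also have small samples. From a statistical point of view it would probably be worth doing a fit weighted by the precision of the individual points...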
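One way such a weighting could be wired into a residual-based fitter, sketched on made-up numbers: multiply each residual by the square root of its weight. The residuals `r` and weights `w` below are hypothetical; real weights would have to come from the precision of each age class, which is not available here.

```r
## Hypothetical relative residuals and weights (not from the data above)
r <- c(0.02, -0.01, 0.05, -0.20)
w <- c(1.0, 1.0, 0.5, 0.1)

## A residual-based fitter minimises sum(res^2), so scaling each residual
## by sqrt(w) makes it minimise the weighted sum of squares sum(w * r^2)
r_weighted <- sqrt(w) * r

c(unweighted = sum(r^2), weighted = sum(r_weighted^2))
```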