den*_*nis 8 r ode differential-equations nlme non-linear-regression
I am facing a problem I do not manage to solve. I would like to use nlme or nlmODE to perform a non linear regression with random effect using as a model the solution of a second order differential equation with fixed coefficients (a damped oscillator).
I manage to use nlme with simple models, but it seems that the use of deSolve to generate the solution of the differential equation causes a problem. Below an example, and the problems I face.
Here is the function to generate the solution of the differential equation using deSolve:
library(deSolve)
ODE2_nls <- function(t, y, parms) {
S1 <- y[1]
dS1 <- y[2]
dS2 <- dS1
dS1 <- - parms["esp2omega"]*dS1 - parms["omega2"]*S1 + parms["omega2"]*parms["yeq"]
res <- c(dS2,dS1)
list(res)}
solution_analy_ODE2 = function(omega2,esp2omega,time,y0,v0,yeq){
parms <- c(esp2omega = esp2omega,
omega2 = omega2,
yeq = yeq)
xstart = c(S1 = y0, dS1 = v0)
out <- lsoda(xstart, time, ODE2_nls, parms)
return(out[,2])
}
Run Code Online (Sandbox Code Playgroud)
I can generate a solution for a given period and damping factor, as for example here a period of 20 and a slight damping of 0.2:
library(deSolve)
ODE2_nls <- function(t, y, parms) {
S1 <- y[1]
dS1 <- y[2]
dS2 <- dS1
dS1 <- - parms["esp2omega"]*dS1 - parms["omega2"]*S1 + parms["omega2"]*parms["yeq"]
res <- c(dS2,dS1)
list(res)}
solution_analy_ODE2 = function(omega2,esp2omega,time,y0,v0,yeq){
parms <- c(esp2omega = esp2omega,
omega2 = omega2,
yeq = yeq)
xstart = c(S1 = y0, dS1 = v0)
out <- lsoda(xstart, time, ODE2_nls, parms)
return(out[,2])
}
Run Code Online (Sandbox Code Playgroud)
Now I generate a panel of 10 individuals with a random starting phase (i.e. different starting position and velocity). The goal is to perform a non linear regression with random effect on the starting values
library(data.table)
# generate panel
Npoint <- 100 # number of time poitns
Nindiv <- 10 # number of individuals
period <- 20 # period of oscillation
amort_factor <- 0.2
omega <- 2*pi/period # agular frequency
# random phase
phase <- sample(seq(0,2*pi,0.01),Nindiv)
# simu data:
data_simu <- data.table(time = rep(1:Npoint,Nindiv), ID = rep(1:Nindiv,each = Npoint))
# signal generation
data_simu[,signal := solution_analy_ODE2(omega2 = omega^2,
esp2omega = 2*0.2*omega,
time = time,
y0 = sin(phase[.GRP]),
v0 = omega*cos(phase[.GRP]),
yeq = 0)+
rnorm(.N,0,0.02),by = ID]
Run Code Online (Sandbox Code Playgroud)
If we have a look, we have a proper dataset:
library(ggplot2)
ggplot(data_simu,aes(time,signal,color = ID))+
geom_line()+
facet_wrap(~ID)
Run Code Online (Sandbox Code Playgroud)
Using nlme with similar syntax working on simpler examples (non linear functions not using deSolve), I tried:
fit <- nlme(model = signal ~ solution_analy_ODE2(esp2omega,omega2,time,y0,v0,yeq),
data = data_simu,
fixed = esp2omega + omega2 + y0 + v0 + yeq ~ 1,
random = y0 ~ 1 ,
groups = ~ ID,
start = c(esp2omega = 0.08,
omega2 = 0.04,
yeq = 0,
y0 = 1,
v0 = 0))
Run Code Online (Sandbox Code Playgroud)
I obtain:
Error in checkFunc(Func2, times, y, rho) : The number of derivatives returned by func() (2) must equal the length of the initial conditions vector (2000)
The traceback:
12. stop(paste("The number of derivatives returned by func() (", length(tmp[[1]]), ") must equal the length of the initial conditions vector (", length(y), ")", sep = ""))
11. checkFunc(Func2, times, y, rho)
10. lsoda(xstart, time, ODE2_nls, parms)
9. solution_analy_ODE2(omega2, esp2omega, time, y0, v0, yeq)
.
.
Run Code Online (Sandbox Code Playgroud)
I looks like nlme is trying to pass a vector of starting condition to solution_analy_ODE2, and causes an error in checkFunc from lasoda.
I tried using nlsList:
test <- nlsList(model = signal ~ solution_analy_ODE2(omega2,esp2omega,time,y0,v0,yeq) | ID,
data = data_simu,
start = list(esp2omega = 0.08, omega2 = 0.04,yeq = 0,
y0 = 1,v0 = 0),
control = list(maxiter=150, warnOnly=T,minFactor = 1e-10),
na.action = na.fail, pool = TRUE)
head(test)
Run Code Online (Sandbox Code Playgroud)
Call:
Model: signal ~ solution_analy_ODE2(omega2, esp2omega, time, y0, v0, yeq) | ID
Data: data_simu
Coefficients:
esp2omega omega2 yeq y0 v0
1 0.1190764 0.09696076 0.0007577956 -0.1049423 0.30234654
2 0.1238936 0.09827158 -0.0003463023 0.9837386 0.04773775
3 0.1280399 0.09853310 -0.0004908579 0.6051663 0.25216134
4 0.1254053 0.09917855 0.0001922963 -0.5484005 -0.25972829
5 0.1249473 0.09884761 0.0017730823 0.7041049 0.22066652
6 0.1275408 0.09966155 -0.0017522320 0.8349450 0.17596648
Run Code Online (Sandbox Code Playgroud)
We can see that te non linear fit works well on individual signals. Now if I want to perform a regression of the dataset with random effects, the syntax should be:
fit <- nlme(test,
random = y0 ~ 1 ,
groups = ~ ID,
start = c(esp2omega = 0.08,
omega2 = 0.04,
yeq = 0,
y0 = 1,
v0 = 0))
Run Code Online (Sandbox Code Playgroud)
But I obtain the exact same error message.
I then tried using nlmODE, following Bne Bolker's comment on a similar question I asked some years ago
library(nlmeODE)
datas_grouped <- groupedData( signal ~ time | ID, data = data_simu,
labels = list (x = "time", y = "signal"),
units = list(x ="arbitrary", y = "arbitrary"))
modelODE <- list( DiffEq = list(dS2dt = ~ S1,
dS1dt = ~ -esp2omega*S1 - omega2*S2 + omega2*yeq),
ObsEq = list(yc = ~ S2),
States = c("S1","S2"),
Parms = c("esp2omega","omega2","yeq","ID"),
Init = c(y0 = 0,v0 = 0))
resnlmeode = nlmeODE(modelODE, datas_grouped)
assign("resnlmeode", resnlmeode, envir = .GlobalEnv)
#Fitting with nlme the resulting function
model <- nlme(signal ~ resnlmeode(esp2omega,omega2,yeq,time,ID),
data = datas_grouped,
fixed = esp2omega + omega2 + yeq + y0 + v0 ~ 1,
random = y0 + v0 ~1,
start = c(esp2omega = 0.08,
omega2 = 0.04,
yeq = 0,
y0 = 0,
v0 = 0)) #
Run Code Online (Sandbox Code Playgroud)
I get the error:
Error in resnlmeode(esp2omega, omega2, yeq, time, ID) : object 'yhat' not found
Here I don't understand where the error comes from, nor how to solve it.
nlme or nlmODE ?nlmixr (https://cran.r-project.org/web/packages/nlmixr/index.html), but I don't know it, the instalation is complicated and it was recently remove from CRAN@tpetzoldt suggested a nice way to debug nlme behavior, and it surprised me a lot. Here is a working example with a non linear function, where I generate a set of 5 individual with a random parameter varying between individuals :
reg_fun = function(time,b,A,y0){
cat("time : ",length(time)," b :",length(b)," A : ",length(A)," y0: ",length(y0),"\n")
out <- A*exp(-b*time)+(y0-1)
cat("out : ",length(out),"\n")
tmp <- cbind(b,A,y0,time,out)
cat(apply(tmp,1,function(x) paste(paste(x,collapse = " "),"\n")),"\n")
return(out)
}
time <- 0:10*10
ramdom_y0 <- sample(seq(0,1,0.01),10)
Nid <- 5
data_simu <-
data.table(time = rep(time,Nid),
ID = rep(LETTERS[1:Nid],each = length(time)) )[,signal := reg_fun(time,0.02,2,ramdom_y0[.GRP]) + rnorm(.N,0,0.1),by = ID]
Run Code Online (Sandbox Code Playgroud)
The cats in the function give here:
time : 11 b : 1 A : 1 y0: 1
out : 11
0.02 2 0.64 0 1.64
0.02 2 0.64 10 1.27746150615596
0.02 2 0.64 20 0.980640092071279
0.02 2 0.64 30 0.737623272188053
0.02 2 0.64 40 0.538657928234443
0.02 2 0.64 50 0.375758882342885
0.02 2 0.64 60 0.242388423824404
0.02 2 0.64 70 0.133193927883213
0.02 2 0.64 80 0.0437930359893108
0.02 2 0.64 90 -0.0294022235568269
0.02 2 0.64 100 -0.0893294335267746
.
.
.
Run Code Online (Sandbox Code Playgroud)
现在我做nlme:
nlme(model = signal ~ reg_fun(time,b,A,y0),
data = data_simu,
fixed = b + A + y0 ~ 1,
random = y0 ~ 1 ,
groups = ~ ID,
start = c(b = 0.03, A = 1,y0 = 0))
Run Code Online (Sandbox Code Playgroud)
我得到:
time : 55 b : 55 A : 55 y0: 55
out : 55
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
time : 55 b : 55 A : 55 y0: 55
out : 55
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
0.03 1 0 0 0
0.03 1 0 10 -0.259181779318282
0.03 1 0 20 -0.451188363905974
0.03 1 0 30 -0.593430340259401
0.03 1 0 40 -0.698805788087798
0.03 1 0 50 -0.77686983985157
0.03 1 0 60 -0.834701111778413
0.03 1 0 70 -0.877543571747018
0.03 1 0 80 -0.909282046710588
0.03 1 0 90 -0.93279448726025
0.03 1 0 100 -0.950212931632136
...
Run Code Online (Sandbox Code Playgroud)
所以nlme绑定 5 次(个体的数量)时间向量并将其传递给函数,参数重复相同的次数。这当然与方式lsoda和我的功能不兼容。
It seems that the ode model is called with a wrong argument, so that it gets a vector with 2000 state variables instead of 2. Try the following to see the problem:
ODE2_nls <- function(t, y, parms) {
cat(length(y),"\n") # <----
S1 <- y[1]
dS1 <- y[2]
dS2 <- dS1
dS1 <- - parms["esp2omega"]*dS1 - parms["omega2"]*S1 + parms["omega2"]*parms["yeq"]
res <- c(dS2,dS1)
list(res)
}
Run Code Online (Sandbox Code Playgroud)
Edit: I think that the analytical function worked, because it is vectorized, so you may try to vectorize the ode function, either by iterating over the ode model or (better) internally using vectors as state variables. As ode is fast in solving systems with several 100k equations, 2000 should be feasible.
I guess that both, states and parameters from nlme are passed as vectors. The state variable of the ode model is then a "long" vector, the parameters can be implemented as a list.
Here an example (edited, now with parameters as list):
ODE2_nls <- function(t, y, parms) {
#cat(length(y),"\n")
#cat(length(parms$omega2))
ndx <- seq(1, 2*N-1, 2)
S1 <- y[ndx]
dS1 <- y[ndx + 1]
dS2 <- dS1
dS1 <- - parms$esp2omega * dS1 - parms$omega2 * S1 + parms$omega2 * parms$yeq
res <- c(dS2, dS1)
list(res)
}
solution_analy_ODE2 = function(omega2, esp2omega, time, y0, v0, yeq){
parms <- list(esp2omega = esp2omega, omega2 = omega2, yeq = yeq)
xstart = c(S1 = y0, dS1 = v0)
out <- ode(xstart, time, ODE2_nls, parms, atol=1e-4, rtol=1e-4, method="ode45")
return(out[,2])
}
Run Code Online (Sandbox Code Playgroud)
Then set (or calculate) the number of equations, e.g. N <- 1 resp. N <-1000 before the calls.
The model runs through this way, before running in numerical issues, but that's another story ...
You may then try to use another ode solver (e.g. vode), set atoland rtol to lower values, tweak nmle's optimization parameters, use box constraints ... and so on, as usual in nonlinear optimization.