多级建模的协方差结构

Ann*_*rie 3 r covariance multi-level nlme

我有一个大约300名患者的多级重复测量数据集,每个患者有多达10个预测肌钙蛋白升高的重复测量.数据集中还有其他变量,但我没有在此处包含它们.我试图用来nlme创建随机斜率,随机截距模型,其中患者之间的效果不同,并且不同患者的时间效果不同.当我尝试引入一阶协方差结构以允许由于时间的测量相关时,我得到以下错误消息.

Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible
Run Code Online (Sandbox Code Playgroud)

我已经包含了我的代码和数据集的样本,我将非常感谢任何智慧的话语.

#baseline model includes only the intercept. Random slopes - intercept varies across patients
randomintercept <- lme(troponin ~ 1,
                       data = df, random = ~1|record_id, method = "ML", 
                       na.action = na.exclude, 
                       control = list(opt="optim"))

#random intercept and time as fixed effect
timeri <- update(randomintercept,.~. + day)
#random slopes and intercept: effect of time is different in different people
timers <- update(timeri, random = ~ day|record_id)
#model covariance structure. corAR1() first order autoregressive covariance structure, timepoints equally spaced
armodel <- update(timers, correlation = corAR1(0, form = ~day|record_id))
Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible
Run Code Online (Sandbox Code Playgroud)

数据:

record_id   day troponin
1   1   32  
2   0     NA  
2   1   NA  
2   2   NA  
2   3   8  
2   4   6  
2   5   7  
2   6   7  
2   7   7  
2   8   NA  
2   9   9  
3   0   14  
3   1   1167  
3   2   1935  
4   0   19  
4   1   16  
4   2   29  
5   0   NA  
5   1   17  
5   2   47  
5   3   684  
6   0   46  
6   1   45440  
6   2   47085  
7   0   48  
7   1   87  
7   2   44  
7   3   20  
7   4   15  
7   5   11  
7   6   10  
7   7   11  
7   8   197  
8   0   28  
8   1   31  
9   0   NA  
9   1   204  
10  0   NA  
10  1   19  
Run Code Online (Sandbox Code Playgroud)

Ben*_*ker 5

可以,如果你改变你的优化器"nlminb"符合这个(或至少它的工作原理与缩减数据集合你贴).

armodel <- update(timers, 
              correlation = corAR1(0, form = ~day|record_id),
              control=list(opt="nlminb"))
Run Code Online (Sandbox Code Playgroud)

但是,如果你看一下拟合模型,你会发现你有问题 - 估计的AR1参数是-1,随机截距和斜率项与r = 0.998相关.

我认为问题在于数据的性质.大多数数据似乎在10-50范围内,但是有一个或两个数量级的偏移(例如个体6,高达约45000).可能很难将模型拟合到这个尖锐的数据.我强烈建议对数据进行日志转换; 标准诊断图(plot(randomintercept))看起来像这样:

拟合与残差

而适合对数尺度

rlog <- update(randomintercept,log10(troponin) ~ .)
plot(rlog)
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

虽然仍有一些异方差性的证据,但有些更合理.

AR +随机斜率模型适合:

ar.rlog <- update(rlog,
              random = ~day|record_id,
              correlation = corAR1(0, form = ~day|record_id))
## Linear mixed-effects model fit by maximum likelihood
## ...
## Random effects:
##  Formula: ~day | record_id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 0.1772409 (Intr)
## day         0.6045765 0.992 
## Residual    0.4771523       
##
##  Correlation Structure: ARMA(1,0)
##  Formula: ~day | record_id 
##  Parameter estimate(s):
##       Phi1 
## 0.09181557 
## ...
Run Code Online (Sandbox Code Playgroud)

快速浏览一下就可以intervals(ar.rlog)看出自回归参数的置信区间是(-0.52,0.65),所以可能不值得保持......

随着模型中的随机斜率,异方差性似乎不再成问题......

plot(rlog,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述