Reshaping data for use with geeglm()

use*_*028 5 r

Can you help me figure out why I am getting an error?

Originally my data looked like this:

> attributes(compl)$names
 [1] "UserID"         "compl_bin"      "Sex.x"          "PHQ_base"       "PHQ_Surv1"      "PHQ_Surv2"      "PHQ_Surv3"    
 [8] "PHQ_Surv4"      "EFE"            "Neuro"          "Intervention.x" "depr0"          "error1_1.x"     "error1_2.x"   
[15] "error1_3.x"     "error1_4.x"     "stress0"        "stress1"        "stress2"        "stress3"        "stress4"      
[22] "hours1"         "hours2"         "hours3"         "hours4"         "subject"       

First, I reshape the data to long format to prepare it for geeglm:

compl$subject <- factor(rownames(compl))
nobs <- nrow(compl) 
compl_long <- reshape(compl, idvar = "subject",
                      varying = list(c("PHQ_Surv1", "PHQ_Surv2" ,
                                       "PHQ_Surv3", "PHQ_Surv4"), 
                                     c("error1_1.x", "error1_2.x",
                                       "error1_3.x", "error1_4.x"), 
                                     c("stress1", "stress2", "stress3",
                                       "stress4"), 
                                     c("hours1", "hours2", "hours3",
                                       "hours4")), 
                      v.names = c("PHQ", "error", "stress", "hours"),
                      times = c("1", "2", "3", "4"), direction = "long")

(Editor's note: it is not clear what produced the next output...)

 [1] "UserID"         "compl_bin"      "Sex.x"          "PHQ_base"       "EFE"            "Neuro"          "Intervention.x"
 [8] "depr0"          "stress0"        "subject"        "time"           "PHQ"            "error"          "stress"       
[15] "hours" 

Then I use the geeglm function:

library(geepack)

geeSand=(geeglm(PHQ~as.factor(compl_bin) + Neuro+PHQ_base+as.factor(depr0) +
                    EFE+as.factor(Sex.x) + as.factor(error)+stress+hours,
                    family = poisson, data=compl_long,
                    id=subject, corst="exchangeable"))

I get the error:

"Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca,  : 
  nrow(zsca) and length(y) not match"

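If I remove as.factor(error) and hours from the formula, geeglm does not complain and I get output. The function fails only when the error and hours variables are included. I checked the lengths of all the variables and they are equal. Can you help me find the problem?

Thank you very much!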

小智 3

Found this: https://stat.ethz.ch/pipermail/r-help/2008-October/178337.html

I'm pretty sure this is a bug in geese(), which should be reported to the
maintainer of geepack. The problem lies in the handling of missing
values.

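If you look at dim(na.omit(dat[,c("id","score","chem","time")])) you
get 44 rows. In geese.fit(), zsca is set equal to matrix(1,N,1), where N is
set equal to length(id). But id has length 46, whereas the response y has
been trimmed to length 44 by dropping every row of data that is missing a
value for any of the variables involved.
Hence the problem.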
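The mismatch described above can be checked before fitting by comparing the number of rows in the data with the number of rows that are complete on the model variables. The following is a minimal sketch on made-up toy data; only the column names id, score, chem, and time come from the quoted post, and the values are purely illustrative.

```r
# Toy long-format data (hypothetical values) with a few missing entries.
dat <- data.frame(
  id    = rep(1:4, each = 3),
  time  = rep(1:3, times = 4),
  score = c(5, 6, NA, 4, 4, 5, 7, NA, 8, 3, 2, 2),
  chem  = c(1, 1, 1, 0, NA, 0, 1, 1, 1, 0, 0, 0)
)

model_vars <- c("id", "score", "chem", "time")

# Missing values per column: any non-zero count means the model frame
# will be shorter than the raw data.
na_per_column <- colSums(is.na(dat[, model_vars]))

n_total    <- nrow(dat)                               # rows passed as `data`
n_complete <- sum(complete.cases(dat[, model_vars]))  # rows left after NA removal

print(na_per_column)
cat("rows in data:", n_total, " complete rows:", n_complete, "\n")
```

If the two counts differ, the id vector and the response can end up with different lengths, which is the situation the quoted post describes.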

Fixing the problem properly requires the maintainer of geepack to
rewrite some code.
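Until then, a common workaround is to drop the incomplete rows yourself before calling geeglm(), so that id and the response have the same length. The sketch below assumes toy data and a hypothetical helper, prepare_for_gee(), neither of which is from the original post; it has not been checked against any particular geepack version. The helper also sorts by subject, because the geepack documentation states that the rows of each cluster must be contiguous, while reshape(direction = "long") returns rows grouped by time.

```r
# Hypothetical helper: keep only rows that are complete on the model
# variables, then sort so that each subject's rows are contiguous.
prepare_for_gee <- function(data, vars, id, time) {
  keep <- complete.cases(data[, vars, drop = FALSE])
  out  <- data[keep, , drop = FALSE]
  out[order(out[[id]], out[[time]]), , drop = FALSE]
}

# Toy data (made-up values), ordered by time as reshape() would return it.
toy <- data.frame(
  subject = factor(rep(1:6, times = 4)),
  time    = rep(1:4, each = 6),
  PHQ     = c(3, 5, 2, 8, 4, 6,  4, 5, NA, 7, 3, 6,
              2, 6, 3, 9, NA, 5, 3, 4, 2, 8, 4, 7),
  stress  = c(1, 2, 1, 3, 2, 2,  2, 2, 1, NA, 1, 3,
              1, 3, 2, 3, 2, 2,  1, 2, 1, 3, 2, 3),
  hours   = c(60, 70, 55, 80, 65, 72,  62, 68, 50, 78, NA, 70,
              58, 75, 57, 82, 66, 69,  61, 66, 52, 79, 64, 74)
)

clean <- prepare_for_gee(toy,
                         vars = c("subject", "PHQ", "stress", "hours"),
                         id = "subject", time = "time")

# Fit only if geepack is installed; the formula is a reduced illustration,
# not the full model from the question.
if (requireNamespace("geepack", quietly = TRUE)) {
  fit <- geepack::geeglm(PHQ ~ stress + hours, family = poisson,
                         data = clean, id = subject,
                         corstr = "exchangeable")
}
```

For the data in the question, the same idea would mean passing every variable that appears in the formula (including error and hours) as vars, and fitting on the cleaned data frame instead of compl_long.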