Why does this code produce incorrect P-values?

Dio*_*ion 1 r function cox-regression survival-analysis p-value

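I am trying to compute the P-values associated with point estimates obtained from a Cox PH model with time-varying coefficients. The function I wrote does not return the correct P-value. I will illustrate this using the NCCTG lung cancer data from the survival package.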

# Setup
require(survival)

# Effect of Karnofsky score, linear
fit <- coxph(Surv(time/365.25, status == 2) ~ ph.karno + tt(ph.karno), 
             lung, tt=function(x, t, ...) {x*t})

The function:

# Same function but now with a P-value in the output
calculate.timeDependentHazard.P <- function(model,time) {
  index.1 <- which(names(model$coef)=="ph.karno")
  index.2 <- which(names(model$coef)=="tt(ph.karno)")

  coef <- model$coef[c(index.1,index.2)]
  var <- rbind(c(model$var[index.1,index.1],model$var[index.1,index.2]),
               c(model$var[index.2,index.1],model$var[index.2,index.2]))

  var.at.time <- t(c(1,time)) %*% var %*% c(1,time)

  hazard.at.time <- t(c(1,time)) %*% coef

  lower.95 <- hazard.at.time - 1.96*sqrt(var.at.time)
  upper.95 <- hazard.at.time + 1.96*sqrt(var.at.time)

  z.at.time <- hazard.at.time/(sqrt(var.at.time))

  p.value <- pnorm(-abs(z.at.time))

  results <- c(exp(c(hazard.at.time,lower.95,upper.95)),p.value)
  names(results) <- c("hazard ratio","95% lower","95% upper","P.value")

  options(scipen = 999)

  results

}

# Point estimates after 1.05*365.25 = 383.5 days of follow-up
calculate.timeDependentHazard.P(fit,1.05)

Output:

> calculate.timeDependentHazard.P(fit,1.05)
hazard ratio    95% lower    95% upper      P.value 
  0.98913256   0.97654719   1.00188013   0.04721342

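Clearly, the P-value should be > .05, since the 95% interval for the hazard ratio includes 1, but it is not. The P-value computed this way appears to be too low. Can anyone spot the flaw?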

Ben*_*sen 5

It looks like you want a two-sided alternative, so multiply pnorm(-abs(z.at.time)) by 2. That is, use 2*pnorm(-abs(z.at.time)).
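
As a quick check that needs no model fitting, here is a minimal sketch that backs out the z statistic from the numbers printed in the question and compares the one-sided and two-sided P-values. It assumes the interval was built as estimate ± 1.96 × SE on the log scale, as in the question's function. The names log.hr, se, p.one.sided and p.two.sided are made up for this illustration.

```r
# Values copied from the output printed in the question (time = 1.05 years).
hr    <- 0.98913256
lower <- 0.97654719
upper <- 1.00188013

# Back out the log hazard ratio and its standard error from the 95% interval,
# assuming it was computed as estimate +/- 1.96 * SE on the log scale.
log.hr <- log(hr)
se     <- (log(upper) - log(lower)) / (2 * 1.96)
z      <- log.hr / se

p.one.sided <- pnorm(-abs(z))      # what the function in the question returns
p.two.sided <- 2 * pnorm(-abs(z))  # consistent with the two-sided 95% interval

print(c(one.sided = p.one.sided, two.sided = p.two.sided))
```

The ± 1.96 interval is two-sided, so the test has to be two-sided as well for the two to agree. Doubling the reported 0.04721342 gives 0.09442684, which is above .05 and consistent with the interval containing 1. In the function itself, only the p.value line needs to change.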