如何从R中的survfit中提取危害?

jna*_*m27 7 r survival-analysis

我有一个survfit对象.对我t=0:50多年感兴趣的总结幸存者很容易.

summary(survfit, t=0:50)
Run Code Online (Sandbox Code Playgroud)

它给出了每个t的生存.

有没有办法解决每个t的危险(在这种情况下,每个t = 0:50从t-1到t的危险)?我想获得与Kaplan Meier曲线相关的危险的平均值和置信区间(或标准误差).

当分布适合时(例如type="hazard"in flexsurvreg),这似乎很容易做到,但我无法弄清楚如何为常规的幸存对象做这个.建议?

Mik*_*han 6

这有点棘手,因为危险是对瞬时概率的估计(这是离散数据),但该basehaz函数可能有一些帮助,但它只返回累积危险.所以你仍然需要执行额外的步骤.

我也幸运的是这个muhaz功能.从其文件:

library(muhaz)
?muhaz
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1)
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

我不确定获得95%置信区间的最佳方法,但自举可能是一种方法.

#Function to bootstrap hazard estimates
haz.bootstrap <- function(data,trial,min.time,max.time){
  library(data.table)
  data <- as.data.table(data)
  data <- data[sample(1:nrow(data),nrow(data),replace=T)]
  fit1 <- muhaz(data$futime, data$fustat,min.time=min.time,max.time=max.time)
  result <- data.table(est.grid=fit1$est.grid,trial,haz.est=fit1$haz.est)
  return(result)
}

#Re-run function to get 1000 estimates
haz.list <- lapply(1:1000,function(x) haz.bootstrap(data=ovarian,trial=x,min.time=0,max.time=744))
haz.table <- rbindlist(haz.list,fill=T)

#Calculate Mean,SD,upper and lower 95% confidence bands
plot.table <- haz.table[, .(Mean=mean(haz.est),SD=sd(haz.est)), by=est.grid]
plot.table[, u95 := Mean+1.96*SD]
plot.table[, l95 := Mean-1.96*SD]

#Plot graph
library(ggplot2)
p <- ggplot(data=plot.table)+geom_smooth(aes(x=est.grid,y=Mean))
p <- p+geom_smooth(aes(x=est.grid,y=u95),linetype="dashed")
p <- p+geom_smooth(aes(x=est.grid,y=l95),linetype="dashed")
p
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述


小智 5

作为迈克答案的补充,可以通过泊松分布而不是正态分布来模拟事件的数量.然后可以通过伽马分布计算危险率.代码将变为:

library(muhaz)
library(data.table)
library(rGammaGamma)
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1)

#Function to bootstrap hazard estimates
haz.bootstrap <- function(data,trial,min.time,max.time){
  library(data.table)
  data <- as.data.table(data)
  data <- data[sample(1:nrow(data),nrow(data),replace=T)]
  fit1 <- muhaz(data$futime, data$fustat,min.time=min.time,max.time=max.time)
  result <- data.table(est.grid=fit1$est.grid,trial,haz.est=fit1$haz.est)
  return(result)
}

#Re-run function to get 1000 estimates
haz.list <- lapply(1:1000,function(x) haz.bootstrap(data=ovarian,trial=x,min.time=0,max.time=744))
haz.table <- rbindlist(haz.list,fill=T)

#Calculate Mean, gamma parameters, upper and lower 95% confidence bands
plot.table <- haz.table[, .(Mean=mean(haz.est),
                            Shape = gammaMME(haz.est)["shape"],
                            Scale = gammaMME(haz.est)["scale"]), by=est.grid]
plot.table[, u95 := qgamma(0.95,shape = Shape + 1, scale = Scale)]
# The + 1 is due to the discrete character of the poisson distribution.
plot.table[, l95 := qgamma(0.05,shape = Shape, scale = Scale)]

#Plot graph
ggplot(data=plot.table) + 
  geom_line(aes(x=est.grid, y=Mean),col="blue") + 
  geom_ribbon(aes(x=est.grid, y=Mean, ymin=l95, ymax=u95),alpha=0.5, fill= "lightblue")
Run Code Online (Sandbox Code Playgroud)

危险率的情节,95%置信区间

可以看出,危险率下限的负面估计现在已经消失.