jna*_*m27 7 r survival-analysis
我有一个survfit对象.对我t=0:50多年感兴趣的总结幸存者很容易.
summary(survfit, t=0:50)
Run Code Online (Sandbox Code Playgroud)
它给出了每个t的生存.
有没有办法解决每个t的危险(在这种情况下,每个t = 0:50从t-1到t的危险)?我想获得与Kaplan Meier曲线相关的危险的平均值和置信区间(或标准误差).
当分布适合时(例如type="hazard"in flexsurvreg),这似乎很容易做到,但我无法弄清楚如何为常规的幸存对象做这个.建议?
这有点棘手,因为危险是对瞬时概率的估计(这是离散数据),但该basehaz函数可能有一些帮助,但它只返回累积危险.所以你仍然需要执行额外的步骤.
我也幸运的是这个muhaz功能.从其文件:
library(muhaz)
?muhaz
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1)
Run Code Online (Sandbox Code Playgroud)

我不确定获得95%置信区间的最佳方法,但自举可能是一种方法.
#Function to bootstrap hazard estimates
haz.bootstrap <- function(data,trial,min.time,max.time){
library(data.table)
data <- as.data.table(data)
data <- data[sample(1:nrow(data),nrow(data),replace=T)]
fit1 <- muhaz(data$futime, data$fustat,min.time=min.time,max.time=max.time)
result <- data.table(est.grid=fit1$est.grid,trial,haz.est=fit1$haz.est)
return(result)
}
#Re-run function to get 1000 estimates
haz.list <- lapply(1:1000,function(x) haz.bootstrap(data=ovarian,trial=x,min.time=0,max.time=744))
haz.table <- rbindlist(haz.list,fill=T)
#Calculate Mean,SD,upper and lower 95% confidence bands
plot.table <- haz.table[, .(Mean=mean(haz.est),SD=sd(haz.est)), by=est.grid]
plot.table[, u95 := Mean+1.96*SD]
plot.table[, l95 := Mean-1.96*SD]
#Plot graph
library(ggplot2)
p <- ggplot(data=plot.table)+geom_smooth(aes(x=est.grid,y=Mean))
p <- p+geom_smooth(aes(x=est.grid,y=u95),linetype="dashed")
p <- p+geom_smooth(aes(x=est.grid,y=l95),linetype="dashed")
p
Run Code Online (Sandbox Code Playgroud)

小智 5
作为迈克答案的补充,可以通过泊松分布而不是正态分布来模拟事件的数量.然后可以通过伽马分布计算危险率.代码将变为:
library(muhaz)
library(data.table)
library(rGammaGamma)
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1)
#Function to bootstrap hazard estimates
haz.bootstrap <- function(data,trial,min.time,max.time){
library(data.table)
data <- as.data.table(data)
data <- data[sample(1:nrow(data),nrow(data),replace=T)]
fit1 <- muhaz(data$futime, data$fustat,min.time=min.time,max.time=max.time)
result <- data.table(est.grid=fit1$est.grid,trial,haz.est=fit1$haz.est)
return(result)
}
#Re-run function to get 1000 estimates
haz.list <- lapply(1:1000,function(x) haz.bootstrap(data=ovarian,trial=x,min.time=0,max.time=744))
haz.table <- rbindlist(haz.list,fill=T)
#Calculate Mean, gamma parameters, upper and lower 95% confidence bands
plot.table <- haz.table[, .(Mean=mean(haz.est),
Shape = gammaMME(haz.est)["shape"],
Scale = gammaMME(haz.est)["scale"]), by=est.grid]
plot.table[, u95 := qgamma(0.95,shape = Shape + 1, scale = Scale)]
# The + 1 is due to the discrete character of the poisson distribution.
plot.table[, l95 := qgamma(0.05,shape = Shape, scale = Scale)]
#Plot graph
ggplot(data=plot.table) +
geom_line(aes(x=est.grid, y=Mean),col="blue") +
geom_ribbon(aes(x=est.grid, y=Mean, ymin=l95, ymax=u95),alpha=0.5, fill= "lightblue")
Run Code Online (Sandbox Code Playgroud)

可以看出,危险率下限的负面估计现在已经消失.