mjk*_*tfw 5 r ggplot2 survival-analysis tidyverse ggproto
我想ggplot2通过创建新的stats函数和ggproto对象来实现Cox比例风险模型的诊断。这个想法是从分组(按color,facet_grid等)中受益,以便对所需统计信息(例如mar残差)进行条件计算。在下面的示例中,将对空模型进行重新拟合,并为每个辅助因子水平计算mar。问题是:
data的内部设置的参数ggproto的compute_group方法,而不是实际的数据集。这意味着,数据帧被去除未明确定义的列,aes因此,用户应每次至少提供“事件发生时间”和“事件指示符”列。(除非它们在stat_...功能包装器中进行了硬编码)。可以compute_group以某种方式访问原始数据集中的对应行吗?compute_group的data是独一无二的每个PANEL,这意味着它们不能反映原始数据集的supplited实际rownames ggplot,我们再也不能排除如用完整的型号省略不完整的情况下,除非明确指定/硬编码一些id变量。问题与上面相同。geom图层能否访问y由另一个自定义统计信息(也已绘制)计算的?考虑使用诸如的更平滑的残差散点图。ggplot(data, aes(x = covariate, time = time, event = event)) + stat_funcform() + geom_smooth(aes(y = ..martingales..)),..martingales..实际由来计算stat_funcform。。
# Load & Set data ---------------------------------------------------------
require(data.table)
require(dplyr)
require(ggplot2)
require(survival)
set.seed(1)
dt <- data.table(factor = sample(c(0, 2), 1e3, T),
factor2 = sample(c(0, 2, NA), 1e3, T),
covariate = runif(1e3, 0, 1))
dt[, `:=`(etime = rexp(1e3, .01 * (.5 + factor + covariate)),
ctime = runif(1e3, 0, 1e2))]
dt[, `:=`(event = ifelse(etime < ctime, 1, 0),
time = ifelse(etime < ctime, etime, ctime))]
survfit(Surv(time, event) ~ factor + I(covariate > .5), data = dt) %>%
autoplot(fun = 'cloglog')
# Fit coxph model ---------------------------------------------------------
fit <- coxph(Surv(time, event) ~ factor + covariate, data = dt)
# Define custom stat ------------------------------------------------------
StatFuncform <- ggproto(
"StatFuncform", Stat,
compute_group = function(data, scales, params) {
head(data) %>% print
data$y <- update(params$fit, ~ 1, data = data) %>%
resid(type = 'martingale')
return(data)
}
)
stat_funcform <- function(mapping = NULL, data = NULL, geom = "point",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...) {
layer(
stat = StatFuncform, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
# Plot --------------------------------------------------------------------
# computation unsuccessfull - no 'time' and 'event' variables
ggplot(data = dt,
mapping = aes(x = covariate,
color = factor(event))) +
stat_funcform(params = list(fit = fit)) +
facet_grid(~ factor, labeller = label_both, margins = TRUE)
# ok - 'time' and 'event' specified explicitly
ggplot(data = dt,
mapping = aes(x = covariate, time = time, event = event,
factor2 = factor2,
color = factor(event))) +
stat_funcform(params = list(fit = fit)) +
facet_grid(~ factor, labeller = label_both, margins = TRUE)
Run Code Online (Sandbox Code Playgroud)
我知道已经有实现生存分析(例如诊断了几包survminer)或ggplot2的 autoplot方法,但他们宁愿提供包装,而不是采取默认的优势ggplot2() + stat_语义。
| 归档时间: |
|
| 查看次数: |
406 次 |
| 最近记录: |