How to add a percentage to "Unknown" in gtsummary

Ell*_*inn 6 r gtsummary

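I have a continuous variable with a large proportion of unknown values. My advisor has asked me to put the percentage next to the count in that column. This reprex mimics what I am trying to do.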

library(tidyverse)
library(gtsummary)

trial %>% # included with gtsummary package
  select(trt, age, grade) %>%
  tbl_summary()

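I am trying to get the percentage of unknowns listed next to the unknown count, preferably in parentheses, so that it looks like 11 (5.5%).

Some people have asked how the missing data appear in my dataset, so here is a reprex: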

library(gtsummary)
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.0.3
#> Warning: package 'readr' was built under R version 4.0.3

df <-
  tibble::tribble(
    ~age, ~sex,     ~race,              ~weight,
    70,   "male",   "white",            50,
    57,   "female", "african-american", 87,
    64,   "male",   "white",            NA,
    46,   "male",   "white",            49,
    87,   "male",   "hispanic",         51
  )

df %>%
  select(age, sex, race, weight) %>%
  tbl_summary(type = list(age ~ "continuous", weight ~ "continuous"), missing = "ifany")

Dan*_*erg 9

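There are several ways to report the missing rate. I will illustrate a few below, and you can pick the solution that works best for you.

  1. Categorical variables: I recommend making the missing values an explicit factor level before passing the data frame to tbl_summary(). The NA values will then no longer be missing and will be counted like any other level of the variable.
  2. Continuous variables: use the statistic= argument to report the missing rate.
  3. All variables: use add_n() to report the missing rate.
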
library(gtsummary)

trial %>%
  select(age, response, trt) %>%
  # make the NA value an explicit factor level with `forcats::fct_explicit_na()`
  # (deprecated since forcats 1.0.0 in favor of `fct_na_value_to_level()`)
  dplyr::mutate(response = factor(response) %>% forcats::fct_explicit_na()) %>%
  tbl_summary(
    by = trt,
    # "continuous2" reports each statistic on its own row
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{N_nonmiss}/{N_obs} {p_nonmiss}%",
                                     "{median} ({p25}, {p75})")
  ) %>%
  # add a column of counts for every variable (the `add_n()` approach)
  add_n(statistic = "{n} / {N}")

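As a sanity check on what the placeholders in `statistic=` report, the same quantities can be computed by hand in base R. This is only a minimal sketch: `missing_summary()` is a hypothetical helper (not part of gtsummary), the input vector is made-up data chosen to reproduce the 11 (5.5%) figure from the question, and the rounding here is my own choice rather than gtsummary's formatting.

```r
# Hypothetical helper (not part of gtsummary): hand-computes the quantities
# behind {N_obs}, {N_miss}, {N_nonmiss}, {p_miss} and {p_nonmiss}.
missing_summary <- function(x, digits = 1) {
  n_obs <- length(x)           # total number of observations
  n_miss <- sum(is.na(x))      # number of missing values
  n_nonmiss <- n_obs - n_miss  # number of observed values
  list(
    N_obs = n_obs,
    N_miss = n_miss,
    N_nonmiss = n_nonmiss,
    p_miss = round(100 * n_miss / n_obs, digits),
    p_nonmiss = round(100 * n_nonmiss / n_obs, digits)
  )
}

# Made-up data: 200 values, 11 of them missing
x <- c(rep(50, 189), rep(NA, 11))
s <- missing_summary(x)

# Format as "count (percent%)", the layout asked for in the question
out <- sprintf("%d (%s%%)", s$N_miss, format(s$p_miss, nsmall = 1))
print(out)
```

Comparing a hand-computed value like this against the table is a quick way to confirm that the percentage uses the total number of observations as its denominator.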

EDIT: Adding more examples after the original poster's comments.

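library(gtsummary)

trial %>%
  select(age, response, trt) %>%
  # make the NA value an explicit factor level named "Unknown"
  # (`fct_explicit_na()` is deprecated since forcats 1.0.0 in favor of `fct_na_value_to_level()`)
  dplyr::mutate(response = factor(response) %>% forcats::fct_explicit_na(na_level = "Unknown")) %>%
  tbl_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    # suppress the default Unknown row; the missing rate is reported as a statistic instead
    missing = "no",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})",
                                     "{N_miss} ({p_miss}%)")
  ) %>%
  # update the statistic's row label to "Unknown" in `.$table_body`;
  # compared case-insensitively because the capitalization of this default
  # label may differ across gtsummary versions
  modify_table_body(
    dplyr::mutate,
    label = ifelse(tolower(label) == "n missing (% missing)",
                   "Unknown",
                   label)
  )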

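For completeness, here is how the same pattern might look on the data frame from the question. Treat it as a sketch under assumptions rather than a tested recipe: it assumes the default row label for "{N_miss} ({p_miss}%)" reads "N missing (% missing)" up to capitalization, and the object name `tbl` is just an illustrative choice.

```r
library(gtsummary)

# The example data frame from the question
df <- tibble::tribble(
  ~age, ~sex,     ~race,              ~weight,
  70,   "male",   "white",            50,
  57,   "female", "african-american", 87,
  64,   "male",   "white",            NA,
  46,   "male",   "white",            49,
  87,   "male",   "hispanic",         51
)

tbl <- df %>%
  tbl_summary(
    # force age and weight to be summarized as multi-row continuous variables
    type = list(age ~ "continuous2", weight ~ "continuous2"),
    # drop the default Unknown row; the missing rate comes from `statistic=`
    missing = "no",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})",
                                     "{N_miss} ({p_miss}%)")
  ) %>%
  # relabel the missing-rate row as "Unknown" (case-insensitive match;
  # the exact default label text is an assumption)
  modify_table_body(
    dplyr::mutate,
    label = ifelse(tolower(label) == "n missing (% missing)", "Unknown", label)
  )

tbl
```

Because the missing rate is requested as a statistic, an "Unknown" row is added for every continuous variable, including age, which has no missing values in this data.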