Suppose I have a data set (never mind the distributions):
modData <- data.frame("A" = rnorm(20, 15, 3),
"B" = rnorm(20, 20, 3),
"C" = rnorm(20, 25, 3),
"X" = rnorm(20, 5, 1)
)
If I use X as the predictor for each of the responses A, B, and C separately:
md1 <- lm(A ~ X, data = modData)
md2 <- lm(B ~ X, data = modData)
md3 <- lm(C ~ X, data = modData)
and then run a Shapiro-Wilk test and a Box-Cox test on each model, for example:
library(MASS) # provides boxcox()
shapiro.test(residuals(md1))
boxcox(md1, plotit = TRUE)
Is there a convenient way to build and test multiple models without typing each one out by hand?
Here is an alternative approach using the tidyverse:
modData <- data.frame("A" = rnorm(20, 15, 3),
"B" = rnorm(20, 20, 3),
"C" = rnorm(20, 25, 3),
"X" = rnorm(20, 5, 1))
library(tidyverse)
library(broom)
# specify predictor and target variables
x <- "X"
y <- setdiff(names(modData), x)
dt_model_info <- expand.grid(y, x, stringsAsFactors = FALSE) %>% # create combinations
  mutate(model_id = row_number(), # create model id
         frml = paste(Var1, "~", Var2)) %>% # create model formula
  group_by(model_id, Var1, Var2) %>% # group by the above
  nest() %>% # nest data
  mutate(m = map(data, ~ lm(as.formula(.x$frml), data = modData)), # fit models
         m_table = map(m, tidy), # tidy model output
         st = map(m, ~ shapiro.test(residuals(.x)))) # Shapiro-Wilk test
# access model info
dt_model_info
dt_model_info$m
dt_model_info$m_table
dt_model_info$st
# another way to access info
dt_model_info %>% unnest(m_table)
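The pipeline above covers the Shapiro-Wilk test but not the Box-Cox step from the question. As a rough sketch only, the same loop could be written in base R with `lapply()`, adding `MASS::boxcox()` to each fit. The helper name `check_model`, the seed, and the choice to report the grid value of lambda with the highest log-likelihood are assumptions made for illustration, not part of the original answer. Note that Box-Cox requires strictly positive responses, which holds for this simulated data.

```r
library(MASS)  # provides boxcox()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
modData <- data.frame(A = rnorm(20, 15, 3),
                      B = rnorm(20, 20, 3),
                      C = rnorm(20, 25, 3),
                      X = rnorm(20, 5, 1))

predictor <- "X"
responses <- setdiff(names(modData), predictor)

# Hypothetical helper: fit one model and run both diagnostics on it
check_model <- function(response) {
  f <- reformulate(predictor, response = response)  # e.g. A ~ X
  # y = TRUE stores the response in the fit, so boxcox() does not have to
  # refit the model (refitting can fail when the formula is a local variable)
  fit <- lm(f, data = modData, y = TRUE)
  bc <- boxcox(fit, plotit = FALSE)  # profile log-likelihood over a lambda grid
  list(model   = fit,
       shapiro = shapiro.test(residuals(fit)),
       lambda  = bc$x[which.max(bc$y)])  # grid value with the highest log-likelihood
}

# One list entry per response, named after the response variable
results <- lapply(setNames(responses, responses), check_model)

results$A$shapiro                       # Shapiro-Wilk result for A ~ X
sapply(results, function(r) r$lambda)   # one lambda per model
```

Setting `plotit = TRUE` instead would draw the usual Box-Cox profile plot for each model, at the cost of one plot per response.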