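I'd like to know whether lm() can be used inside mutate() from the dplyr package. My data frame, with columns "date", "company", "return" and "market.ret", can be reproduced as follows: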
library(dplyr)
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by=1, len=n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
x <- group_by(x, date)
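# Use the bare column name so the mean is taken within each date group;
# x$return would pull the whole column and ignore the grouping.
x <- mutate(x, market.ret = mean(return, na.rm = TRUE))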
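Now, for each company, I want to regress "return" on "market.ret", compute the linear regression coefficients, and store the slope in a new column. I was hoping to do this with mutate(), but the code below doesn't work: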
x <- group_by(x, company)
x <- mutate(x, beta = coef(lm(x$return~x$market.ret))[[2]])
The error reported by R is:
Error in terms.formula(formula, data = data) :
invalid term in model formula
Thanks in advance for any suggestions!
This seems to work for me:
group_by(x, company) %>%
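  # Fit one regression per company and keep only the slope
  do(data.frame(beta = coef(lm(return ~ market.ret, data = .))[2])) %>%
  # Attach each company's slope back onto the original rows
  left_join(x, ., by = "company")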
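For what it's worth, mutate() itself may also do the job if the formula refers to the columns by their bare names instead of going through x$, since bare names are looked up within the current group. Below is a minimal sketch, assuming a reasonably recent dplyr. The fixed seed, the company symbols "AAAAA"/"BBBBB" and the name `result` are made up for illustration and are not part of the question.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is repeatable
n.dates <- 60
n.stocks <- 2
date <- seq(as.Date("2011-07-01"), by = 1, length.out = n.dates)
symbol <- c("AAAAA", "BBBBB")  # hypothetical symbols instead of random ones

x <- expand.grid(date = date, company = symbol, stringsAsFactors = FALSE)
x$return <- rnorm(n.dates * n.stocks, mean = 0, sd = 0.05)

# Market return: mean return across companies on each date
x <- x %>%
  group_by(date) %>%
  mutate(market.ret = mean(return, na.rm = TRUE)) %>%
  ungroup()

# Per-company slope: bare column names are resolved within each group,
# and the single slope value is recycled over that company's rows
result <- x %>%
  group_by(company) %>%
  mutate(beta = coef(lm(return ~ market.ret))[[2]]) %>%
  ungroup()
```

Each company's rows then carry that company's slope in `beta`, which should match what the do() approach above produces.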