Dmy*_*iuk 2 grouping r time-series forecasting dplyr
My dataset looks like this:
Category Weekly_Date a b
<chr> <date> <dbl> <dbl>
1 aa 2018-07-01 36.6 1.4
2 aa 2018-07-02 5.30 0
3 bb 2018-07-01 4.62 1.2
4 bb 2018-07-02 3.71 1.5
5 cc 2018-07-01 3.41 12
... ... ... ... ...
I fit a separate linear regression for each group:
fit_linreg <- train %>%
  group_by(Category) %>%
  do(model = lm(Target ~ Unit_price + Unit_discount, data = .))
Now I have a different model for each category:
aa model1
bb model2
cc model3
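So I need to apply each model to its corresponding category. How can I do that? (Preferably with dplyr.)
If you nest the test data by group and join it to the table of models, you can use map2 to predict on each group's data with the model trained for that group. See the mtcars example below, which, for brevity, predicts on the same data the models were fitted on.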
library(tidyverse)
x <- mtcars %>%
  group_by(gear) %>%
  do(model = lm(mpg ~ hp + wt, data = .))
x
Source: local data frame [3 x 2]
Groups: <by row>
# A tibble: 3 x 2
gear model
* <dbl> <list>
1 3 <S3: lm>
2 4 <S3: lm>
3 5 <S3: lm>
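mtcars %>%
  group_by(gear) %>%
  nest() %>%                                    # one row per gear, rows stored in `data`
  inner_join(x) %>%                             # attach the model fitted for each gear
  mutate(preds = map2(model, data, predict)) %>% # predict(model, newdata = data)
  unnest(preds)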
Joining, by = "gear"
# A tibble: 32 x 2
gear preds
<dbl> <dbl>
1 4 22.0
2 4 21.2
3 4 25.1
4 4 26.0
5 4 22.2
6 4 17.8
7 4 17.8
8 4 28.7
9 4 32.3
10 4 30.0
# ... with 22 more rows
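The example above predicts on the training data itself. As a rough sketch of the same nest/join/map2 pattern with a separate test set, the snippet below holds out every fourth row of mtcars. The split, and the names `test_idx`, `train`, `test`, `models`, and `preds`, are made up for illustration and are not from the question. It assumes tidyr 1.0 or later for `unnest(c(...))`.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical split: hold out every 4th row of mtcars as test data
test_idx <- seq(4, nrow(mtcars), by = 4)
train <- mtcars[-test_idx, ]
test  <- mtcars[test_idx, ]

# Fit one lm per gear on the training rows and keep it in a list-column
models <- train %>%
  group_by(gear) %>%
  nest() %>%
  mutate(model = map(data, function(d) lm(mpg ~ hp + wt, data = d))) %>%
  select(gear, model)

# Nest the test rows per gear, attach the matching model, then predict
preds <- test %>%
  group_by(gear) %>%
  nest() %>%
  inner_join(models, by = "gear") %>%
  mutate(pred = map2(model, data,
                     function(m, d) unname(predict(m, newdata = d)))) %>%
  select(gear, data, pred) %>%
  unnest(c(data, pred)) %>%   # one row per test observation, plus `pred`
  ungroup()
```

Because `inner_join` is used, any group that appears in the test data but has no fitted model is silently dropped, so it may be worth comparing `nrow(preds)` with `nrow(test)`. With the question's data, `gear` would presumably be replaced by `Category` and the formula by the one used in `fit_linreg`.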