d3h*_*o23 1 r weighted-average dataframe dplyr rolling-computation
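I have a data frame, games_h. This is only a snippet of the table; the full table has many teams and is sorted by date, team, and game number. I am trying to create a weighted rolling average grouped by team, where the most recent game carries twice the weight of the game before it. The weighting would be (Game_1 * 1 + Game_2 * 2) / 3, or equivalently weights in the same ratio that sum to 1, i.e. weights = c(1 - .667, .667).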
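To make the two weighting forms concrete, here is a minimal base R sketch using two consecutive Boston College Eagles scores from the sample data (56, then 87). The variable names are only illustrative; c(1 - .667, .667) in the question is a rounded version of c(1/3, 2/3).

```r
# Two consecutive scores for one team (from the Score column of the sample data)
older  <- 56   # Game_1, the earlier game
recent <- 87   # Game_2, the most recent game

# Form 1: raw weights 1 and 2, divided by their sum
raw <- (older * 1 + recent * 2) / 3

# Form 2: the same ratio expressed as weights that sum to 1
w <- c(1, 2) / sum(c(1, 2))
normalised <- sum(c(older, recent) * w)

# weighted.mean() normalises the weights itself
builtin <- weighted.mean(c(older, recent), w = c(1, 2))

print(c(raw = raw, normalised = normalised, builtin = builtin))
```

All three values agree, so either form of the weights can be used in the rolling calculation.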
dput(games_h)
structure(list(GameId = c(16, 16, 37, 37, 57, 57), GameDate = structure(c(17905,
17905, 17916, 17916, 17926, 17926), class = "Date"), NeutralSite = c(0,
0, 0, 0, 0, 0), AwayTeam = c("Virginia Cavaliers", "Virginia Cavaliers",
"Florida State Seminoles", "Florida State Seminoles", "Syracuse Orange",
"Syracuse Orange"), HomeTeam = c("Boston College Eagles", "Boston College Eagles",
"Boston College Eagles", "Boston College Eagles", "Boston College Eagles",
"Boston College Eagles"), Team = c("Virginia Cavaliers", "Boston College Eagles",
"Florida State Seminoles", "Boston College Eagles", "Syracuse Orange",
"Boston College Eagles"), Home = c(0, 1, 0, 1, 0, 1), Score = c(83,
56, 82, 87, 77, 71), AST = c(17, 6, 12, 16, 11, 13), TOV = c(10,
8, 9, 13, 11, 11), STL = c(5, 4, 4, 6, 6, 5), BLK = c(6, 0, 4,
4, 1, 0), Rebounds = c(38, 18, 36, 33, 23, 23), ORB = c(7, 4,
16, 10, 7, 6), DRB = c(31, 14, 20, 23, 16, 17), FGA = c(55, 57,
67, 55, 52, 45), FGM = c(33, 22, 28, 27, 29, 21), X3FGM = c(8,
7, 8, 13, 11, 9), X3FGA = c(19, 25, 25, 21, 26, 22), FTA = c(14,
9, 24, 28, 15, 23), FTM = c(9, 5, 18, 20, 8, 20), Fouls = c(16,
12, 25, 20, 19, 19), Game_Number = 1:6, Count = c(1, 1, 1, 1,
1, 1)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), groups = structure(list(HomeTeam = "Boston College Eagles",
.rows = structure(list(1:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L), .drop = TRUE))
Here is an example of the desired output for the Score column.
Here is my failed attempt. The function itself works, but I cannot apply it by group across the whole data frame.
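library(zoo)  # provides rollsum()

# wt1 is the weight of the current value, wt2 the weight of the previous one
weighted_avg <- function(x, wt1, wt2) {
  rs1 = rollsum(x, 1, align = "right")   # the values themselves
  rs2 = rollsum(x, 2, align = "right")   # current + previous (length n - 1)
  rs1 = rs1[-1]                          # drop the first value to match rs2
  rs3 = rs2 - rs1                        # previous values
  weighted_avg = ((rs3 * wt2) + (rs1 * wt1)) / (wt1 + wt2)
  return(weighted_avg)
}

weighted_avg(csum$Score_Y, 2, 1)

# Attempts to apply it to every column and by group (none of these work)
apply(csum$Score_Y, 2, weighted_avg, wt1 = 2, wt2 = 1)

test <- csum %>%
  group_by(Team) %>%
  group_map(across(c(Score:Fouls), weighted_avg(.x$Team, 2, 1)))

test <- csum %>%
  group_by(Team) %>%
  group_walk(across(c(Score:Fouls), weighted_avg(.~,2,1)))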
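One obstacle to using weighted_avg inside a grouped mutate() is that it returns one value fewer than it receives, because a rolling sum over a window of 2 has no result for the first position, whereas mutate() expects a result as long as the group. The sketch below is one possible workaround, not the asker's code: weighted_avg_padded, the toy data frame, and the _wavg suffix are all hypothetical names, and dplyr::lag() is swapped in for rollsum() so that each team's first game becomes NA.

```r
library(dplyr)

# Hypothetical length-preserving variant: NA for the first game of each team
weighted_avg_padded <- function(x, w_recent = 2, w_prev = 1) {
  prev <- dplyr::lag(x)  # previous game's value, NA on the first row
  (x * w_recent + prev * w_prev) / (w_recent + w_prev)
}

# Small made-up table with two teams, already sorted by game order
toy <- tibble(
  Team  = c("A", "A", "A", "B", "B"),
  Score = c(83, 56, 82, 87, 71),
  AST   = c(17, 6, 12, 16, 13)
)

result <- toy %>%
  group_by(Team) %>%
  mutate(across(c(Score, AST), ~ weighted_avg_padded(.x), .names = "{.col}_wavg")) %>%
  ungroup()

print(result)
```

Because the function returns a vector of the same length as its input, it can be passed to across() for any range of columns, such as Score:Fouls in the full data.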
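Here are some notes on the code:
The code uses the slider::slide_dbl function. First, we specify the vector whose moving average we want to compute, Score. The .before argument of slide_dbl is set so that the previous value and the current value are both used to compute the moving average. The .complete argument is set to TRUE to ensure that the moving average is computed only when a previous value is available. In other words, there is no moving average for the first row. For more information, see the documentation of the slider package.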
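To see what .before and .complete do in isolation, here is a minimal sketch on a plain vector. It assumes the slider package is installed; the vector holds the first four Score values of the sample data in row order, and the variable names are only illustrative.

```r
library(slider)

scores <- c(83, 56, 82, 87)

# .before = 1: each window holds the previous value and the current value.
# By default the first window is incomplete and contains only the first value.
partial <- slide_dbl(scores, mean, .before = 1)

# .complete = TRUE: incomplete windows are not evaluated and yield NA
complete <- slide_dbl(scores, mean, .before = 1, .complete = TRUE)

# The weighted version: .x[1] is the older value, .x[2] the more recent one
weighted <- slide_dbl(scores, ~ (.x[1] * 1 + .x[2] * 2) / 3,
                      .before = 1, .complete = TRUE)

print(partial)
print(complete)
print(weighted)
```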
library(tidyverse)
library(slider)

df %>%
  group_by(HomeTeam) %>%
  summarise(Example = c(NA, slide_dbl(Score, .before = 1, .complete = TRUE,
                                      .f = ~ (.x[1] * 1 + .x[2] * 2) / 3)))

`summarise()` has grouped output by 'HomeTeam'. You can override using the `.groups` argument.
# A tibble: 7 × 2
# Groups:   HomeTeam [1]
  HomeTeam              Example
  <chr>                   <dbl>
1 Boston College Eagles    NA
2 Boston College Eagles    NA
3 Boston College Eagles    65
4 Boston College Eagles    73.3
5 Boston College Eagles    85.3
6 Boston College Eagles    80.3
7 Boston College Eagles    73

If you want to compute the moving average for all numeric columns, you could try:
df %>%
  group_by(HomeTeam) %>%
  summarise(across(where(is.numeric), ~ c(NA, slide_dbl(., .before = 1, .complete = TRUE,
                                                        .f = ~ (.x[1] * 1 + .x[2] * 2) / 3)))) %>%
  ungroup()

`summarise()` has grouped output by 'HomeTeam'. You can override using the `.groups` argument.
# A tibble: 7 × 21
  HomeTeam  GameId NeutralSite  Home Score   AST   TOV   STL   BLK Rebounds   ORB   DRB   FGA   FGM
  <chr>      <dbl>       <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
1 Boston C…     NA          NA    NA    NA    NA    NA    NA    NA       NA    NA    NA    NA    NA
2 Boston C…     NA          NA    NA    NA    NA    NA    NA    NA       NA    NA    NA    NA    NA
3 Boston C…     16           0 0.667    65  9.67  8.67  4.33     2     24.7     5  19.7  56.3  25.7
4 Boston C…     30           0 0.333  73.3    10  8.67     4  2.67       30    12    18  63.7    26
5 Boston C…     37           0 0.667  85.3  14.7  11.7  5.33     4       34    12    22    59  27.3
6 Boston C…   50.3           0 0.333  80.3  12.7  11.7     6     2     26.3     8  18.3    53  28.3
7 Boston C…     57           0 0.667    73  12.3    11  5.33 0.333       23  6.33  16.7  47.3  23.7
# … with 7 more variables: X3FGM <dbl>, X3FGA <dbl>, FTA <dbl>, FTM <dbl>, Fouls <dbl>,
#   Game_Number <dbl>, Count <dbl>
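Note that c(NA, slide_dbl(...)) prepends an extra NA to a result that already starts with NA, so a group of n rows yields n + 1 rows (7 rows from 6 above), and each average is shifted down by one row. In dplyr 1.1.0 and later, returning more than one row per group from summarise() is deprecated in favour of reframe(). The following is a sketch of the same calculation with reframe(), assuming dplyr >= 1.1.0 and slider are installed; lagged_wavg and toy are hypothetical names, and the toy table holds only the first three rows of Score and AST.

```r
library(dplyr)
library(slider)

# Hypothetical helper: the weighted average of the two previous games,
# shifted down one row by the leading NA (returns length(x) + 1 values)
lagged_wavg <- function(x) {
  c(NA, slide_dbl(x, ~ (.x[1] * 1 + .x[2] * 2) / 3,
                  .before = 1, .complete = TRUE))
}

toy <- tibble(
  HomeTeam = rep("Boston College Eagles", 3),
  Score    = c(83, 56, 82),
  AST      = c(17, 6, 12)
)

# reframe() allows any number of rows per group and returns an ungrouped tibble
shifted <- toy %>%
  group_by(HomeTeam) %>%
  reframe(across(where(is.numeric), lagged_wavg))

print(shifted)
```

Wrapping the calculation in a named function also avoids nesting one formula lambda inside another within across().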