How to rewrite a weighted dplyr::top_n() call using the dplyr::slice_* functions

its*_*ami 0 r dplyr

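I want to replace the superseded top_n() call in the code below with the recommended slice_max() function, but I can't see how to request weighting with slice_max().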

top10 <-
  structure(
    list(
      Variable = c("tfidf_text_crossing", "tfidf_text_best",
                   "tfidf_text_amazing", "tfidf_text_fantastic",
                   "tfidf_text_player", "tfidf_text_great",
                   "tfidf_text_10", "tfidf_text_progress",
                   "tfidf_text_relaxing", "tfidf_text_fix"),
      Importance = c(0.428820580430941, 0.412741988094224,
                     0.368676982306671, 0.361409225854695,
                     0.331176924533776, 0.307393456208119,
                     0.293945850296236, 0.286313554816565,
                     0.283457020779205, 0.27899280757397),
      Sign = c(tfidf_text_crossing = "POS", tfidf_text_best = "POS",
               tfidf_text_amazing = "POS", tfidf_text_fantastic = "POS",
               tfidf_text_player = "NEG", tfidf_text_great = "POS",
               tfidf_text_10 = "POS", tfidf_text_progress = "NEG",
               tfidf_text_relaxing = "POS", tfidf_text_fix = "NEG")
    ),
    row.names = c(NA, -10L),
    class = c("vi", "tbl_df", "tbl", "data.frame"),
    type = "|coefficient|"
  )

suppressPackageStartupMessages(library(dplyr))

top10 |>
  group_by(Sign) |>
  top_n(2, wt = abs(Importance))
#> # A tibble: 4 × 3
#> # Groups:   Sign [2]
#>   Variable            Importance Sign 
#>   <chr>                    <dbl> <chr>
#> 1 tfidf_text_crossing      0.429 POS  
#> 2 tfidf_text_best          0.413 POS  
#> 3 tfidf_text_player        0.331 NEG  
#> 4 tfidf_text_progress      0.286 NEG

Created on 2023-01-06 with reprex v2.0.2


I think I get the right answer with:

top10 |>
  group_by(Sign) |>
  arrange(desc(abs(Importance))) |>
  slice_head(n = 2)

but this is much less readable for the novices I teach. Is there an obvious way to do this with the slice_* functions?


r2e*_*ans 6

You can handle the arrange() step with the order_by= argument, which should make it more readable (and it does mimic your top_n code).

top10 |>
  group_by(Sign) |>
  slice_max(n = 2, order_by = abs(Importance))
# # A tibble: 4 × 3
# # Groups:   Sign [2]
#   Variable            Importance Sign 
#   <chr>                    <dbl> <chr>
# 1 tfidf_text_player        0.331 NEG  
# 2 tfidf_text_progress      0.286 NEG  
# 3 tfidf_text_crossing      0.429 POS  
# 4 tfidf_text_best          0.413 POS  
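One detail that may matter when teaching this: like top_n(), slice_max() keeps ties at the cut-off by default, so a group can return more than n rows. Below is a minimal sketch of that behaviour, assuming dplyr 1.0.0 or later and R 4.1 or later for the native pipe. The `scores` tibble and its column names are hypothetical toy data, not taken from the question.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data: signed values, with a tie on |value| in group "a"
scores <- tibble(
  grp   = c("a", "a", "a", "b", "b", "b"),
  value = c(-0.9, 0.5, 0.5, 0.2, -0.7, 0.1)
)

# Default (with_ties = TRUE): rows tied at the cut-off are all kept,
# so group "a" contributes three rows rather than two
kept_ties <- scores |>
  group_by(grp) |>
  slice_max(order_by = abs(value), n = 2)

# with_ties = FALSE: at most n rows per group
exact <- scores |>
  group_by(grp) |>
  slice_max(order_by = abs(value), n = 2, with_ties = FALSE)

nrow(kept_ties)  # 5
nrow(exact)      # 4
```

Note also that slice_max() returns each group's rows sorted by the ordering expression, which is why the row order in the answer's output differs from the top_n() output in the question.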