How to sum values across different rows and collapse them into one row (R)

vsx*_*x99 4 r dataframe

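My employee payment data has one row per payment record. The variables describe the name, the payment reason, and the value.

My end goal is a data frame with one row per employee, where the payments are summed by type and each payment type has its own variable.

See the example: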

data <- data.frame("name" = c("John", "John", "John", "Marie", "Marie", "Alex"),
               "payment.reason" = c("bonus", "bonus", "commission", "commission", "commission", "discretionary bonus"),
               "value" = c(1000, 5000, 2500, 1500, 500, 2500))

It looks like this:

   name      payment.reason value
1  John               bonus  1000
2  John               bonus  5000
3  John          commission  2500
4 Marie          commission  1500
5 Marie          commission   500
6  Alex discretionary bonus  2500

This is the end result I'm after:

goal
   name bonus commission discretionary.bonus
1  John  6000       2500                   0
2 Marie     0       2000                   0
3  Alex     0          0                2500

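I know I need to spread the data to push the payment.reason values into columns, but I'm struggling to work out how to sum the values of each payment type for each person and have the data grouped by person.

Thanks in advance!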

MSR*_*MSR 7

We can do all of this using pivot_wider in tidyr:

library(tidyr)

pivot_wider(data, id_cols = name, names_from = payment.reason, values_from = value, values_fn = list(value = sum))
#> # A tibble: 3 x 4
#>   name  bonus commission `discretionary bonus`
#>   <fct> <dbl>      <dbl>                 <dbl>
#> 1 John   6000       2500                    NA
#> 2 Marie    NA       2000                    NA
#> 3 Alex     NA         NA                  2500

Created on 2019-12-23 by the reprex package (v0.3.0)

Note that (as in @AlexB's answer) you can add values_fill = list(value = 0) if you need explicit 0s instead of NA.
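As a minimal sketch of that note, the call below combines values_fn and values_fill in one step. It rebuilds the example data from the question so it can run on its own; the object name wide is only an illustrative choice, and the list form of both arguments is used on the assumption that it is accepted by older and newer tidyr releases alike.

```r
library(tidyr)

# Example data from the question, rebuilt so the snippet is self-contained
data <- data.frame(
  name = c("John", "John", "John", "Marie", "Marie", "Alex"),
  payment.reason = c("bonus", "bonus", "commission", "commission",
                     "commission", "discretionary bonus"),
  value = c(1000, 5000, 2500, 1500, 500, 2500)
)

# 'wide' is a hypothetical name for the result
wide <- pivot_wider(
  data,
  id_cols = name,
  names_from = payment.reason,
  values_from = value,
  values_fn = list(value = sum),   # sum duplicate name/payment.reason pairs
  values_fill = list(value = 0)    # fill missing combinations with 0, not NA
)

wide
```

With values_fill set, a person who never received a given payment type gets 0 in that column rather than NA, which matches the goal table in the question.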

  • Nice use of "values_fn" (2 upvotes)

akr*_*run 6

We can use dcast from data.table and make use of fun.aggregate:

library(data.table)
dcast(setDT(data), name ~ payment.reason, value.var = 'value', fun.aggregate = sum)
#    name bonus commission discretionary bonus
#1:  Alex     0          0                2500
#2:  John  6000       2500                   0
#3: Marie     0       2000                   0

Or with xtabs from base R:

xtabs(value ~ name + payment.reason, data)
#    payment.reason
#name    bonus commission discretionary bonus
#  Alex      0          0                2500
#  John   6000       2500                   0
#  Marie     0       2000                   0
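xtabs returns a contingency table rather than a data frame, so if a data frame with a name column is needed (as in the goal shown in the question), one possible follow-up is sketched below. It uses only base R; the object names tab and wide are illustrative, and the data is rebuilt so the snippet runs on its own.

```r
# Example data from the question, rebuilt so the snippet is self-contained
data <- data.frame(
  name = c("John", "John", "John", "Marie", "Marie", "Alex"),
  payment.reason = c("bonus", "bonus", "commission", "commission",
                     "commission", "discretionary bonus"),
  value = c(1000, 5000, 2500, 1500, 500, 2500)
)

# Sum value for every name / payment.reason combination
tab <- xtabs(value ~ name + payment.reason, data)

# Convert the 2-D table to a data frame, keeping the wide layout
wide <- as.data.frame.matrix(tab)

# Move the row names (employee names) into an ordinary column
wide$name <- rownames(wide)
rownames(wide) <- NULL

wide
```

Calling as.data.frame(tab) directly would give the long format (one row per combination) instead, which is why as.data.frame.matrix is used here.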


Sve*_*enB 5

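library(dplyr)  # provides %>%, group_by() and summarise()
library(tidyr)  # provides pivot_wider()

data %>%
  group_by(name, payment.reason) %>%
  summarise(value = sum(value)) %>%   # total per person and payment type
  pivot_wider(id_cols = name, names_from = payment.reason, values_from = value, values_fill = list(value = 0))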

  name  `discretionary bonus` bonus commission
  <fct>                 <dbl> <dbl>      <dbl>
1 Alex                   2500     0          0
2 John                      0  6000       2500
3 Marie                     0     0       2000