dplyr: summarize by string

Question

I have a data frame that contains numeric and string values, for example:

mydf <- data.frame(id = c(1, 2, 1, 2, 3, 4),
                   value = c(32, 12, 43, 6, 50, 20),
                   text = c('A', 'B', 'A', 'B', 'C', 'D'))

Each value of the id variable always corresponds to the same value of the text variable; for example, id == 1 always goes with text == 'A'.

Now I want to summarize this data frame by id (or by text, since they amount to the same grouping):

mydf %>%
  group_by(id) %>%
  summarize(mean_value = mean(value))

This works fine, but I also need the text variable in the result, because I want to run text analysis on it afterwards.

However, when I add text to the dplyr pipeline:

mydf %>%
  group_by(id) %>%
  summarize(mean_value = mean(value),
            text = text)

I get the following error:

Error: expecting a single value

Since text is always the same for a given id, is it possible to attach it to the summarized data frame?

Answer 1

zx8*_*754 5

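The summarize function needs to apply some function to its input that returns a single value per group. So we can either leave text out of summarize and keep it together with id inside group_by, or use the first function inside summarize: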

# text should be in group_by to show up in result
mydf %>%
  group_by(id, text) %>%
  summarize(mean_value = mean(value))

# or within summarise use first function, to take the first value when grouped
mydf %>%
  group_by(id) %>%
  summarize(mean_value = mean(value),
            text = first(text))
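
Both approaches rely on every id having exactly one text value. As a rough sketch, that assumption can be checked before summarizing, and the two results compared. The names check, by_both and by_first are only illustrative, and the .groups argument assumes dplyr 1.0.0 or later:

```r
library(dplyr)

mydf <- data.frame(id = c(1, 2, 1, 2, 3, 4),
                   value = c(32, 12, 43, 6, 50, 20),
                   text = c('A', 'B', 'A', 'B', 'C', 'D'))

# Count the distinct text values per id; stop if any id has more than one
check <- mydf %>%
  group_by(id) %>%
  summarize(n_text = n_distinct(text))
stopifnot(all(check$n_text == 1))

# Variant 1: group by both columns, then drop the grouping
# (.groups is assumed to be available, i.e. dplyr >= 1.0.0)
by_both <- mydf %>%
  group_by(id, text) %>%
  summarize(mean_value = mean(value), .groups = "drop")

# Variant 2: group by id only and take the first text value in each group
by_first <- mydf %>%
  group_by(id) %>%
  summarize(mean_value = mean(value),
            text = first(text))

# Both variants should give the same mean per id
stopifnot(isTRUE(all.equal(by_both$mean_value, by_first$mean_value)))
```

If the check fails, first(text) would silently pick one of several values, while grouping by id and text would produce more than one row for that id, so the two variants would no longer agree.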
