I have a table:
id time
1 1
1 2
1 5
2 3
2 2
2 7
3 8
3 3
3 14
I want to convert it to:
id first last
1 1 5
2 3 7
3 8 14
Please help!
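We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1), group by 'id', and take the first and last values of 'time' within each group (.N is the number of rows in the group, so time[.N] is the last value):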
library(data.table)
setDT(df1)[, list(firstocc = time[1L], lastocc = time[.N]),
by = id]
Or with dplyr, using the same approach:
library(dplyr)
df1 %>%
group_by(id) %>%
summarise(firstocc = first(time), lastocc = last(time))
Or with base R (no packages needed):
do.call(rbind, lapply(split(df1, df1$id),
function(x) data.frame(id = x$id[1],
firstocc = x$time[1], lastocc = x$time[nrow(x)])))
If we instead need the min and max values per 'id' (which does not match the expected output in the question), the data.table option is
setDT(df1)[, setNames(as.list(range(time)),
c('firstOcc', 'lastOcc')) ,id]
and the dplyr option is
df1 %>%
group_by(id) %>%
summarise(firstocc = min(time), lastocc = max(time))
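To see why the first/last and min/max variants are not interchangeable, here is a minimal base R sketch. It assumes df1 holds exactly the rows shown in the question (the question never gives its construction, so the data.frame() call below is a reconstruction), and the name comparison is only illustrative.

```r
# Rebuild the question's table; df1 is the name the snippets above assume.
df1 <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  time = c(1, 2, 5, 3, 2, 7, 8, 3, 14)
)

# Put both interpretations side by side, one row per id.
# tapply() returns groups in sorted id order, matching sort(unique(df1$id)).
comparison <- data.frame(
  id    = sort(unique(df1$id)),
  first = as.vector(tapply(df1$time, df1$id, function(x) x[1])),         # first row of the group
  last  = as.vector(tapply(df1$time, df1$id, function(x) x[length(x)])), # last row of the group
  min   = as.vector(tapply(df1$time, df1$id, min)),                      # smallest value
  max   = as.vector(tapply(df1$time, df1$id, max))                       # largest value
)

print(comparison)
```

For id 2 and id 3 the first value (3 and 8) differs from the minimum (2 and 3) because the rows are not sorted by time within each id, so only the first/last version reproduces the expected output.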