I have a data frame like this:
ID Cont
1 a
1 a
1 b
2 a
2 c
2 d
I need to report the frequency of Cont by ID. The output should be:
ID Cont Freq
1 a 2
1 b 1
2 a 1
2 c 1
2 d 1
Using dplyr, you can group_by both ID and Cont, then summarise with n() to get Freq:
library(dplyr)
res <- df %>% group_by(ID,Cont) %>% summarise(Freq=n())
##Source: local data frame [5 x 3]
##Groups: ID [?]
##
## ID Cont Freq
## <int> <fctr> <int>
##1 1 a 2
##2 1 b 1
##3 2 a 1
##4 2 c 1
##5 2 d 1
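As a shorter variant of the dplyr approach, `count()` does the grouping and counting in one call, and its `name` argument sets the name of the count column. This is a minimal sketch that assumes a reasonably recent dplyr is installed; `res2` is just an illustrative name, and the data is rebuilt so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(ID = c(1L, 1L, 1L, 2L, 2L, 2L),
                 Cont = factor(c("a", "a", "b", "a", "c", "d")))

# count() groups by ID and Cont and counts the rows in each group;
# name = "Freq" replaces the default column name "n"
res2 <- count(df, ID, Cont, name = "Freq")
res2
```

The result has one row per observed ID/Cont pair, the same as the group_by/summarise version.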
Data:
df <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), Cont = structure(c(1L,
1L, 2L, 1L, 3L, 4L), .Label = c("a", "b", "c", "d"), class = "factor")), .Names = c("ID",
"Cont"), class = "data.frame", row.names = c(NA, -6L))
## ID Cont
##1 1 a
##2 1 a
##3 1 b
##4 2 a
##5 2 c
##6 2 d
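Using data.table:
library(data.table)
# setDT() converts df to a data.table by reference; .N is the row count per group
setDT(df)[, .(Freq = .N), by = .(ID, Cont)]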
# ID Cont Freq
# 1: 1 a 2
# 2: 1 b 1
# 3: 2 a 1
# 4: 2 c 1
# 5: 2 d 1
In base R:
df1 <- subset(as.data.frame(table(df)), Freq != 0)
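The `Freq != 0` filter is there because `table()` cross-tabulates every ID with every level of Cont, including combinations that never occur in the data. A small sketch of that, using the same data (rebuilt here so the block runs on its own; `full` is an illustrative name):

```r
# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(ID = c(1L, 1L, 1L, 2L, 2L, 2L),
                 Cont = factor(c("a", "a", "b", "a", "c", "d")))

# table() counts every ID x Cont combination, including absent ones
full <- as.data.frame(table(df))
nrow(full)            # 2 IDs x 4 Cont levels = 8 rows
sum(full$Freq == 0)   # 3 combinations never occur

# Dropping the zero counts leaves only the observed pairs
df1 <- subset(full, Freq != 0)
nrow(df1)             # 5 rows
```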
If you want to order by ID, add this line:
df1[order(df1$ID), ]
ID Cont Freq
1 1 a 2
3 1 b 1
2 2 a 1
6 2 c 1
8 2 d 1
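A slightly fuller sketch of the ordering step, in case it helps. It sorts by ID and then Cont, renumbers the rows, and converts ID back to integer, since `table()` turns it into a factor. The name `sorted` and the two cleanup steps are optional additions, not part of the answer above.

```r
# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(ID = c(1L, 1L, 1L, 2L, 2L, 2L),
                 Cont = factor(c("a", "a", "b", "a", "c", "d")))
df1 <- subset(as.data.frame(table(df)), Freq != 0)

# order() returns row positions, so it goes in the row slot (before the comma)
sorted <- df1[order(df1$ID, df1$Cont), ]

# Optional: renumber the rows 1..n instead of keeping the original row names
rownames(sorted) <- NULL

# Optional: table() made ID a factor; convert it back to integer
sorted$ID <- as.integer(as.character(sorted$ID))
sorted
```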