Frequency count of two columns in R

Sun*_*nny 26 r

I have two columns in a data frame:

2010  1
2010  1
2010  2
2010  2
2010  3
2011  1
2011  2

I want to count how often each combination of the two columns occurs and get the result in this format:

    y  m  Freq
 2010  1     2
 2010  2     2
 2010  3     1
 2011  1     1
 2011  2     1

dan*_*kas 34

If your data is a data frame `df` with columns `y` and `m`:

library(plyr)
# Count the rows in each (y, m) group; ddply names the count column V1 by default
counts <- ddply(df, .(y, m), nrow)
names(counts) <- c("y", "m", "Freq")

  • @DMactheDestroyer LOL. Try the `SQL` tag. (3 upvotes)

Ric*_*ven 10

I haven't seen a dplyr answer yet. The code is quite simple.

library(dplyr)
rename(count(df, y, m), Freq = n)
# Source: local data frame [5 x 3]
# Groups: V1 [?]
#
#       y     m  Freq
#   (int) (int) (int)
# 1  2010     1     2
# 2  2010     2     2
# 3  2010     3     1
# 4  2011     1     1
# 5  2011     2     1

Data:

df <- structure(list(y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 
2011L), m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)), .Names = c("y", "m"
), class = "data.frame", row.names = c(NA, -7L))

  • 2022 update: `count(df, y, m, name = "Freq")` (3 upvotes)

Ric*_*ard 8

A more idiomatic data.table version of @ugh's answer is:

library(data.table) # load the package
df <- data.frame(y = c(rep(2010, 5), rep(2011, 2)), m = c(1, 1, 2, 2, 3, 1, 2)) # set up the data
dt <- as.data.table(df) # convert to a data.table
dt[, list(Freq = .N), by = list(y, m)] # .N counts rows per group; list() names the column directly


Ksh*_*tij 5

Using sqldf:

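library(sqldf)
# sqldf resolves the table name to a data frame in the calling environment,
# so FROM must name an existing data frame (df, as in the other answers)
sqldf("SELECT y, m, COUNT(*) AS Freq
       FROM df
       GROUP BY y, m")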
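
For comparison, the same count can probably be done in base R without any extra package. This is only a minimal sketch, not taken from any of the answers above: it assumes the columns are named `y` and `m` as in the question's desired output, and it adds a helper column of ones (called `Freq` here purely for illustration) that `aggregate()` then sums within each group.

```r
# Example data from the question (column names y and m are assumed)
df <- data.frame(
  y = c(2010, 2010, 2010, 2010, 2010, 2011, 2011),
  m = c(1, 1, 2, 2, 3, 1, 2)
)

# Add a helper column of ones and sum it within each (y, m) group
counts <- aggregate(Freq ~ y + m, data = transform(df, Freq = 1), FUN = sum)

# aggregate() varies the first grouping column fastest, so sort by y, then m
counts <- counts[order(counts$y, counts$m), ]
rownames(counts) <- NULL
print(counts)
```

Unlike `as.data.frame(table(df))`, this approach returns only the combinations that actually occur, so no zero-count rows need to be filtered out afterwards.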