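Suppose I have two columns of data. The first contains categories such as "First", "Second", "Third", and so on. The second contains numbers representing how many times I saw the corresponding category.
For example: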
Category Frequency
First 10
First 15
First 5
Second 2
Third 14
Third 20
Second 3
I want to sort the data by Category and sum the Frequency values within each category:
Category Frequency
First 30
Second 5
Third 34
How would I do this in R?
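One possible way to get the grouped totals described above is base R's `aggregate()` with a formula. This is only a minimal sketch: the question does not name its data frame, so `freq_df` and `totals` are hypothetical names chosen for illustration.

```r
# Sample data copied from the question; `freq_df` is a hypothetical name
freq_df <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# Sum Frequency within each Category.
# aggregate() returns one row per group, ordered by the grouping column.
totals <- aggregate(Frequency ~ Category, data = freq_df, FUN = sum)
print(totals)
```

With the sample data this should give one row each for First, Second, and Third, with totals 30, 5, and 34.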
I have a data frame:
ID <- c("A","A","A","A","B","B","B","B")
Type <- c(45,45,46,46,45,45,46,46)
Point_A <- c(10,NA,30,40,NA,80,NA,100)
Point_B <- c(NA,32,43,NA,65,11,NA,53)
df <- data.frame(ID,Type,Point_A,Point_B)
ID Type Point_A Point_B
1 A 45 10 NA
2 A 45 NA 32
3 A 46 30 43
4 A 46 40 NA
5 B 45 NA 65
6 B 45 80 11
7 B 46 NA NA
8 B 46 100 53
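From this post I learned how to aggregate the data by ID for a single column.
I am currently using sqldf to sum the rows grouped by ID and Type. While this works for me, it is very slow on larger datasets.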
library(sqldf)
df1 <- sqldf("SELECT ID, Type, Sum(Point_A) as Point_A, Sum(Point_B) as Point_B
FROM df
GROUP BY ID, …
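As a point of comparison with the sqldf query, the same grouping by ID and Type can be sketched in base R with `aggregate()`. This is an illustration under stated assumptions, not a benchmarked replacement: it assumes that missing values should be skipped when summing (as SQL's `SUM` does), and whether it is actually faster on larger data would need to be measured.

```r
# Data frame from the question
ID      <- c("A", "A", "A", "A", "B", "B", "B", "B")
Type    <- c(45, 45, 46, 46, 45, 45, 46, 46)
Point_A <- c(10, NA, 30, 40, NA, 80, NA, 100)
Point_B <- c(NA, 32, 43, NA, 65, 11, NA, 53)
df <- data.frame(ID, Type, Point_A, Point_B)

# cbind() on the left-hand side sums both columns in one call.
# na.action = na.pass keeps rows that contain NA, so that
# na.rm = TRUE (passed on to sum) drops only the missing cells.
df1 <- aggregate(cbind(Point_A, Point_B) ~ ID + Type,
                 data = df, FUN = sum,
                 na.rm = TRUE, na.action = na.pass)
print(df1)
```

One difference to keep in mind: `sum(..., na.rm = TRUE)` returns 0 for a group whose values are all `NA`, whereas SQL's `SUM` returns NULL in that case. No group in the sample data is affected by this.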