I want to average the values for each ID, but not all IDs have the same number of values. How can I do this in R?
I have two columns, ID and Value:
ID Value
1000 0.51
1000 0.01
1001 0.81
1001 0.41
1001 0.62
1002 0.98
1002 0.12
1002 0.15
1003 0.12
... ...
You can try by():
> with(df, by(Value, ID, mean))
# ID: 1000
# [1] 0.26
# ------------------------------------------------------------
# ID: 1001
# [1] 0.6133333
# ------------------------------------------------------------
# ID: 1002
# [1] 0.4166667
# ------------------------------------------------------------
# ID: 1003
# [1] 0.12
Or aggregate():
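If a plain named vector is easier to work with than the printed by object, base R's tapply() is one alternative. A minimal sketch that rebuilds the sample data so it runs on its own (the variable name means is just illustrative):

```r
# Sample data, same values as in the question
df <- data.frame(
  ID = c(1000L, 1000L, 1001L, 1001L, 1001L, 1002L, 1002L, 1002L, 1003L),
  Value = c(0.51, 0.01, 0.81, 0.41, 0.62, 0.98, 0.12, 0.15, 0.12)
)

# Apply mean() to Value within each ID; groups may have different sizes
means <- tapply(df$Value, df$ID, mean)

# The result is indexed by the ID as a character string
means["1001"]
```

The names of the result are the IDs converted to character, so look-ups use "1001" rather than 1001.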
> aggregate( Value ~ ID, df, mean)
# ID Value
# 1 1000 0.2600000
# 2 1001 0.6133333
# 3 1002 0.4166667
# 4 1003 0.1200000
Or use data.table (if you need fast computation on large data sets):
> library(data.table)
> setDT(df)[, mean(Value), by = ID]
# ID V1
# 1: 1000 0.2600000
# 2: 1001 0.6133333
# 3: 1002 0.4166667
# 4: 1003 0.1200000
Data
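The data.table result column is named V1 by default. A minimal sketch, assuming the data.table package is installed, that gives it a descriptive name instead (MeanValue and res are illustrative names, not anything from the original answer):

```r
library(data.table)  # assumes data.table is installed

# Sample data, same values as in the question
dt <- data.table(
  ID = c(1000L, 1000L, 1001L, 1001L, 1001L, 1002L, 1002L, 1002L, 1003L),
  Value = c(0.51, 0.01, 0.81, 0.41, 0.62, 0.98, 0.12, 0.15, 0.12)
)

# .() builds a list, so the aggregated column can be named explicitly
res <- dt[, .(MeanValue = mean(Value)), by = ID]
print(res)
```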
df <- structure(list(ID = c(1000L, 1000L, 1001L, 1001L, 1001L, 1002L,
1002L, 1002L, 1003L), Value = c(0.51, 0.01, 0.81, 0.41, 0.62,
0.98, 0.12, 0.15, 0.12)), .Names = c("ID", "Value"),
class = "data.frame", row.names = c(NA, -9L))