How do I select among duplicate rows (duplicated on the first column only) based on the max in the second column:
data <- data.frame(a = c(1, 3, 3, 3), b = c(1, 4, 6, 3), d = c(1, 5, 7, 1))
Input:
a b d
1 1 1
3 4 5
3 6 7
3 3 1
Desired output:
a b d
1 1 1
3 6 7
In the second column, 6 is the maximum among 4, 6, and 3.
You can try something like the following using "dplyr":
library(dplyr)
data %>%                  ## Your data
  group_by(a) %>%         ## grouped by "a"
  filter(b == max(b))     ## filtered to only include the rows where b == max(b)
# Source: local data frame [2 x 3]
# Groups: a
#
# a b d
# 1 1 1 1
# 2 3 6 7
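For readers who do not have dplyr installed, the same group-wise comparison can be sketched in base R. This is not part of the original answer; it assumes the `data` frame from the question, and the name `base_result` is only illustrative.

```r
# Rebuild the data frame from the question
data <- data.frame(a = c(1, 3, 3, 3), b = c(1, 4, 6, 3), d = c(1, 5, 7, 1))

# ave() returns, for every row, the maximum of b within that row's group of a
group_max <- ave(data$b, data$a, FUN = max)

# Keep only the rows whose b equals the maximum of their group
# (base_result is a hypothetical name used for this sketch)
base_result <- data[data$b == group_max, ]
print(base_result)
```

Like the `filter()` call, this compares each `b` with its group maximum, so it keeps the same rows.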
Note, however, that if more than one row matches, b == max(b) returns all of them. So an alternative might be:
data %>%                  ## Your data
  group_by(a) %>%         ## grouped by "a"
  arrange(desc(b)) %>%    ## sorted by descending values of "b"
  slice(1)                ## with just the first row extracted
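The difference between the two approaches only shows up when a group has a tied maximum. The sketch below uses a hypothetical data frame `tied` (not from the question) in which two rows of group `a == 3` share `b == 6`. The last variant assumes dplyr 1.0.0 or later, where `slice_max()` is available.

```r
library(dplyr)

# Hypothetical data with a tie: two rows in group a == 3 have b == 6
tied <- data.frame(a = c(1, 3, 3, 3), b = c(1, 6, 6, 3), d = c(1, 5, 7, 1))

# filter() keeps every row that equals the group maximum, so both tied rows stay
all_max <- tied %>%
  group_by(a) %>%
  filter(b == max(b)) %>%
  ungroup()

# arrange() + slice(1) keeps exactly one row per group
one_max <- tied %>%
  group_by(a) %>%
  arrange(desc(b)) %>%
  slice(1) %>%
  ungroup()

# Assumes dplyr >= 1.0.0: slice_max() with with_ties = FALSE also keeps one row per group
one_max_alt <- tied %>%
  group_by(a) %>%
  slice_max(b, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(all_max)      # 3
nrow(one_max)      # 2
nrow(one_max_alt)  # 2
```

Which approach fits depends on whether tied rows should all be kept or collapsed to a single row per group.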