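R provides max and min, but I don't see a really fast way to find another value in the order, apart from sorting the whole vector and then picking value x from it.
Is there a faster way to get the second highest value (for example)?
Thanks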
Rob*_*man 190
Use the partial argument of sort(). For the second highest value:
n <- length(x)
sort(x,partial=n-1)[n-1]
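To illustrate what partial does, here is a small sketch on a made-up vector (the values are only an example). After a partial sort at position n-1, that position holds the element a full sort would put there, while the remaining elements are not guaranteed to be in order.

```r
# Example vector, chosen only for illustration
x <- c(12.45, 34, 4, 0, -234, 45.6, 4)
n <- length(x)

# Partial sort: only position n-1 is guaranteed to hold its final sorted value
second_highest <- sort(x, partial = n - 1)[n - 1]

# A full sort places the same element at that position
same_as_full_sort <- identical(second_highest, sort(x)[n - 1])

second_highest
same_as_full_sort
```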
Pao*_*olo 49
A slightly slower alternative, just for the record:
x <- c(12.45,34,4,0,-234,45.6,4)
max( x[x!=max(x)] )
min( x[x!=min(x)] )
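One point worth checking with this approach is how it treats ties. The sketch below (example data only) suggests that filtering with x != max(x) removes every copy of the maximum, so it returns the second-largest distinct value, whereas the partial-sort approach counts duplicates as separate elements.

```r
# Example vector with a duplicated maximum
x <- c(5, 9, 9, 2)
n <- length(x)

# Drops every copy of the maximum: second-largest distinct value
distinct_second <- max(x[x != max(x)])

# Counts duplicates separately: the second-highest position is another 9
positional_second <- sort(x, partial = n - 1)[n - 1]

distinct_second
positional_second
```

If every element of x is equal, x[x != max(x)] is empty, and max() on an empty vector returns -Inf with a warning, so that case may need separate handling.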
Zac*_*ach 29
I wrapped Rob's answer up in a slightly more general function, which can be used to find the 2nd, 3rd, 4th (etc.) max:
maxN <- function(x, N=2){
  len <- length(x)
  if(N>len){
    warning('N greater than length(x). Setting N=length(x)')
    N <- length(x)
  }
  sort(x,partial=len-N+1)[len-N+1]
}
maxN(1:10)
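Because sort() drops NA values by default (na.last = NA), an index computed from length(x) may no longer line up when x contains missing values. A minimal sketch of a variant that removes NAs first follows; the name nth_largest and its argument checks are assumptions for illustration, not part of the answer above.

```r
# Hypothetical helper: n-th largest value, ignoring missing values
nth_largest <- function(x, n = 2) {
  x <- x[!is.na(x)]            # drop NAs so the length matches what sort() sees
  len <- length(x)
  if (n < 1 || n > len) {
    stop("n must be between 1 and the number of non-missing values")
  }
  k <- len - n + 1             # position of the n-th largest in ascending order
  sort(x, partial = k)[k]
}

nth_largest(c(3, NA, 10, 7, NA, 1), 2)
```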
Ste*_*nos 17
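Rfast has a function called nth_element (called as Rfast::nth in the code below) that does exactly what you ask and is faster than all of the implementations discussed above.
Also, the partial-sort-based methods discussed above, as written, do not cover finding the k smallest values.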
Rfast::nth(x, 5, descending = T)
will return the 5th largest element of x, while
Rfast::nth(x, 5, descending = F)
will return the 5th smallest element of x.
Below are benchmarks against the most popular answers.
For ten thousand numbers:
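For comparison, the k-th smallest value can also be sketched in base R with the same partial-sort idea used in the earlier answers, simply by indexing from the low end. The helper name nth_smallest is an assumption for illustration and the vector is example data.

```r
# Hypothetical base R helper: k-th smallest value via partial sorting
nth_smallest <- function(x, k = 2) {
  stopifnot(k >= 1, k <= length(x))
  sort(x, partial = k)[k]      # position k holds the k-th smallest value
}

x <- c(12.45, 34, 4, 0, -234, 45.6, 4)
nth_smallest(x, 2)
nth_smallest(x, 5)
```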
N = 10000
x = rnorm(N)
maxN <- function(x, N=2){
  len <- length(x)
  if(N>len){
    warning('N greater than length(x). Setting N=length(x)')
    N <- length(x)
  }
  sort(x,partial=len-N+1)[len-N+1]
}
microbenchmark::microbenchmark(
  Rfast = Rfast::nth(x,5,descending = T),
  maxN = maxN(x,5),
  order = x[order(x, decreasing = T)[5]]
)
Unit: microseconds
  expr      min       lq      mean   median        uq       max neval
 Rfast  160.364  179.607  202.8024  194.575  210.1830   351.517   100
  maxN  396.419  423.360  559.2707  446.452  487.0775  4949.452   100
 order 1288.466 1343.417 1746.7627 1433.221 1500.7865 13768.148   100
For 1 million numbers:
N = 1e6 #evaluates to 1 million
x = rnorm(N)
microbenchmark::microbenchmark(
  Rfast = Rfast::nth(x,5,descending = T),
  maxN = maxN(x,5),
  order = x[order(x, decreasing = T)[5]]
)
Unit: milliseconds
  expr      min        lq      mean   median        uq       max neval
 Rfast  89.7722  93.63674  114.9893 104.6325  120.5767  204.8839   100
  maxN 150.2822 207.03922  235.3037 241.7604  259.7476  336.7051   100
 order 930.8924 968.54785 1005.5487 991.7995 1031.0290 1164.9129   100
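Before reading too much into timings like these, it can help to confirm that the candidates return the same value. The check below is a base R sketch only (it leaves out Rfast so it runs without extra packages); the seed and vector size are arbitrary choices for the example.

```r
set.seed(42)                   # arbitrary seed, only for reproducibility
x <- rnorm(1000)

# Same maxN definition as in the benchmark above
maxN <- function(x, N = 2) {
  len <- length(x)
  if (N > len) {
    warning('N greater than length(x). Setting N=length(x)')
    N <- length(x)
  }
  sort(x, partial = len - N + 1)[len - N + 1]
}

via_partial <- maxN(x, 5)
via_order   <- x[order(x, decreasing = TRUE)[5]]
via_sort    <- sort(x, decreasing = TRUE)[5]

# All three should pick out the same element of x
all_agree <- identical(via_partial, via_order) && identical(via_order, via_sort)
all_agree
```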
Dav*_*yan 15
Here is a simple way to find the indices of the N smallest/largest values in a vector (example for N = 3):
N <- 3
N smallest:
ndx <- order(x)[1:N]
N largest:
ndx <- order(x, decreasing = T)[1:N]
You can then extract the values with:
x[ndx]
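A small worked run of the steps above, on an example vector that is not part of the answer, might look like this:

```r
# Example vector, chosen only for illustration
x <- c(12.45, 34, 4, 0, -234, 45.6, 4)
N <- 3

# Indices of the N smallest and N largest values
smallest_idx <- order(x)[seq_len(N)]
largest_idx  <- order(x, decreasing = TRUE)[seq_len(N)]

# The corresponding values
x[smallest_idx]
x[largest_idx]
```

Here seq_len(N) plays the same role as 1:N, and it also behaves sensibly if N is 0.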
Here you go... kit is the obvious winner!
N = 1e6
x = rnorm(N)
maxN <- function(x, N=2){
  len <- length(x)
  if(N>len){
    warning('N greater than length(x). Setting N=length(x)')
    N <- length(x)
  }
  sort(x,partial=len-N+1)[len-N+1]
}
microbenchmark::microbenchmark(
  Rfast = Rfast::nth(x,5,descending = T),
  maxN = maxN(x,5),
  order = x[order(x, decreasing = T)[5]],
  kit = x[kit::topn(x, 5L,decreasing = T)[5L]]
)
# Unit: milliseconds
#   expr       min        lq     mean    median        uq        max neval
#  Rfast 12.311168 12.473771 16.36982 12.702134 16.110779 102.749873   100
#   maxN 12.922118 13.124358 17.49628 18.977537 20.053139  28.928694   100
#  order 50.443100 50.926975 52.54067 51.270163 52.323116  66.561606   100
#    kit  1.177202  1.216371  1.29542  1.240228  1.297286   2.771715   100
Edit: I forgot that kit::topn has a hasna option... let's do another run.
microbenchmark::microbenchmark(
  Rfast = Rfast::nth(x,5,descending = T),
  maxN = maxN(x,5),
  order = x[order(x, decreasing = T)[5]],
  kit = x[kit::topn(x, 5L,decreasing = T)[5L]],
  kit2 = x[kit::topn(x, 5L,decreasing = T,hasna = F)[5L]],
  unit = "ms"
)
# Unit: milliseconds
#   expr       min        lq       mean     median        uq       max neval
#  Rfast 13.194314 13.358787 14.7227116 13.4560340 14.551194 24.524105   100
#   maxN  7.378960  7.527661 10.0747803  7.7119715 12.217756 67.409526   100
#  order 50.088927 50.488832 52.4714347 50.7415680 52.267003 70.062662   100
#    kit  1.180698  1.217237  1.2975441  1.2429790  1.278243  3.263202   100
#   kit2  0.842354  0.876329  0.9398055  0.9109095  0.944407  2.135903   100
小智 5
For the nth highest value,
sort(x, TRUE)[n]
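In that one-liner, the second positional argument of sort() is decreasing, so sort(x, TRUE) is a full sort in descending order. A short example with made-up data and an arbitrary n:

```r
# Example vector and rank, chosen only for illustration
x <- c(12.45, 34, 4, 0, -234, 45.6, 4)
n <- 3

# Spelled out with the argument name; equivalent to sort(x, TRUE)[n]
nth_highest <- sort(x, decreasing = TRUE)[n]
nth_highest
```

This sorts the whole vector, so it is simple but does more work than the partial-sort approaches above.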
小智 5
This is the simplest way I have found,
num <- c(5665,1615,5154,65564,69895646)
num <- sort(num, decreasing = F)
tail(num, 1) # Highest number
head(tail(num, 2),1) # Second highest number
head(tail(num, 3),1) # Third highest number
n <- 4 # Example rank, any value from 1 to length(num)
head(tail(num, n),1) # General expression for finding the nth highest number