A faster R implementation of average precision at N

Zac*_*ach 2 information-retrieval r average-precision

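The excellent Metrics package provides a function for computing average precision: apk.

The problem is that it is based on a for loop, and it is slow: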

require('Metrics')
require('rbenchmark')
actual <- 1:20000
predicted <- c(1:20, 200:600, 900:1522, 14000:32955)
benchmark(replications=10,
          apk(5000, actual, predicted),
          columns= c("test", "replications", "elapsed", "relative"))

                          test replications elapsed relative
1 apk(5000, actual, predicted)           10   53.68        1

I'm stumped as to how to vectorize this function, but I was wondering whether there is a better way to implement it in R.

flo*_*del 5

I have to agree that the implementation looks pretty bad... Try this instead:

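apk2 <- function (k, actual, predicted)  {

    # only the top k predictions are scored
    predicted <- head(predicted, k)

    # flag the first occurrence of each predicted value; repeats do not count
    is.new <- rep(FALSE, length(predicted))
    is.new[match(unique(predicted), predicted)] <- TRUE

    # a hit is a first-time prediction that appears in actual
    is.relevant <- predicted %in% actual & is.new

    # precision at each hit position, summed and normalized
    score <- sum(cumsum(is.relevant) * is.relevant / seq_along(predicted)) /
             min(length(actual), k)
    score
}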

benchmark(replications=10,
          apk(5000, actual, predicted),
          apk2(5000, actual, predicted),
          columns= c("test", "replications", "elapsed", "relative"))

#                            test replications elapsed relative
# 1  apk(5000, actual, predicted)           10  62.194 2961.619
# 2 apk2(5000, actual, predicted)           10   0.021    1.000

identical(apk(5000, actual, predicted),
          apk2(5000, actual, predicted))
# [1] TRUE
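
To see why the vectorized form agrees with a loop, it may help to trace both on a tiny input. The sketch below is illustrative only: apk_loop is a hypothetical loop-based reference written for this comparison (it is not copied from the Metrics source), and apk_vec is a variant of apk2 that assumes !duplicated(predicted) can stand in for the match(unique(...)) step, since both flag the first occurrence of each value.

```r
# Hypothetical loop-based reference, written only for this comparison.
apk_loop <- function(k, actual, predicted) {
    score <- 0
    hits <- 0
    for (i in seq_len(min(k, length(predicted)))) {
        seen.before <- predicted[i] %in% head(predicted, i - 1)
        if (predicted[i] %in% actual && !seen.before) {
            hits <- hits + 1
            score <- score + hits / i   # precision at position i
        }
    }
    score / min(length(actual), k)
}

# Variant of apk2; assumes !duplicated() is an acceptable replacement
# for the match(unique(...)) step.
apk_vec <- function(k, actual, predicted) {
    predicted <- head(predicted, k)
    is.relevant <- predicted %in% actual & !duplicated(predicted)
    sum(cumsum(is.relevant) * is.relevant / seq_along(predicted)) /
        min(length(actual), k)
}

# Small made-up input with a repeated prediction (the second 1).
actual_small    <- c(1, 2, 3, 4, 5)
predicted_small <- c(1, 9, 1, 3, 8, 2)

top <- head(predicted_small, 5)
hit <- top %in% actual_small & !duplicated(top)
print(hit)           # which of the top 5 positions count as hits
print(cumsum(hit))   # running number of hits

print(apk_loop(5, actual_small, predicted_small))
print(apk_vec(5, actual_small, predicted_small))
```

In this sketch the hits fall at positions 1 and 4, so both functions compute (1/1 + 2/4) / 5. The loop version rechecks the whole prefix at every position, which is the kind of repeated work the vectorized form avoids by doing each membership test once.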