How do I write a function to compute the h-index in R?

Question

I am new to R and am looking for a way to compute the h-index.

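The h-index is a popular metric for quantifying scientific productivity. Formally, if f is the function that gives the number of citations of each publication, we compute the h-index as follows:

First, we sort the values of f from largest to smallest. Then we look for the last position at which f is greater than or equal to the position itself (we call this position h).

For example, if a researcher has published 5 papers A, B, C, D and E with 10, 8, 5, 4 and 3 citations respectively, the h-index is 4, because the 4th paper has 4 citations and the 5th has only 3. If instead the same papers have 25, 8, 5, 3 and 3 citations, the index is 3, because the 4th paper has only 3 citations.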
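
Written compactly, the definition above can be sketched as a single formula. This assumes the citation counts are already sorted so that f(1) ≥ f(2) ≥ … ≥ f(n), where n (the number of papers) is a symbol introduced here for illustration:

```latex
h = \max \{\, i \in \{1, \dots, n\} : f(i) \ge i \,\}
```

Taking h = 0 when no position satisfies the condition is a convention assumed here, not something stated above. For the first example, f(4) = 4 ≥ 4 while f(5) = 3 < 5, so h = 4.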

Can anyone suggest a smart way to solve this?

a <- c(10,8,5,4,3)

I expect the output to be an h-index of 4.

Answer 1

Gre*_*gor 6

Assuming the input is already sorted (from largest to smallest), I would use this:

tail(which(a >= seq_along(a)), 1)
# [1] 4
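
To see why the one-liner works, it may help to evaluate it one step at a time. The sketch below uses the vector from the question; the intermediate names (`positions`, `meets`, `idx`, `h`) are introduced here purely for illustration.

```r
a <- c(10, 8, 5, 4, 3)       # citation counts, already sorted in decreasing order

positions <- seq_along(a)    # 1 2 3 4 5: the rank of each paper
meets <- a >= positions      # TRUE TRUE TRUE TRUE FALSE
idx <- which(meets)          # 1 2 3 4: ranks where citations >= rank
h <- tail(idx, 1)            # 4: the last such rank is the h-index
```

Note that if no position qualifies (for example, when every count is 0), `which()` returns an empty vector and so does `tail()`, so that case needs separate handling.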

Of course, you can wrap it in a small function:

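h_index = function(cites) {
  # No citations at all: define the h-index as 0 (assuming this is reasonable)
  if(max(cites) == 0) return(0)
  # Sort citation counts from largest to smallest
  cites = cites[order(cites, decreasing = TRUE)]
  # Last position whose citation count is at least the position itself
  tail(which(cites >= seq_along(cites)), 1)
}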

a1 = c(10, 8, 5, 4, 3)
a2 = c(10, 9, 7, 1, 1)

h_index(a1)
# [1] 4

h_index(a2)
# [1] 3

h_index(1)
# [1] 1

## set this to be 0, not sure if that's what you want
h_index(0)
# [1] 0
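
As a possible variation (not part of the answer above), the special case for zero can be avoided by counting positions instead of taking the last one. Once the counts are sorted in decreasing order, the condition `cites >= position` can only hold for a leading run of positions, so the number of `TRUE` values equals the h-index. The function name `h_index_sum` is hypothetical, and dropping `NA` values is an assumption about the desired behaviour.

```r
# Hypothetical alternative: count how many ranks satisfy citations >= rank.
h_index_sum <- function(cites) {
  # sort() drops NA values by default (na.last = NA)
  sorted <- sort(cites, decreasing = TRUE)
  # TRUE only for a leading run of ranks, so the count is the h-index
  sum(sorted >= seq_along(sorted))
}

h_index_sum(c(10, 8, 5, 4, 3))   # 4
h_index_sum(c(25, 8, 5, 3, 3))   # 3
h_index_sum(c(0, 0))             # 0
h_index_sum(numeric(0))          # 0
```

With this form, an all-zero or empty input yields 0 without an explicit check, because summing an empty or all-`FALSE` logical vector gives 0.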
