Python numpy percentile vs scipy percentileofscore

Question

I'm confused about what I'm doing wrong.

I have the following code:

import numpy as np
from scipy import stats

df
Out[29]: array([66., 69., 67., 75., 69., 69.])

val = 73.94
z1 = stats.percentileofscore(df, val)
print(z1)
Out[33]: 83.33333333333334

np.percentile(df, z1)
Out[34]: 69.999999999

I expected np.percentile(df, z1) to give me back val = 73.94.

Answer 1

use*_*203 5

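I think you're misunderstanding what percentileofscore and percentile actually do. They are not inverses of each other.

From the docs for scipy.stats.percentileofscore:

The percentile rank of a score relative to a list of scores.

A percentileofscore of, for example, 80% means that 80% of the scores in a are below the given score. In the case of gaps or ties, the exact definition depends on the optional keyword, kind.

So when you supply the value 73.94, 5 of the 6 elements of df fall below that score, and 5/6 gives you your 83.3333% result.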
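
The count behind that 5/6 can be reproduced by hand. The following is a minimal sketch that uses only NumPy and counts the elements strictly below val; it is not SciPy's implementation, and it agrees with percentileofscore here only because val does not tie with any element of df (with ties, the kind keyword changes the result).

```python
import numpy as np

df = np.array([66., 69., 67., 75., 69., 69.])
val = 73.94

# Count how many scores lie strictly below val.
below = int(np.sum(df < val))

# Express that count as a percentage of all scores.
pct_rank = below / df.size * 100

print(below, df.size, pct_rank)  # 5 of 6 elements, roughly 83.33 percent
```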

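Now, in the Notes for numpy.percentile:

Given a vector V of length N, the q-th percentile of V is the value q/100 of the way from the minimum to the maximum in a sorted copy of V.

The default interpolation parameter (renamed method in NumPy 1.22 and later) is 'linear', so:

'linear': i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.

Since you supplied roughly 83.33 (your z1) as the input parameter, you're looking at the value 83.33/100 of the way from the minimum to the maximum in your sorted array.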
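
To make "q/100 of the way" concrete, here is a small sketch that prints the fractional position in the sorted copy next to the value np.percentile returns, for a few values of q. It relies on the default linear interpolation; the chosen q values are only illustrative.

```python
import numpy as np

df = np.array([66., 69., 67., 75., 69., 69.])
ap = np.sort(df)  # sorted copy: 66, 67, 69, 69, 69, 75

for q in (0, 20, 50, 100 * 5 / 6, 100):
    # Fractional index into the sorted copy that q maps to.
    pos = q / 100 * (ap.size - 1)
    print(f"q={q:6.2f}  index={pos:.4f}  percentile={np.percentile(df, q):.4f}")
```

For q = 83.33 the position is about 4.17, which lies between the sorted elements 69 and 75, so the result is a blend of those two values rather than a number near 73.94.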

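If you're interested in digging through the source, it is available in the NumPy repository, but here is a simplified look at the calculation being done: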

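ap = np.asarray(sorted(df))   # sorted copy: 66, 67, 69, 69, 69, 75
Nx = df.shape[0]

# fractional index into the sorted array for the requested percentile
indices = z1 / 100 * (Nx - 1)
indices_below = np.floor(indices).astype(int)
indices_above = indices_below + 1

# linear-interpolation weights of the two neighboring elements
weight_above = indices - indices_below
weight_below = 1 - weight_above

x1 = ap[indices_below] * weight_below   # 57.50000000000004
x2 = ap[indices_above] * weight_above   # 12.499999999999956

x1 + x2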

70.0
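
For comparison, the linear-interpolation rule can be run backwards to find which q would make np.percentile return val. This is only a sketch under two assumptions: val lies at or above the minimum and strictly below the maximum of df, and the default linear interpolation is used. The names i, j and fraction follow the formula quoted above.

```python
import numpy as np

df = np.array([66., 69., 67., 75., 69., 69.])
val = 73.94

ap = np.sort(df)

# Assumes ap[0] <= val < ap[-1]; j is the first index whose element exceeds val.
j = int(np.searchsorted(ap, val, side="right"))
i = j - 1

# How far val sits between its two sorted neighbors.
fraction = (val - ap[i]) / (ap[j] - ap[i])

# Convert the fractional index back into a percentile.
q = (i + fraction) / (ap.size - 1) * 100

print(q)                     # about 96.47, not 83.33
print(np.percentile(df, q))  # recovers val up to floating-point error
```

So the q that maps back to 73.94 is about 96.47, which is a different quantity from the percentile rank of 83.33 that percentileofscore reports.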
