How to calculate the normal distribution percent point function in Python

SAR*_*ose 4 python math statistics numpy scipy

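How can I do the equivalent of scipy.stats.norm.ppf without using SciPy? I have erf built into Python's math module, but I can't seem to recreate the function.

PS: I can't just use SciPy, because Heroku doesn't allow you to install it, and using alternate buildpacks breaks the 300 MB maximum slug size limit.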

小智 7

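The ppf function is the inverse of y = (1 + erf(x/sqrt(2)))/2. So we need to solve this equation for x, given y between 0 and 1. Here is code that does this by the bisection method. I imported the SciPy function to show that the result is the same.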

from math import erf, sqrt
from scipy.stats import norm         # only for comparison
y = 0.123

z = 2*y-1                            # erf(x/sqrt(2)) must equal z
a = 0
while erf(a) > z or erf(a+1) < z:    # looking for initial bracket of size 1
    if erf(a) > z:
        a -= 1
    else:
        a += 1
b = a+1                              # found a bracket, proceed to refine it
while b-a > 1e-15:                   # 1e-15 ought to be enough precision
    c = (a+b)/2.0                    # bisection method
    if erf(c) > z:
        b = c
    else:
        a = c

print(sqrt(2)*(a+b)/2.0)             # this is the answer
print(norm.ppf(y))                   # SciPy for comparison

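Left for you to do:

  • Preliminary bounds checking (y must be between 0 and 1)
  • Scaling and shifting, if another mean or variance is needed; the code is for the standard normal distribution (mean 0, variance 1).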
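
The two items above can be sketched as a single function. This is only a minimal sketch built on the same bisection idea, not a definitive implementation: the name `normal_ppf` and the parameters `mean`, `std` and `tol` are made up for illustration and do not come from the answer. Accuracy is likely to degrade in the extreme tails, where `erf` saturates in double precision.

```python
from math import erf, sqrt


def normal_ppf(p, mean=0.0, std=1.0, tol=1e-15):
    """Inverse normal CDF by bisection on erf (hypothetical helper)."""
    # Bounds check: the quantile is only finite for 0 < p < 1.
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    if std <= 0.0:
        raise ValueError("std must be positive")

    z = 2.0 * p - 1.0                # target value of erf(x / sqrt(2))
    a = 0.0
    # Move a unit-width bracket [a, a + 1] until erf(a) <= z <= erf(a + 1).
    while erf(a) > z or erf(a + 1.0) < z:
        if erf(a) > z:
            a -= 1.0
        else:
            a += 1.0
    b = a + 1.0

    # Refine the bracket by bisection.
    while b - a > tol:
        c = (a + b) / 2.0
        if c == a or c == b:         # no representable midpoint left
            break
        if erf(c) > z:
            b = c
        else:
            a = c

    x = sqrt(2.0) * (a + b) / 2.0    # standard normal quantile
    return mean + std * x            # shift and scale


print(normal_ppf(0.975))             # approximately 1.96
print(normal_ppf(0.5, mean=10.0, std=2.0))
```

Note that the scaling uses the standard deviation, that is, the square root of the variance.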


K. *_*uhr 6

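There isn't a simple way to use erf to implement norm.ppf, because norm.ppf is related to the inverse of erf. Instead, here is a pure Python implementation of the code from SciPy. You should find that the function ndtri returns exactly the same values as norm.ppf: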

import math

s2pi = 2.50662827463100050242E0

P0 = [
    -5.99633501014107895267E1,
    9.80010754185999661536E1,
    -5.66762857469070293439E1,
    1.39312609387279679503E1,
    -1.23916583867381258016E0,
]

Q0 = [
    1,
    1.95448858338141759834E0,
    4.67627912898881538453E0,
    8.63602421390890590575E1,
    -2.25462687854119370527E2,
    2.00260212380060660359E2,
    -8.20372256168333339912E1,
    1.59056225126211695515E1,
    -1.18331621121330003142E0,
]

P1 = [
    4.05544892305962419923E0,
    3.15251094599893866154E1,
    5.71628192246421288162E1,
    4.40805073893200834700E1,
    1.46849561928858024014E1,
    2.18663306850790267539E0,
    -1.40256079171354495875E-1,
    -3.50424626827848203418E-2,
    -8.57456785154685413611E-4,
]

Q1 = [
    1,
    1.57799883256466749731E1,
    4.53907635128879210584E1,
    4.13172038254672030440E1,
    1.50425385692907503408E1,
    2.50464946208309415979E0,
    -1.42182922854787788574E-1,
    -3.80806407691578277194E-2,
    -9.33259480895457427372E-4,
]

P2 = [
    3.23774891776946035970E0,
    6.91522889068984211695E0,
    3.93881025292474443415E0,
    1.33303460815807542389E0,
    2.01485389549179081538E-1,
    1.23716634817820021358E-2,
    3.01581553508235416007E-4,
    2.65806974686737550832E-6,
    6.23974539184983293730E-9,
]

Q2 = [
    1,
    6.02427039364742014255E0,
    3.67983563856160859403E0,
    1.37702099489081330271E0,
    2.16236993594496635890E-1,
    1.34204006088543189037E-2,
    3.28014464682127739104E-4,
    2.89247864745380683936E-6,
    6.79019408009981274425E-9,
]

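def ndtri(y0):
    """Inverse of the standard normal CDF for 0 < y0 < 1."""
    if y0 <= 0 or y0 >= 1:
        raise ValueError("ndtri(x) needs 0 < x < 1")
    negate = True
    y = y0
    # 0.13533528323661269189 is exp(-2); fold the upper tail onto the lower one
    if y > 1.0 - 0.13533528323661269189:
        y = 1.0 - y
        negate = False

    # Central region: rational approximation in (y - 0.5)
    if y > 0.13533528323661269189:
        y = y - 0.5
        y2 = y * y
        x = y + y * (y2 * polevl(y2, P0) / polevl(y2, Q0))
        x = x * s2pi
        return x

    # Tails: work with x = sqrt(-2 * log(y)) and z = 1 / x
    x = math.sqrt(-2.0 * math.log(y))
    x0 = x - math.log(x) / x

    z = 1.0 / x
    if x < 8.0:
        x1 = z * polevl(z, P1) / polevl(z, Q1)
    else:
        x1 = z * polevl(z, P2) / polevl(z, Q2)
    x = x0 - x1
    if negate:
        x = -x
    return x

def polevl(x, coef):
    """Evaluate a polynomial by Horner's rule; coef[0] is the highest-order coefficient."""
    accum = 0
    for c in coef:
        accum = x * accum + c
    return accum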
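
Where SciPy is not available for comparison, one way to sanity-check a function like `ndtri` is a round trip: feed its output back through the forward CDF, which `math.erf` provides, and see how close the result is to the original probability. The sketch below is only an illustration. The helper names `normal_cdf` and `max_roundtrip_error` are made up, and `statistics.NormalDist().inv_cdf` (standard library, Python 3.8+) is used purely as a stand-in for the function under test; it is not part of the answer above. A small round-trip error suggests the inverse is right, but it does not prove bit-for-bit agreement with `norm.ppf`.

```python
from math import erf, sqrt
from statistics import NormalDist    # stand-in only; requires Python 3.8+


def normal_cdf(x):
    """Standard normal CDF expressed through erf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def max_roundtrip_error(ppf, probabilities):
    """Largest |cdf(ppf(p)) - p| over the given probabilities."""
    return max(abs(normal_cdf(ppf(p)) - p) for p in probabilities)


probs = [0.001, 0.01, 0.123, 0.5, 0.9, 0.999]
# Replace NormalDist().inv_cdf with ndtri to check the implementation above.
err = max_roundtrip_error(NormalDist().inv_cdf, probs)
print(err)
```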

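  • They (the coefficient tables) are precomputed values used to get the most accurate result with the least computation. You can find the original code used by SciPy [here](https://raw.githubusercontent.com/scipy/scipy/2526df72e5d4ca8bad6e2f4b3cbdfbc33e805865/scipy/special/cephes/ndtri.c); it contains comments on what some of them mean, but you aren't really supposed to think about what they mean. (2 upvotes)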