numpy中的重要数字四舍五入

dmo*_*mon 20 python numpy

我试过搜索这个并没有找到满意的答案.

我想取一个列表/数组并将它们全部舍入到n个有效数字.我写了一个函数来做这个,但我想知道是否有一个标准的方法呢?我已经搜索但找不到它.例:

In:  [  0.0, -1.2366e22, 1.2544444e-15, 0.001222 ], n=2
Out: [ 0.00, -1.24e22,        1.25e-15,  1.22e-3 ]
Run Code Online (Sandbox Code Playgroud)

谢谢

Sea*_*ake 12

首先是批评:你在计算重要数字的数量是错误的.在您的示例中,您希望n = 3,而不是2.

如果使用使该算法的二进制版本简单的函数:frexp,则可以通过让numpy库函数处理它们来绕过大多数边缘情况.作为奖励,该算法也将运行得更快,因为它从不调用日志功能.

#The following constant was computed in maxima 5.35.1 using 64 bigfloat digits of precision
__logBase10of2 = 3.010299956639811952137388947244930267681898814621085413104274611e-1

import numpy as np

def RoundToSigFigs_fp( x, sigfigs ):
    """
    Rounds the value(s) in x to the number of significant figures in sigfigs.
    Return value has the same type as x.

    Restrictions:
    sigfigs must be an integer type and store a positive value.
    x must be a real value.
    """
    if not ( type(sigfigs) is int or type(sigfigs) is long or
             isinstance(sigfigs, np.integer) ):
        raise TypeError( "RoundToSigFigs_fp: sigfigs must be an integer." )

    if sigfigs <= 0:
        raise ValueError( "RoundToSigFigs_fp: sigfigs must be positive." )

    if not np.isreal( x ):
        raise TypeError( "RoundToSigFigs_fp: x must be real." )

    xsgn = np.sign(x)
    absx = xsgn * x
    mantissa, binaryExponent = np.frexp( absx )

    decimalExponent = __logBase10of2 * binaryExponent
    omag = np.floor(decimalExponent)

    mantissa *= 10.0**(decimalExponent - omag)

    if mantissa < 1.0:
        mantissa *= 10.0
        omag -= 1.0

    return xsgn * np.around( mantissa, decimals=sigfigs - 1 ) * 10.0**omag
Run Code Online (Sandbox Code Playgroud)

它可以正确处理所有情况,包括无限,nan,0.0和次正规数:

>>> eglist = [  0.0, -1.2366e22, 1.2544444e-15, 0.001222, 0.0, 
...        float("nan"), float("inf"), float.fromhex("0x4.23p-1028"), 
...        0.5555, 1.5444, 1.72340, 1.256e-15, 10.555555  ]
>>> eglist
[0.0, -1.2366e+22, 1.2544444e-15, 0.001222, 0.0, 
nan, inf, 1.438203867284623e-309, 
0.5555, 1.5444, 1.7234, 1.256e-15, 10.555555]
>>> RoundToSigFigs(eglist, 3)
array([  0.00000000e+000,  -1.24000000e+022,   1.25000000e-015,
         1.22000000e-003,   0.00000000e+000,               nan,
                     inf,   1.44000000e-309,   5.56000000e-001,
         1.54000000e+000,   1.72000000e+000,   1.26000000e-015,
         1.06000000e+001])
>>> RoundToSigFigs(eglist, 1)
array([  0.00000000e+000,  -1.00000000e+022,   1.00000000e-015,
         1.00000000e-003,   0.00000000e+000,               nan,
                     inf,   1.00000000e-309,   6.00000000e-001,
         2.00000000e+000,   2.00000000e+000,   1.00000000e-015,
         1.00000000e+001])
Run Code Online (Sandbox Code Playgroud)

编辑:2016/10/12我发现原始代码处理错误的边缘情况.我已经在GitHub存储库中放置了更完整的代码版本.

编辑:2019/03/01替换为重新编码的版本.

  • 常量`__logBase10of2`的精度远远高于float,而python解释器将截断大部分精度. (2认同)
  • 确实如此。问题是浮点精度可能会在某个时刻发生变化,这提供了少量的未来证明。更好的是使用 Python 的十进制包及其可调精度,在转换为浮点之前计算它。 (2认同)

Sco*_*nte 10

测试所有已经提出的解决方案,我发现它们要么

  1. 与字符串相互转换,效率低下
  2. 无法处理负数
  3. 无法处理数组
  4. 有一些数字错误。

这是我尝试解决所有这些问题的解决方案。(编辑 2020-03-18:np.asarray按照 A. West 的建议添加。)

def signif(x, p):
    x = np.asarray(x)
    x_positive = np.where(np.isfinite(x) & (x != 0), np.abs(x), 10**(p-1))
    mags = 10 ** (p - 1 - np.floor(np.log10(x_positive)))
    return np.round(x * mags) / mags
Run Code Online (Sandbox Code Playgroud)

测试:

def scottgigante(x, p):
    x_positive = np.where(np.isfinite(x) & (x != 0), np.abs(x), 10**(p-1))
    mags = 10 ** (p - 1 - np.floor(np.log10(x_positive)))
    return np.round(x * mags) / mags

def awest(x,p):
    return float(f'%.{p-1}e'%x)

def denizb(x,p):
    return float(('%.' + str(p-1) + 'e') % x)

def autumn(x, p):
    return np.format_float_positional(x, precision=p, unique=False, fractional=False, trim='k')

def greg(x, p):
    return round(x, -int(np.floor(np.sign(x) * np.log10(abs(x)))) + p-1)

def user11336338(x, p):         
    xr = (np.floor(np.log10(np.abs(x)))).astype(int)
    xr=10.**xr*np.around(x/10.**xr,p-1)   
    return xr

def dmon(x, p):
    if np.all(np.isfinite(x)):
        eset = np.seterr(all='ignore')
        mags = 10.0**np.floor(np.log10(np.abs(x)))  # omag's
        x = np.around(x/mags,p-1)*mags             # round(val/omag)*omag
        np.seterr(**eset)
        x = np.where(np.isnan(x), 0.0, x)           # 0.0 -> nan -> 0.0
    return x

def seanlake(x, p):
    __logBase10of2 = 3.010299956639811952137388947244930267681898814621085413104274611e-1
    xsgn = np.sign(x)
    absx = xsgn * x
    mantissa, binaryExponent = np.frexp( absx )

    decimalExponent = __logBase10of2 * binaryExponent
    omag = np.floor(decimalExponent)

    mantissa *= 10.0**(decimalExponent - omag)

    if mantissa < 1.0:
        mantissa *= 10.0
        omag -= 1.0

    return xsgn * np.around( mantissa, decimals=p - 1 ) * 10.0**omag

solns = [scottgigante, awest, denizb, autumn, greg, user11336338, dmon, seanlake]

xs = [
    1.114, # positive, round down
    1.115, # positive, round up
    -1.114, # negative
    1.114e-30, # extremely small
    1.114e30, # extremely large
    0, # zero
    float('inf'), # infinite
    [1.114, 1.115e-30], # array input
]
p = 3

print('input:', xs)
for soln in solns:
    print(f'{soln.__name__}', end=': ')
    for x in xs:
        try:
            print(soln(x, p), end=', ')
        except Exception as e:
            print(type(e).__name__, end=', ')
    print()
Run Code Online (Sandbox Code Playgroud)

结果:

input: [1.114, 1.115, -1.114, 1.114e-30, 1.114e+30, 0, inf, [1.114, 1.115e-30]]
scottgigante: 1.11, 1.12, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, [1.11e+00 1.12e-30], 
awest: 1.11, 1.11, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, TypeError, 
denizb: 1.11, 1.11, -1.11, 1.11e-30, 1.11e+30, 0.0, inf, TypeError, 
autumn: 1.11, 1.11, -1.11, 0.00000000000000000000000000000111, 1110000000000000000000000000000., 0.00, inf, TypeError, 
greg: 1.11, 1.11, -1.114, 1.11e-30, 1.11e+30, ValueError, OverflowError, TypeError, 
user11336338: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, nan, nan, [1.11e+00 1.12e-30], 
dmon: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, 0.0, inf, [1.11e+00 1.12e-30], 
seanlake: 1.11, 1.12, -1.11, 1.1100000000000002e-30, 1.1100000000000001e+30, 0.0, inf, ValueError, 
Run Code Online (Sandbox Code Playgroud)

定时:

def test_soln(soln):
    try:
        soln(np.linspace(1, 100, 1000), 3)
    except Exception:
        [soln(x, 3) for x in np.linspace(1, 100, 1000)]

for soln in solns:
    print(soln.__name__)
    %timeit test_soln(soln)
Run Code Online (Sandbox Code Playgroud)

结果:

scottgigante
135 µs ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
awest
2.23 ms ± 430 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
denizb
2.18 ms ± 352 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
autumn
2.92 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
greg
14.1 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
user11336338
157 µs ± 50.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
dmon
142 µs ± 8.52 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
seanlake
20.7 ms ± 994 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Run Code Online (Sandbox Code Playgroud)

  • 喜欢这个!但是,您的方法仅适用于查询中的 numpy 数组:“x!=0”。也许添加 `if not isinstance(x, np.ndarray): x = np.array(x)` (2认同)

And*_*ker 5

numpy.set_printoptions你在找什么?

import numpy as np
np.set_printoptions(precision=2)
print np.array([  0.0, -1.2366e22, 1.2544444e-15, 0.001222 ])
Run Code Online (Sandbox Code Playgroud)

得到:

[  0.00e+00  -1.24e+22   1.25e-15   1.22e-03]
Run Code Online (Sandbox Code Playgroud)

编辑:

如果您尝试转换数据,numpy.around似乎可以解决此问题的各个方面.但是,在指数为负数的情况下,它不会执行您想要的操作.

  • @Simon.这是一个简单的简单问题."有没有一种标准方法可以将m个数字舍入为n个有效数字?" 一个完美的答案是"没有没有"!! 我真的不知道你想要做什么 (2认同)

Aut*_*umn 5

这里给出的大多数解决方案要么(a)没有给出正确的有效数字,要么(b)不必要地复杂。

如果您的目标是显示格式,则numpy.format_float_positional直接支持所需的行为。以下片段返回x格式化为 4 位有效数字的浮点数,并取消了科学记数法。

import numpy as np
x=12345.6
np.format_float_positional(x, precision=4, unique=False, fractional=False, trim='k')
> 12340.
Run Code Online (Sandbox Code Playgroud)

  • 完美的答案:) (2认同)