Precision of a numpy array is lost after tolist()

Question

I have a numpy array in which every number has been given a specific precision (using around(x, 1)).

[[     3.   15294.7  32977.7   4419.5    978.4    504.4    123.6]
 [     4.   14173.8  31487.2   3853.9    967.8    410.2    107.1]
 [     5.   15323.5  34754.5   3738.7   1034.7    376.1    105.5]
 [     6.   17396.7  41164.5   3787.4   1103.2    363.9    109.4]
 [     7.   19665.5  48967.6   3900.9   1161.     362.1    115.8]
 [     8.   21839.8  56922.5   4037.4   1208.2    365.9    123.5]
 [     9.   23840.6  64573.8   4178.1   1247.     373.2    131.9]
 [    10.   25659.9  71800.2   4314.8   1279.5    382.7    140.5]
 [    11.   27310.3  78577.7   4444.3   1307.1    393.7    149.1]
 [    12.   28809.1  84910.4   4565.8   1331.     405.5    157.4]]

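I am trying to convert each number to a string so that I can write them into a Word table using python-docx. But the result of tolist() is a mess: the precision of the numbers is lost, which makes the output very long.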

[['3.0',
  '15294.7001953',
  '32977.6992188',
  '4419.5',
  '978.400024414',
  '504.399993896',
  '123.599998474'],
 ['4.0',
  '14173.7998047',
  '31487.1992188',
  '3853.89990234',
  '967.799987793',
  '410.200012207',
  '107.099998474'],
.......

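Besides tolist(), I also tried [[str(e) for e in a] for a in m]. The result is the same, which is very annoying. How can I easily convert the numbers to strings while keeping the precision? Thanks!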

Answer 1

ev-*_*-br 5

Something is going wrong in the conversion to strings. With plain numbers:

>>> import numpy as np
>>> a = np.random.random(10)*30
>>> a
array([ 27.30713434,  10.25895255,  19.65843272,  23.93161555,
        29.08479175,  25.69713898,  11.90236158,   5.41050686,
        18.16481691,  14.12808414])
>>> 
>>> b = np.round(a, decimals=1)
>>> b
array([ 27.3,  10.3,  19.7,  23.9,  29.1,  25.7,  11.9,   5.4,  18.2,  14.1])
>>> b.tolist()
[27.3, 10.3, 19.7, 23.9, 29.1, 25.7, 11.9, 5.4, 18.2, 14.1]

Note that np.round does not work in place; a is unchanged:

>>> a
array([ 27.30713434,  10.25895255,  19.65843272,  23.93161555,
        29.08479175,  25.69713898,  11.90236158,   5.41050686,
        18.16481691,  14.12808414])

If you only need to convert the numbers to strings:

>>> " ".join(str(_) for _ in np.round(a, 1)) 
'27.3 10.3 19.7 23.9 29.1 25.7 11.9 5.4 18.2 14.1'

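EDIT: Apparently np.round does not play well with float32 (the other answers give the reason for this). A simple workaround is to explicitly cast the array to np.float64 or just float (np.float was an alias for the builtin float and has been removed as of NumPy 1.24):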

>>> # prepare an array of float32 values
>>> a32  = (np.random.random(10) * 30).astype(np.float32)
>>> a32.dtype
dtype('float32')
>>> 
>>> # notice the use of .astype(np.float64)
>>> np.round(a32.astype(np.float64), 1)
array([  5.5,   8.2,  29.8,   8.6,  15.5,  28.3,   2. ,  24.5,  18.4,   8.3])
>>>

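EDIT2: As Warren shows in his answer, string formatting actually rounds correctly (try "%.1f" % (4.79,)), so there is no need to cast between float types. I will mostly leave my answer as a reminder that np.around is not the right choice in this situation.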
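For a 2-D array like the one in the question, the string-formatting approach could be applied with a nested comprehension. This is only a minimal sketch: the array m and its values are a small sample copied from the question, and the variable names are illustrative.

```python
import numpy as np

# A small float32 sample taken from the first rows of the question's array.
m = np.array([[3.0, 15294.7, 32977.7],
              [4.0, 14173.8, 31487.2]], dtype=np.float32)

# "%.1f" rounds to one decimal place while formatting,
# so no prior call to np.around is needed.
rows = [["%.1f" % value for value in row] for row in m]

print(rows)
# [['3.0', '15294.7', '32977.7'], ['4.0', '14173.8', '31487.2']]
```

The result is a plain list of lists of strings, which should be straightforward to write cell by cell into a table.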

Answer 2

War*_*ser 5

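The precision isn't "lost"; you never had that precision in the first place. The value 15294.7 cannot be represented exactly in single precision (i.e. np.float32); the best approximation is 15294.70019...: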

In [1]: x = np.array([15294.7], dtype=np.float32)

In [2]: x
Out[2]: array([ 15294.70019531], dtype=float32)

See http://floating-point-gui.de/

Using np.float64 gives a better approximation, but it still cannot represent 15294.7 exactly.
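One way to see the exact stored values is to convert them to decimal.Decimal, which prints the binary value without any rounding. The sketch below is illustrative only; the variable names are made up for this example. It also shows why tolist() exposes the long digits: it converts each float32 element to a Python float (a double), which carries the float32 approximation along unchanged.

```python
from decimal import Decimal

import numpy as np

single = np.float32(15294.7)
double = np.float64(15294.7)

# Decimal(float) shows the exact binary value, with no rounding applied.
exact_single = Decimal(float(single))
exact_double = Decimal(float(double))
print(exact_single)  # 15294.7001953125
print(exact_double)  # closer to 15294.7, but still not exactly 15294.7

# The double-precision error is smaller, yet not zero.
target = Decimal("15294.7")
print(abs(exact_double - target) < abs(exact_single - target))  # True

# tolist() turns float32 elements into Python floats (doubles),
# so the float32 approximation becomes visible when printed.
as_list = np.array([15294.7], dtype=np.float32).tolist()
print(as_list)  # [15294.7001953125]
```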

If you want the text output formatted with a single decimal digit, use a function designed for formatted text output, such as np.savetxt:

In [56]: x = np.array([[15294.7, 32977.7],[14173.8, 31487.2]], dtype=np.float32) 

In [57]: x
Out[57]: 
array([[ 15294.70019531,  32977.69921875],
       [ 14173.79980469,  31487.19921875]], dtype=float32)

In [58]: np.savetxt("data.txt", x, fmt="%.1f", delimiter=",")

In [59]: !cat data.txt
15294.7,32977.7
14173.8,31487.2
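If the formatted text is wanted in memory rather than in a file, np.savetxt can also write to a file-like object. The following is a minimal sketch that assumes an io.StringIO buffer; the buffer and the variable names are illustrative and not part of the original answer.

```python
import io

import numpy as np

x = np.array([[15294.7, 32977.7],
              [14173.8, 31487.2]], dtype=np.float32)

# Write the formatted rows into an in-memory text buffer instead of a file.
buffer = io.StringIO()
np.savetxt(buffer, x, fmt="%.1f", delimiter=",")

# Split the text back into rows of strings.
text = buffer.getvalue()
rows = [line.split(",") for line in text.splitlines()]
print(rows)
# [['15294.7', '32977.7'], ['14173.8', '31487.2']]
```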

If you really need a numpy array of nicely formatted strings, you could do something like this:

In [63]: def myfmt(r):
   ....:     return "%.1f" % (r,)
   ....: 

In [64]: vecfmt = np.vectorize(myfmt)

In [65]: vecfmt(x)
Out[65]: 
array([['15294.7', '32977.7'],
       ['14173.8', '31487.2']], 
      dtype='|S64')

If you use either of these methods, there is no need to pass the data through around first; the rounding happens as part of the formatting.
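As a possible alternative to np.vectorize, NumPy's np.char.mod applies %-style formatting element-wise. This is a swapped-in technique, not the method used in the answer above, and the sketch assumes a NumPy version in which np.char.mod is available.

```python
import numpy as np

x = np.array([[15294.7, 32977.7],
              [14173.8, 31487.2]], dtype=np.float32)

# Apply "%.1f" % value to every element; the result is an array of strings.
formatted = np.char.mod("%.1f", x)

# Convert to nested Python lists of str, e.g. for filling table cells.
cells = formatted.tolist()
print(cells)
# [['15294.7', '32977.7'], ['14173.8', '31487.2']]
```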
