lap*_*ita 2 python numpy scientific-computing pandas data-science
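I'm working on a data-analysis project in which the values being processed are very large. I originally did everything in pure Python, but I'm now trying to do it with numpy and pandas. However, I seem to have hit a wall, because numpy cannot handle integers larger than 64 bits (if I use Python integers in numpy, their maximum value is 9223372036854775807). Should I drop numpy and pandas entirely, or is there a way to use them with Python-style arbitrarily large integers? I'm fine with taking a performance hit.
By default, numpy stores elements in a numeric dtype. But you can force the dtype to object, like this: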
import numpy as np
x = np.array([10,20,30,40], dtype=object)
x_exp2 = 1000**x
print(x_exp2)
The output is:
[1000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
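For contrast, here is a minimal sketch of the same kind of arithmetic on a fixed-width dtype next to an object array. The variable names are illustrative only and are not part of the original answer; the sketch relies on int64 array arithmetic wrapping around on overflow, while object arrays hand each operation to Python's own `int`.

```python
import numpy as np

# The same starting value stored two ways: fixed-width int64 vs. Python objects.
fixed = np.array([2**62], dtype=np.int64)
boxed = np.array([2**62], dtype=object)

# 2**62 * 4 == 2**64, which does not fit in 64 bits.
wrapped = fixed * 4   # int64 array arithmetic wraps around modulo 2**64
exact = boxed * 4     # object arrays delegate to Python's arbitrary-precision int

print(wrapped[0])      # 0
print(exact[0])        # 18446744073709551616
print(type(exact[0]))  # <class 'int'>
```

The elements of the object array are ordinary Python `int` objects, which is why no overflow occurs.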
The downside is that execution is much slower.
Edited later to show that np.sum() works. There may of course be some limitations.
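To get a rough feel for the slowdown, one could time the same reduction on both dtypes. This is only a sketch: the array size and repeat count are arbitrary choices, and the measured ratio will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for illustration
fixed = np.arange(n, dtype=np.int64)   # native 64-bit integers
boxed = np.arange(n).astype(object)    # the same values as Python int objects

# Time 20 repetitions of the same sum on each array.
t_fixed = timeit.timeit(lambda: fixed.sum(), number=20)
t_boxed = timeit.timeit(lambda: boxed.sum(), number=20)

print(f"int64 : {t_fixed:.4f} s")
print(f"object: {t_boxed:.4f} s")
```

The object version has to call into the Python interpreter for every element, so it cannot use NumPy's vectorized native loops.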
import numpy as np
x = np.array([10,20,30,40], dtype=object)
x_exp2 = 1000**x
print(x_exp2)
print(np.sum(x_exp2))
print(np.prod(x_exp2))
The output is:
[1000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
1000000000000000000000000000001000000000000000000000000000001000000000000000000000000000001000000000000000000000000000000
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
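As one illustration of the kind of limitation hinted at above, functions such as `np.sqrt` generally do not work on object arrays of Python integers, because `int` has no `sqrt` method for NumPy to call. The sketch below assumes Python 3.8+ for `math.isqrt`; the element-wise workaround is just one possible approach, not something the original answer prescribes.

```python
import math
import numpy as np

x = np.array([10**30, 10**60], dtype=object)

# np.sqrt looks for a .sqrt() method on each element; plain ints have none.
try:
    np.sqrt(x)
    sqrt_worked = True
except (TypeError, AttributeError):
    sqrt_worked = False
print(sqrt_worked)

# Possible workaround: apply a pure-Python function element by element.
roots = np.array([math.isqrt(v) for v in x], dtype=object)
print(roots)
```

Basic arithmetic and reductions such as `np.sum` and `np.prod` work because they map onto operators that Python integers already support.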