I have a numpy script that spends about 50% of its runtime in the following code:
s = numpy.dot(v1, v1)
where
v1 = v[1:]
and v is a 4000-element 1D ndarray of float64 stored in contiguous memory (v.strides is (8,)).
Any suggestions for speeding this up?
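For reference, here is a minimal sketch of how the setup described above could be reproduced and timed in isolation. It assumes random data in place of the real contents of v; the helper name dot_self and the call count are made up for illustration, and the timing will depend on the machine and the BLAS that numpy is linked against.

```python
import timeit

import numpy

# Assumption: random values stand in for the real data in v.
v = numpy.random.rand(4000)   # 4000-element 1D float64 array
v1 = v[1:]                    # view on v, skipping the first element


def dot_self(x):
    """Dot product of a vector with itself (hypothetical helper)."""
    return numpy.dot(x, x)


s = dot_self(v1)

# Time repeated calls; n_calls is an arbitrary choice for this sketch.
n_calls = 1000
elapsed = timeit.timeit(lambda: dot_self(v1), number=n_calls)

print("v.strides:", v.strides)
print("v1 is C-contiguous:", v1.flags["C_CONTIGUOUS"])
print("numpy.dot: %.2f microseconds per call" % (elapsed / n_calls * 1e6))
```

Timing the call on its own like this helps separate the cost of numpy.dot itself from anything else the surrounding loop does.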
Edit: this is on Intel hardware. Here is the output of my numpy.show_config():
atlas_threads_info:
libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
library_dirs = ['/usr/local/atlas-3.9.16/lib']
language = f77
include_dirs = ['/usr/local/atlas-3.9.16/include']
blas_opt_info:
libraries = ['ptf77blas', 'ptcblas', 'atlas']
library_dirs = ['/usr/local/atlas-3.9.16/lib']
define_macros = [('ATLAS_INFO', '"\\"3.9.16\\""')]
language = c
include_dirs = ['/usr/local/atlas-3.9.16/include']
atlas_blas_threads_info:
libraries = ['ptf77blas', 'ptcblas', 'atlas']
library_dirs = ['/usr/local/atlas-3.9.16/lib']
language = c
include_dirs = ['/usr/local/atlas-3.9.16/include']
lapack_opt_info:
libraries = …