I have a set of n vectors stored in a 3 x n matrix z. I compute their outer products with np.einsum. When I time it with %timeit:
%timeit v=np.einsum('i...,j...->ij...',z,z)
I get this result:
The slowest run took 7.23 times longer than the fastest. This could mean that an
intermediate result is being cached
100000 loops, best of 3: 2.9 µs per loop
What is happening here, and can it be avoided? The best of 3 is 2.9 µs, but the slowest run may be more typical.
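To look at the spread between runs rather than only the best one, the same call can be timed outside IPython with the standard-library timeit module. The sketch below is only an illustration: the value of n, the random data, and the helper name outer_products are assumptions, not part of the original question.

```python
import timeit

import numpy as np

n = 100  # assumed number of vectors; the question does not state n
rng = np.random.default_rng(0)
z = rng.standard_normal((3, n))  # 3 x n matrix, one vector per column


def outer_products(z):
    """Return a (3, 3, n) array whose [:, :, k] slice is the outer product of column k with itself."""
    return np.einsum('i...,j...->ij...', z, z)


v = outer_products(z)

# Repeat the measurement several times so the fastest and slowest runs can be compared.
number = 1000
times = timeit.repeat(lambda: outer_products(z), repeat=7, number=number)
per_loop_us = [t / number * 1e6 for t in times]

print("fastest: %.2f µs per loop" % min(per_loop_us))
print("slowest: %.2f µs per loop" % max(per_loop_us))
```

Comparing the fastest and slowest entries of per_loop_us shows how much the runs vary on a given machine; the numbers will differ from those quoted in the question.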
I have a 2D NumPy array that I want to put into a pandas Series (not a DataFrame):
>>> import pandas as pd
>>> import numpy as np
>>> a = np.zeros((5, 2))
>>> a
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
But this raises an error:
>>> s = pd.Series(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 227, in __init__
raise_cast_failure=True)
File "/miniconda/envs/pyspark/lib/python3.4/site-packages/pandas/core/series.py", line 2920, in _sanitize_array
raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional
One possible hack:
>>> s = pd.Series(map(lambda x:[x], a)).apply(lambda …
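For comparison, here is a minimal sketch of one other way to get a Series with one row of the array per element. It is not a reconstruction of the truncated hack above; it assumes that storing each row as a plain Python list in an object-dtype Series is acceptable.

```python
import numpy as np
import pandas as pd

a = np.zeros((5, 2))

# tolist() turns the 2D array into a list of 5 row lists, which is a
# 1-dimensional sequence from the point of view of pd.Series.
s = pd.Series(a.tolist())

print(s)
```

Each element of s is then a list such as [0.0, 0.0], so vectorized numeric operations on the Series are no longer available; that is the trade-off of this approach.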