I have data stored in two DataFrames, x and y. The inner product from NumPy works, but the dot product from pandas does not.
In [63]: x.shape
Out[63]: (1062, 36)
In [64]: y.shape
Out[64]: (36, 36)
In [65]: np.inner(x, y).shape
Out[65]: (1062L, 36L)
In [66]: x.dot(y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-66-76c015be254b> in <module>()
----> 1 x.dot(y)
C:\Programs\WinPython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
888 if (len(common) > len(self.columns) or
889 len(common) > len(other.index)):
--> 890 raise ValueError('matrices are not aligned')
891
892 left = self.reindex(columns=common, copy=False)
ValueError: matrices are not aligned
Is this a bug, or am I using pandas incorrectly?
unu*_*tbu 34
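Not only must the shapes of x and y be compatible, but the column names of x must also match the index names of y. Otherwise this code in pandas/core/frame.py raises a ValueError: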
if isinstance(other, (Series, DataFrame)):
    common = self.columns.union(other.index)
    if (len(common) > len(self.columns) or
            len(common) > len(other.index)):
        raise ValueError('matrices are not aligned')
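The effect of that check can be sketched in isolation. The helper name `labels_aligned` below is hypothetical (it is not part of pandas); it only mirrors the union-length comparison quoted above, assuming both sets of labels are plain `pd.Index` objects:

```python
import pandas as pd

def labels_aligned(columns, index):
    """Hypothetical helper mirroring the union-length check quoted above."""
    common = columns.union(index)
    # If the union is longer than either side, at least one label
    # exists on one side only, so the labels cannot be aligned.
    return not (len(common) > len(columns) or len(common) > len(index))

# Same labels in a different order: the union has the same length as both.
matching = labels_aligned(pd.Index(['a', 'b', 'c']), pd.Index(['c', 'a', 'b']))

# Disjoint labels: the union is twice as long, so the check fails.
mismatched = labels_aligned(pd.Index(['a', 'b', 'c']), pd.Index(['d', 'e', 'f']))

print(matching, mismatched)
```

Order does not matter for this check, only whether every label appears on both sides.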
If you just want to compute the matrix product without making the column names of x match the index names of y, use the NumPy dot function:
np.dot(x, y)
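The column names of x must match the index names of y because the pandas dot method reindexes both x and y. If the column order of x and the index order of y do not already agree, they are made to agree before the matrix product is computed: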
left = self.reindex(columns=common, copy=False)
right = other.reindex(index=common, copy=False)
The NumPy dot function does no such alignment. It simply computes the matrix product from the values in the underlying arrays.
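As a small illustration of that difference, here is a sketch with made-up data (the labels `'a'`, `'b'` and `'out'` are chosen only for this example). The index of y lists the same labels as the columns of x, but in the opposite order:

```python
import numpy as np
import pandas as pd

x = pd.DataFrame([[1, 2]], columns=['a', 'b'])
y = pd.DataFrame([[10], [100]], index=['b', 'a'], columns=['out'])

# pandas pairs values by label: a*a + b*b = 1*100 + 2*10
aligned = x.dot(y)

# NumPy pairs values by position: 1*10 + 2*100
positional = np.dot(x.values, y.values)

print(aligned.iloc[0, 0])   # 120
print(positional[0, 0])     # 210
```

The two results differ, so which function is appropriate depends on whether the labels carry meaning or the positions do.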
Here is an example that reproduces the error:
import pandas as pd
import numpy as np
columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))
print(np.dot(x, y).shape)
# (1062, 36)
print(x.dot(y).shape)
# ValueError: matrices are not aligned
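If a labeled result is wanted rather than a bare array, two possible workarounds can be sketched for the same data. Both assume the rows of y are already in the same positional order as the columns of x; the variable names `y_labeled`, `by_label` and `by_position` are only illustrative:

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))

# Option 1: give y the same index labels as x's columns, then use pandas dot.
y_labeled = y.copy()
y_labeled.index = x.columns
by_label = x.dot(y_labeled)

# Option 2: pass the raw array, so pandas has no labels on y to align.
by_position = x.dot(y.values)

print(by_label.shape, by_position.shape)
print(np.allclose(by_label.values, by_position.values))
print(np.allclose(by_label.values, np.dot(x, y)))
```

Both options keep the index of x on the result, which `np.dot(x, y)` does not.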