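Suppose I have two DataFrames with a MultiIndex each, where one index is deeper than the other. I want to select only those rows from the deeper DataFrame whose partial index is contained in the other DataFrame's index.
Example input: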
import pandas

df = pandas.DataFrame(
    {
        "A": ["a1", "a1", "a1", "a2", "a2", "a2"],
        "B": ["b1", "b1", "b2", "b1", "b2", "b2"],
        "C": ["c1", "c2", "c1", "c1", "c1", "c2"],
        "V": [1, 2, 3, 4, 5, 6],
    }
).set_index(["A", "B", "C"])
df2 = pandas.DataFrame(
    {
        "A": ["a1", "a1", "a2", "a2"],
        "B": ["b1", "b3", "b1", "b3"],
        "X": [1, 2, 3, 4],
    }
).set_index(["A", "B"])
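One possible way to do this selection, as a minimal sketch rather than a definitive answer: drop the deepest level from `df`'s index and test the remaining (A, B) pairs for membership in `df2`'s index. The names `partial_index`, `mask` and `selected` are made up for illustration.

```python
import pandas

df = pandas.DataFrame(
    {
        "A": ["a1", "a1", "a1", "a2", "a2", "a2"],
        "B": ["b1", "b1", "b2", "b1", "b2", "b2"],
        "C": ["c1", "c2", "c1", "c1", "c1", "c2"],
        "V": [1, 2, 3, 4, 5, 6],
    }
).set_index(["A", "B", "C"])
df2 = pandas.DataFrame(
    {
        "A": ["a1", "a1", "a2", "a2"],
        "B": ["b1", "b3", "b1", "b3"],
        "X": [1, 2, 3, 4],
    }
).set_index(["A", "B"])

# Drop the deepest level so the remaining (A, B) pairs line up with df2's index.
partial_index = df.index.droplevel("C")

# Boolean mask: True where the (A, B) pair also appears in df2's index.
mask = partial_index.isin(df2.index)

# Keep only the matching rows of the deeper DataFrame.
selected = df[mask]
print(selected)
```

With the example input, only the rows under (a1, b1) and (a2, b1) remain, because those are the only (A, B) pairs present in both indexes.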
Visually, df looks like this:
          V
A  B  C
a1 b1 c1  1
      c2  2
   b2 c1  3
a2 b1 c1  4
   b2 c1  …

I am computing the dot product between a scipy.sparse matrix (CSC) and a numpy ndarray vector:
>>> print type(np_vector), np_vector.shape
<type 'numpy.ndarray'> (200,)
>>> print type(sp_matrix), sparse.isspmatrix(sp_matrix), sp_matrix.shape
<class 'scipy.sparse.csc.csc_matrix'> True (200, 200)
>>> dot_vector = dot(np_vector, sp_matrix)
The result appears to be a new ndarray vector, as I expected:
>>> print type(dot_vector), dot_vector.shape
<type 'numpy.ndarray'> (200,)
But when I try to add a scalar to that vector, I get an exception:
>>> scalar = 3.0
>>> print dot_vector + scalar
C:\Python27\lib\site-packages\scipy\sparse\compressed.pyc in __add__(self, other)
    173                 return self.copy()
    174             else: # Now we would add this scalar to every element.
--> 175                 raise NotImplementedError('adding a nonzero scalar to a '
    176                                           'sparse matrix is not supported')
    177         elif isspmatrix(other): …
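
A likely explanation, stated as an assumption because the import of `dot` is not shown: if `dot` is `numpy.dot`, NumPy does not understand sparse matrices and can end up building an object array whose elements are themselves sparse matrices. Adding a scalar then reaches the sparse `__add__` shown in the traceback. The sketch below uses the sparse matrix's own `dot` method instead, relying on the identity v·M = Mᵀ·v. The data is random stand-in data, not the original `np_vector` and `sp_matrix`.

```python
import numpy as np
from scipy import sparse

# Stand-in data with the same shapes as in the question (assumed values).
rng = np.random.default_rng(0)
np_vector = rng.random(200)
sp_matrix = sparse.random(200, 200, density=0.05, format="csc", random_state=0)

# Let SciPy do the product: v . M is the same as M.T . v for a 1-D vector v.
dot_vector = sp_matrix.T.dot(np_vector)

# The result is a plain numeric ndarray, so scalar addition works elementwise.
scalar = 3.0
shifted = dot_vector + scalar

print(type(dot_vector), dot_vector.dtype, dot_vector.shape)
```

Checking `dot_vector.dtype` is a quick diagnostic: a dtype of `object` would suggest the array holds sparse matrices rather than numbers.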