Python [pandas]: Select certain rows by the index of another DataFrame

giu*_*deb 9 python dataframe pandas

I have a DataFrame and I want to select only the rows whose index values are contained in df1.index.

For example:

In [96]: df
Out[96]:
   A  B  C  D
1  1  4  9  1
2  4  5  0  2
3  5  5  1  0
22 1  3  9  6

and this index:

In [96]: df1.index
Out[96]:
Int64Index([  1,   3,   4,   5,   6,   7,  22,  28,  29,  32,], dtype='int64', length=253)

I want this output:

In [96]: df
Out[96]:
   A  B  C  D
1  1  4  9  1
3  5  5  1  0
22 1  3  9  6

Thanks.

jez*_*ael 23

Use isin:

df = df[df.index.isin(df1.index)]

Or get the intersection of the two indexes and select with loc:

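# Preferred: explicit set intersection of the two indexes
df = df.loc[df.index.intersection(df1.index)]
# Same idea via NumPy (requires `import numpy as np`); the result is sorted
df = df.loc[np.intersect1d(df.index, df1.index)]
# Older pandas only: `&` between Index objects is no longer a set intersection in pandas 2.x
df = df.loc[df.index & df1.index]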
print (df)
    A  B  C  D
1   1  4  9  1
3   5  5  1  0
22  1  3  9  6
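
To make the approaches above easy to try, here is a minimal self-contained sketch. It rebuilds the small `df` from the question and, as an assumption, uses only the ten `df1` index values visible in the question (the real index has 253 entries), so the data is illustrative rather than the asker's actual frame.

```python
import numpy as np
import pandas as pd

# Small frame from the question.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)

# Stand-in for df1: only its index matters here (assumed values).
df1 = pd.DataFrame(index=[1, 3, 4, 5, 6, 7, 22, 28, 29, 32])

# Boolean mask on df's own index.
by_isin = df[df.index.isin(df1.index)]

# Label-based selection using the labels common to both indexes.
by_intersection = df.loc[df.index.intersection(df1.index)]
by_numpy = df.loc[np.intersect1d(df.index, df1.index)]

print(by_isin)
```

On this data all three selections should keep the rows labelled 1, 3 and 22, matching the output requested in the question.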

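Edit:

I tried the solution df = df.loc[df1.index]. Do you think this solution is correct?

No, that solution is not correct, because every label that is missing from df produces an all-NaN row (and a FutureWarning):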

df = df.loc[df1.index]
print (df)

      A    B    C    D
1   1.0  4.0  9.0  1.0
3   5.0  5.0  1.0  0.0
4   NaN  NaN  NaN  NaN
5   NaN  NaN  NaN  NaN
6   NaN  NaN  NaN  NaN
7   NaN  NaN  NaN  NaN
22  1.0  3.0  9.0  6.0
28  NaN  NaN  NaN  NaN
29  NaN  NaN  NaN  NaN
32  NaN  NaN  NaN  NaN
C:/Dropbox/work-joy/so/_t/t.py:23: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
  print (df)
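
The warning above points to `.reindex()` as the alternative. As a hedged sketch on the same illustrative data (the `wanted` index is an assumed stand-in for `df1.index`): `reindex` still returns NaN-filled rows for missing labels, while on pandas versions where the announced change has landed, `.loc` with a missing label raises `KeyError` instead.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)
wanted = pd.Index([1, 3, 4, 5, 6, 7, 22, 28, 29, 32])  # assumed stand-in for df1.index

# reindex keeps every requested label and fills the missing rows with NaN.
padded = df.reindex(wanted)

# Dropping all-NaN rows recovers the labels that exist in df
# (this would also drop genuine rows of df that are entirely NaN).
present = padded.dropna(how="all")

try:
    df.loc[wanted]
    loc_raised = False
except KeyError:
    # Expected once the behaviour announced by the FutureWarning is in effect.
    loc_raised = True
```

Note that `padded` and `present` hold floats, because NaN forces the integer columns to be upcast; the `isin` approach avoids that.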

  • @song0089 - Yes, so use `df = df[df.index.isin(df1.index)]` (2 upvotes)
  • Be careful with this use of `isin()`, because it does **not** guarantee that `df` and `df1` end up in the same order. (2 upvotes)
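
The ordering point raised in the comment can be illustrated with a small sketch. The `other_index` below is a hypothetical index, deliberately out of order and containing one label (99) that `df` lacks; it is not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 5, 1]}, index=[1, 2, 3, 22])
other_index = pd.Index([22, 3, 1, 99])  # hypothetical, in a different order

# A boolean mask returns rows in df's own order.
in_df_order = df[df.index.isin(other_index)]

# Filtering the other index first and then selecting by label
# returns rows in the order of other_index.
labels = other_index[other_index.isin(df.index)]
in_other_order = df.loc[labels]

print(list(in_df_order.index))
print(list(in_other_order.index))
```

Which variant is appropriate depends on whether the result should follow `df` or the other frame.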

Spc*_*ond 11

It is now possible to pass an index to the row indexer/slicer of .loc; you just need to make sure you also specify the columns, i.e.:

df = df.loc[df1.index, :]  # works

and not

df = df.loc[df1.index] # won't work

IMO this is cleaner and more consistent with the intended usage of .loc.

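  • The advantage of this solution is that it ensures df and df1 are in the same order. (2 upvotes)
  • This only works if the entire index of `df1` is contained in the index of `df`; the accepted answer does not have this restriction. (2 upvotes)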
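
The restriction noted in the last comment can be made explicit before calling `.loc`. The following is only a sketch: `select_rows` is a hypothetical helper, not part of pandas, and both example indexes are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 5, 1], "B": [4, 5, 5, 3]}, index=[1, 2, 3, 22])
subset_index = pd.Index([22, 1])      # every label exists in df
partial_index = pd.Index([22, 1, 4])  # label 4 is missing from df


def select_rows(frame, labels):
    """Select rows by label, failing loudly if any label is absent (hypothetical helper)."""
    missing = labels.difference(frame.index)
    if len(missing) > 0:
        raise KeyError(f"labels not in frame: {list(missing)}")
    # All labels are present, so the rows come back in the order of `labels`.
    return frame.loc[labels, :]


selected = select_rows(df, subset_index)
print(selected)
```

Calling `select_rows(df, partial_index)` raises `KeyError` here, which is the case where the `isin` or `intersection` approaches from the accepted answer are the safer choice.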