Setting an index from a Pandas DataFrame

Question

I have a DataFrame with the columns [A, B, C, D, E, F, G, H].

An Index has been created from the columns [D, G, H]:

>>> print(dgh_columns)
Index(['D', 'G', 'H'], dtype='object')

How can I retrieve the original DataFrame without the columns D, G, H?

Is there an index subset operation?

Ideally, it would look like this:

df[df.index - dgh_columns]

But this does not seem to work.

Answer 1

jez*_*ael 5

I think you can use Index.difference:

df[df.columns.difference(dgh_columns)]

Sample:

import pandas as pd
import numpy as np  # needed only for the NumPy variants

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[7,8,9],
                   'F':[1,3,5],
                   'G':[5,3,6],
                   'H':[7,4,3]})

print (df)
   A  B  C  D  E  F  G  H
0  1  4  7  1  7  1  5  7
1  2  5  8  3  8  3  3  4
2  3  6  9  5  9  5  6  3

dgh_columns = pd.Index(['D', 'G', 'H'])
print (df[df.columns.difference(dgh_columns)])
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5
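
One thing worth keeping in mind: Index.difference returns the remaining labels sorted by default, which is invisible in the sample because the columns are already alphabetical. If the original column order matters, DataFrame.drop or a boolean mask built with Index.isin may be a better fit. A minimal sketch, assuming a hypothetical frame whose columns are deliberately out of order:

```python
import pandas as pd

# Hypothetical frame; the column order is intentionally not alphabetical.
df = pd.DataFrame({'H': [7, 4, 3],
                   'A': [1, 2, 3],
                   'D': [1, 3, 5],
                   'C': [7, 8, 9],
                   'G': [5, 3, 6],
                   'B': [4, 5, 6]})
dgh_columns = pd.Index(['D', 'G', 'H'])

# Index.difference sorts the remaining labels by default.
sorted_cols = list(df.columns.difference(dgh_columns))

# DataFrame.drop keeps the remaining columns in their original order.
dropped = df.drop(columns=dgh_columns)

# A boolean mask over the column labels also keeps the original order.
masked = df.loc[:, ~df.columns.isin(dgh_columns)]

print(sorted_cols)            # ['A', 'B', 'C']
print(list(dropped.columns))  # ['A', 'C', 'B']
print(list(masked.columns))   # ['A', 'C', 'B']
```

Both alternatives select the same data as the Index.difference approach; they differ only in the order of the resulting columns.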

NumPy solutions with numpy.setxor1d or numpy.setdiff1d (setxor1d is a symmetric difference, so it only fits when every label in dgh_columns is actually a column of df):

dgh_columns = pd.Index(['D', 'G', 'H'])
print (df[np.setxor1d(df.columns, dgh_columns)])
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5

dgh_columns = pd.Index(['D', 'G', 'H'])
print (df[np.setdiff1d(df.columns, dgh_columns)])
   A  B  C  E  F
0  1  4  7  7  1
1  2  5  8  8  3
2  3  6  9  9  5
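
To see where the two NumPy functions diverge, here is a small sketch on bare label sets. The label 'Z' is a hypothetical entry that is not among the columns, added only to show the difference:

```python
import numpy as np
import pandas as pd

columns = pd.Index(['A', 'B', 'D'])
to_remove = pd.Index(['D', 'Z'])  # 'Z' is hypothetical: not present in columns

# setdiff1d: labels in `columns` that are not in `to_remove`.
diff = np.setdiff1d(columns, to_remove)

# setxor1d: labels that appear in exactly one of the two inputs.
xor = np.setxor1d(columns, to_remove)

print(diff.tolist())  # ['A', 'B']
print(xor.tolist())   # ['A', 'B', 'Z']
```

Because the setxor1d result still contains 'Z', using it to select columns from a frame that has no such column would fail, whereas setdiff1d only ever returns labels taken from the first argument.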
