Get a list of column names with object or category dtype

Question

use*_*916 1 python types dataframe pandas

My goal is to get a list object, ['assetCode', 'assetName'], whose contents are the labels of a pandas Series selected by more than one condition. I tried:

tmp3 = datatype[datatype == 'object' | datatype == 'category'].index # extract label from Pandas.series

This raises an error: TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]

However, I did find the following two working solutions, though neither is very elegant:

tmp2 = datatype[datatype == 'object'].index # extract label from Pandas.series
tmp2[0]
'assetCode'


tmp1 = datatype[datatype == 'category'].index # extract label from Pandas.series
tmp1[0]
'assetName'

How can I combine these two strings into a single list object? Is there a better way to achieve this than what I tried?

Answer 1

cs9*_*s95 5

Setup

df

   A  B  C
0  8  4  2
1  8  8  6
2  8  5  2

datatype = df.dtypes
datatype

A      object
B    category
C       int64
dtype: object
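
The setup above shows only the printed frame and its dtypes, not how the frame was built. A minimal sketch that reproduces the same printout is given below; the construction itself (building each column with an explicit dtype) is an assumption, not something the answer states.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame: the values and dtypes
# match the printout above, but how the frame was built is an assumption.
df = pd.DataFrame({
    "A": pd.Series([8, 8, 8], dtype="object"),
    "B": pd.Series([4, 8, 5], dtype="category"),
    "C": pd.Series([2, 6, 2], dtype="int64"),
})

# df.dtypes is a Series indexed by column name, one dtype per column.
datatype = df.dtypes
print(df)
print(datatype)
```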

It looks like you are trying to select the object and categorical columns from some DataFrame (not shown here). To fix your code, use:

tmp3 = datatype[(datatype == 'object') | (datatype == 'category')].index.tolist()
tmp3
#  ['A', 'B']

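Because the bitwise operator | has higher precedence than the comparison operator ==, each comparison needs its own parentheses before the masks are OR-ed together. Without them, Python evaluates 'object' | datatype first, which is what triggers the error. With the parentheses in place, indexing works fine.

To get a list, call .index.tolist().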
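
The parenthesized mask can be sketched end to end as follows. Two things here are assumptions and not part of the answer: the frame construction, and the conversion of the dtypes to their string names with astype(str), which is added so that every comparison is a plain string comparison. The isin variant is likewise an alternative offered for illustration.

```python
import pandas as pd

# Hypothetical frame mirroring the answer's setup (construction is assumed).
df = pd.DataFrame({
    "A": pd.Series([8, 8, 8], dtype="object"),
    "B": pd.Series([4, 8, 5], dtype="category"),
    "C": pd.Series([2, 6, 2], dtype="int64"),
})

# Assumption: convert each dtype to its string name ("object", "category",
# "int64") so the comparisons below operate on ordinary strings.
names = df.dtypes.astype(str)

# Each comparison is parenthesized, so `|` combines two boolean masks.
mask = (names == "object") | (names == "category")
tmp3 = names[mask].index.tolist()

# A membership test gives the same result without needing `|` at all.
tmp4 = names[names.isin(["object", "category"])].index.tolist()

print(tmp3)
print(tmp4)
```

Working on string names is a design choice for this sketch; the answer's version compares the dtype objects directly.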

Another solution is select_dtypes:

df.select_dtypes(include=['object', 'category'])

   A  B
0  8  4
1  8  8
2  8  5

df.select_dtypes(include=['object', 'category']).columns.tolist()
# ['A', 'B']

This avoids the need for the intermediate datatype Series.
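
As a rough sketch, the select_dtypes approach can be wrapped in a small helper that returns a plain list. The helper name object_or_category_columns and the frame construction are hypothetical, introduced only for illustration; select_dtypes, with its include and exclude parameters, is the pandas method the answer refers to.

```python
import pandas as pd


def object_or_category_columns(frame: pd.DataFrame) -> list:
    """Return the names of columns whose dtype is object or category.

    Hypothetical helper, not part of pandas.
    """
    selected = frame.select_dtypes(include=["object", "category"])
    return selected.columns.tolist()


# Hypothetical frame mirroring the answer's setup (construction is assumed).
df = pd.DataFrame({
    "A": pd.Series([8, 8, 8], dtype="object"),
    "B": pd.Series([4, 8, 5], dtype="category"),
    "C": pd.Series([2, 6, 2], dtype="int64"),
})

cols = object_or_category_columns(df)

# exclude= selects the complement: every column that is neither dtype.
others = df.select_dtypes(exclude=["object", "category"]).columns.tolist()

print(cols)
print(others)
```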
