Han*_*art 96 python types pandas
Let's say df is a pandas DataFrame. I would like to find all columns of numeric type. Something like:
isNumeric = is_numeric(df)
小智 115
You can use the select_dtypes method of DataFrame. It takes two parameters, include and exclude. So isNumeric would look like:
numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
newdf = df.select_dtypes(include=numerics)
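One thing to watch with an explicit list: any dtype you leave out (for example int8 or the unsigned integer types) is silently skipped. A minimal sketch of the difference, assuming a made-up frame whose column names (small, big, ratio, label) are purely illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names are not from the question.
df = pd.DataFrame({
    "small": np.array([1, 2, 3], dtype="int8"),
    "big": np.array([1, 2, 3], dtype="int64"),
    "ratio": np.array([0.1, 0.2, 0.3], dtype="float64"),
    "label": pd.Series(["a", "b", "c"], dtype=object),
})

# The explicit list from the answer above: int8 is not in it.
numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
by_list = df.select_dtypes(include=numerics).columns.tolist()

# np.number is the abstract base of all NumPy numeric scalar types.
by_base = df.select_dtypes(include=np.number).columns.tolist()

print(by_list)  # the int8 column "small" is missing here
print(by_base)  # all three numeric columns
```

If the frame may contain dtypes outside the list, passing np.number is probably the safer choice.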
Kat*_*mar 61
You can use the following command to filter only the numeric columns
df._get_numeric_data()
Example
In [32]: data
Out[32]:
   A  B
0  1  s
1  2  s
2  3  s
3  4  s
In [33]: data._get_numeric_data()
Out[33]:
   A
0  1
1  2
2  3
3  4
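Note that the leading underscore marks _get_numeric_data as a private method, so its behavior may change between pandas versions. A rough sketch of the same example using the public select_dtypes method instead (the frame is rebuilt here from the output shown above, so treat the constructor call as an assumption):

```python
import pandas as pd

# Rebuild the example frame from the printed output above (assumed construction).
data = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "B": pd.Series(["s", "s", "s", "s"], dtype=object),
})

# Public API: keep only the columns with a numeric dtype.
numeric_part = data.select_dtypes(include="number")
print(numeric_part)
```

For this particular frame only column A is kept, matching the output above.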
sta*_*010 52
A simple one-line answer to create a new DataFrame with only the numeric columns:
df.select_dtypes(include=np.number)
If you need the names of the numeric columns:
df.select_dtypes(include=np.number).columns.tolist()
Full code:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': range(7, 10),
                   'B': np.random.rand(3),
                   'C': ['foo','bar','baz'],
                   'D': ['who','what','when']})
df
#    A         B    C     D
# 0  7  0.704021  foo   who
# 1  8  0.264025  bar  what
# 2  9  0.230671  baz  when
df_numerics_only = df.select_dtypes(include=np.number)
df_numerics_only
#    A         B
# 0  7  0.704021
# 1  8  0.264025
# 2  9  0.230671
colnames_numerics_only = df.select_dtypes(include=np.number).columns.tolist()
colnames_numerics_only
# ['A', 'B']
nim*_*ous 29
A simple one-liner:
df.select_dtypes('number').columns
WeN*_*Ben 25
df.select_dtypes(exclude=['object'])
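Be aware that excluding object is not quite the same as selecting numeric columns: anything that is neither object nor numeric (booleans, datetimes, and so on) is kept as well. A small sketch with a hypothetical frame (the column names count, flag, when and name are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing numeric, boolean, datetime and object columns.
df = pd.DataFrame({
    "count": [1, 2, 3],
    "flag": [True, False, True],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "name": pd.Series(["a", "b", "c"], dtype=object),
})

# Dropping object columns keeps the boolean and datetime columns too.
not_object = df.select_dtypes(exclude=["object"]).columns.tolist()

# Selecting on np.number keeps only the genuinely numeric column.
numeric_only = df.select_dtypes(include=np.number).columns.tolist()

print(not_object)
print(numeric_only)
```

So exclude=['object'] works when the frame holds only numbers and strings, but include=np.number is likely the more precise filter in general.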
小智 6
The following code returns a list of the names of the numeric columns of a dataset.
cnames=list(marketing_train.select_dtypes(exclude=['object']).columns)
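Here marketing_train is my dataset, select_dtypes() is the function that selects columns by data type using the exclude and include arguments, and columns fetches the column names of the result. The output of the code above is as follows: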
['custAge',
'campaign',
'pdays',
'previous',
'emp.var.rate',
'cons.price.idx',
'cons.conf.idx',
'euribor3m',
'nr.employed',
'pmonths',
'pastEmail']
Here is another simple snippet for finding the numeric columns in a pandas DataFrame:
numeric_clmns = df.dtypes[df.dtypes != "object"].index
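Comparing against "object" has the same caveat as exclude=['object']: non-numeric dtypes such as datetimes pass the test. One possible alternative is to check each column with pandas.api.types.is_numeric_dtype. A sketch, assuming a made-up frame with columns A, B, when and C:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "A": [7, 8, 9],
    "B": [0.5, 1.5, 2.5],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "C": pd.Series(["foo", "bar", "baz"], dtype=object),
})

# The dtype comparison keeps every non-object column, including the datetime one.
non_object_clmns = df.dtypes[df.dtypes != "object"].index.tolist()

# Checking each column's dtype directly keeps only the numeric ones.
numeric_clmns = [col for col in df.columns if is_numeric_dtype(df[col])]

print(non_object_clmns)
print(numeric_clmns)
```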
Han*_*art -1
import numpy as np
import pandas as pd

def is_type(df, baseType):
    # One boolean per column: is the column's scalar type a subclass of baseType?
    test = [issubclass(np.dtype(d).type, baseType) for d in df.dtypes]
    return pd.DataFrame(data = test, index = df.columns, columns = ["test"])

def is_float(df):
    # np.float was removed in NumPy 1.24; np.floating is the abstract float base class
    return is_type(df, np.floating)

def is_number(df):
    return is_type(df, np.number)

def is_integer(df):
    return is_type(df, np.integer)