MCG*_*ode 26 python dataframe pandas
I am trying to convert a column from dtype float64 to int64 using:
df['column name'].astype(int64)
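but I get an error:
NameError: name 'int64' is not defined
The column holds a number of people but is formatted as 7500000.0. Any idea how I can simply change it from float64 to int64?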
jez*_*ael 54
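I think you need to cast to numpy.int64. Note that the cast fails if the column contains NaN:
import numpy as np
import pandas as pd
df = pd.DataFrame({'column name':[7500000.0,7500000.0, np.nan]})
print (df['column name'])
0 7500000.0
1 7500000.0
2 NaN
Name: column name, dtype: float64
# raises ValueError: a float column containing NaN cannot be cast to int64
df['column name'] = df['column name'].astype(np.int64)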
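Sample using the nullable integer dtype 'Int64' (capital I, pandas 0.24+), which can hold missing values: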
#http://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html
df['column name'] = df['column name'].astype('Int64')
print (df['column name'])
0 7500000
1 7500000
2 NaN
Name: column name, dtype: Int64
If the column contains NaNs, replace them with some int (e.g. 0) using fillna first, because NaN is a float:
df['column name'].fillna(0).astype(np.int64)
Also check the documentation: missing data casting rules.
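A minimal sketch of why the fillna step is needed, assuming a made-up Series (the names ints, with_gap and restored are hypothetical and not from the original answer). As soon as a missing value enters an int64 column, pandas upcasts it to float64, because NaN is a float; filling the gap lets the cast back to int64 succeed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a plain integer Series (dtype int64)
ints = pd.Series([1, 2, 3])

# Reindexing onto a label that does not exist introduces NaN,
# which forces the whole Series to float64
with_gap = ints.reindex([0, 1, 2, 3])
print(with_gap.dtype)

# Fill the missing value with a placeholder (0 here), then cast back
restored = with_gap.fillna(0).astype(np.int64)
print(restored.dtype)
```

Whether 0 is an acceptable placeholder depends on the data; it becomes indistinguishable from a real zero afterwards.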
EDIT:
Converting values with NaNs is buggy; without NaNs the conversion works as expected:
df = pd.DataFrame({'column name':[7500000.0,7500000.0]})
print (df['column name'])
0 7500000.0
1 7500000.0
Name: column name, dtype: float64
df['column name'] = df['column name'].astype(np.int64)
#same as (older pandas only: pd.np was deprecated in pandas 1.0 and removed in 2.0)
#df['column name'] = df['column name'].astype(pd.np.int64)
print (df['column name'])
0 7500000
1 7500000
Name: column name, dtype: int64
You may need to pass in the string 'int64':
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1.0, 2.0]}) # some test dataframe
>>> df['a'].astype('int64')
0 1
1 2
Name: a, dtype: int64
There are some alternative ways to specify a 64-bit integer:
>>> df['a'].astype('i8') # integer with 8 bytes (64 bit)
0 1
1 2
Name: a, dtype: int64
>>> import numpy as np
>>> df['a'].astype(np.int64) # native numpy 64 bit integer
0 1
1 2
Name: a, dtype: int64
Or use np.int64 directly on your column (but it returns a numpy.array):
>>> np.int64(df['a'])
array([1, 2], dtype=int64)
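If a Series is wanted rather than a bare array, the result can be wrapped again. This is only a sketch: the index labels are made up, and np.asarray with an explicit dtype is used here as an assumed stand-in for the np.int64(...) call above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels so the loss of the index is visible
df = pd.DataFrame({'a': [1.0, 2.0]}, index=['x', 'y'])

# Converting through NumPy yields a plain ndarray: index and name are dropped
arr = np.asarray(df['a'], dtype=np.int64)

# Re-attach the original index and name to get a Series back
as_series = pd.Series(arr, index=df.index, name='a')
print(as_series)
```

In most cases df['a'].astype('int64') is simpler, since it keeps the index without the round trip.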
This seems a bit buggy in Pandas 0.23.4?
If there are np.nan values, this throws an error as expected:
df['col'] = df['col'].astype(np.int64)
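However, with errors='ignore' no values are changed from float to int as I would expect (when the cast fails, astype suppresses the exception and returns the original object):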
df['col'] = df['col'].astype(np.int64,errors='ignore')
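The behaviour can be reproduced with a small sketch. The frame and its values are made up for illustration, and the result may differ between pandas versions; on the versions I am aware of, the dtype should still be float64 afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float column with one missing value
df = pd.DataFrame({'col': [1.0, 2.0, np.nan]})

# The NaN makes the cast to int64 fail; errors='ignore' suppresses the
# exception and hands back the original data, so nothing is converted
result = df['col'].astype(np.int64, errors='ignore')
print(result.dtype)
```

In other words, errors='ignore' is all or nothing: it does not convert the valid rows and skip the NaN ones.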
If I replace the np.nan values first, it works:
df['col'] = df['col'].fillna(0).astype(np.int64)
df['col'] = df['col'].astype(np.int64)
Now I can't figure out how to get nulls back in place of the zeros, because this converts everything back to float again:
df['col'] = df['col'].replace(0,np.nan)
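One possible way around this, assuming pandas 0.24 or later (so not the 0.23.4 mentioned above), is the nullable 'Int64' dtype from the earlier answer, which stores integers and missing values side by side. The data below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: whole-number floats plus one missing value
df = pd.DataFrame({'col': [7500000.0, 12.0, np.nan]})

# 'Int64' (capital I) is the nullable integer extension dtype;
# the NaN is kept as a missing value instead of being filled with 0
df['col'] = df['col'].astype('Int64')

n_missing = int(df['col'].isna().sum())
print(df['col'].dtype, n_missing)
```

This skips the fillna(0) / replace(0, np.nan) round trip entirely, so real zeros in the data are not turned into nulls by accident.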