Convert floats to ints in Pandas?

Question

MJP*_*MJP 192 python floating-point integer dataset pandas

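I've been working with data imported from a CSV. Pandas changed some columns to float, so now the numbers in these columns are displayed as floating-point values! However, I need them to be displayed as integers, or without the decimal separator. Is there a way to convert them to integers or to hide the decimal part?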

Answer 1

To modify the float output, do this:

import pandas as pd

df = pd.DataFrame(range(5), columns=['a'])
df.a = df.a.astype(float)
df

Out[33]:

          a
0 0.0000000
1 1.0000000
2 2.0000000
3 3.0000000
4 4.0000000

pd.options.display.float_format = '{:,.0f}'.format
df

Out[35]:

   a
0  0
1  1
2  2
3  3
4  4
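Note that this option only changes how floats are printed; the column itself stays float. A minimal sketch, assuming you would rather apply the format temporarily through `pd.option_context` (not part of the original answer) than change the global option:

```python
import pandas as pd

df = pd.DataFrame({"a": [0.0, 1.0, 2.0]})

# Temporarily render floats with no decimals; the stored dtype is untouched.
with pd.option_context("display.float_format", "{:,.0f}".format):
    rendered = df.to_string()

print(rendered)
print(df["a"].dtype)  # still float64
```

If you need real integers (for joins, keys, and so on), a dtype conversion is still required.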


Thanks! I adapted this for to_csv: fin.to_csv('my_table.csv', float_format='%.f'). It worked! (13 upvotes)
In the latest version of pandas you need to add copy=False to the arguments of astype to avoid a warning (4 upvotes)

Answer 2

Rya*_*n G 155

Use the .astype(<type>) function to manipulate column dtypes.

>>> df = pd.DataFrame(np.random.rand(3,4), columns=list("ABCD"))
>>> df
          A         B         C         D
0  0.542447  0.949988  0.669239  0.879887
1  0.068542  0.757775  0.891903  0.384542
2  0.021274  0.587504  0.180426  0.574300
>>> df[list("ABCD")] = df[list("ABCD")].astype(int)
>>> df
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0
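Keep in mind that astype(int) drops the fractional part instead of rounding, which is why every value above becomes 0. A small sketch with made-up values showing the difference:

```python
import pandas as pd

s = pd.Series([1.9, -1.9, 2.7])

truncated = s.astype(int)        # drops the fractional part (toward zero)
rounded = s.round().astype(int)  # rounds to the nearest integer first

print(truncated.tolist())  # [1, -1, 2]
print(rounded.tolist())    # [2, -2, 3]
```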


EDIT:

To handle missing values:

>>> df
          A         B     C         D
0  0.475103  0.355453  0.66  0.869336
1  0.260395  0.200287   NaN  0.617024
2  0.517692  0.735613  0.18  0.657106
>>> df[list("ABCD")] = df[list("ABCD")].fillna(0.0).astype(int)
>>> df
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0
>>>
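To see why the fillna step is needed, here is a sketch with illustrative values: a plain astype(int) on a column containing NaN raises, while filling first succeeds. The exact error message varies between pandas versions, so it is only printed here.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.66, np.nan, 0.18])

failed = False
try:
    s.astype(int)
except ValueError as exc:  # NaN has no integer representation
    failed = True
    print(f"astype(int) failed: {exc}")

filled = s.fillna(0.0).astype(int)  # NaN becomes 0, then values are truncated
print(filled.tolist())  # [0, 0, 0]
```

Replacing NaN with 0 changes the data, so this only makes sense when 0 is an acceptable stand-in for "missing".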


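@MJP You cannot convert a Series from float to integer if it has missing values; see http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na. You have to use floats. (6 upvotes)
I made an edit in which all NaNs are replaced with 0.0. (4 upvotes)
I tried your approach and it gives me a ValueError: Cannot convert NA to integer (3 upvotes)
Or better still, if you are only modifying a CSV: df.to_csv("path.csv", na_rep="", float_format="%.0f", index=False). But this will edit all the floats, so it may be better to convert your FK column to a string, do the manipulation, and then save. (3 upvotes)
The values are not missing; the column intentionally does not have a value for every row. Is there a workaround? Since these values are foreign key IDs, I need integers. (2 upvotes)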
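The to_csv suggestion from the comments can be sketched as follows. The frame and its `fk_id` column are hypothetical, and the CSV is returned as text rather than written to disk so the result is easy to inspect:

```python
import numpy as np
import pandas as pd

# Hypothetical table with a foreign-key column that has gaps.
df = pd.DataFrame({"name": ["a", "b"], "fk_id": [10.0, np.nan]})

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(na_rep="", float_format="%.0f", index=False)
print(csv_text)
```

As the comment warns, float_format applies to every float column in the frame, not only the key column.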

Answer 3

小智 30

Using a list of column names, change the type of multiple columns with .applymap(), or of a single column with .apply().

    df = pd.DataFrame(10*np.random.rand(3, 4), columns=list("ABCD"))

              A         B         C         D
    0  8.362940  0.354027  1.916283  6.226750
    1  1.988232  9.003545  9.277504  8.522808
    2  1.141432  4.935593  2.700118  7.739108

    cols = ['A', 'B']
    df[cols] = df[cols].applymap(np.int64)

       A  B         C         D
    0  8  0  1.916283  6.226750
    1  1  9  9.277504  8.522808
    2  1  4  2.700118  7.739108

    df['C'] = df['C'].apply(np.int64)
       A  B  C         D
    0  8  0  1  6.226750
    1  1  9  9  8.522808
    2  1  4  2  7.739108
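In recent pandas releases `DataFrame.applymap` is deprecated (since 2.1, in favor of `DataFrame.map`), and element-wise calls tend to be slower than a vectorized cast. A sketch of the same conversion using `astype` with a column list, on made-up values:

```python
import pandas as pd

df = pd.DataFrame({"A": [8.36, 1.98], "B": [0.35, 9.00], "C": [1.91, 9.27]})

cols = ["A", "B"]
# Cast only the listed columns; every other column keeps its dtype.
df = df.astype({col: "int64" for col in cols})

print(df.dtypes)
```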


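What if there are NaNs in the values? (4 upvotes)
@Zhang18 I tried this solution, and with NaN you get this error: `ValueError: ('cannot convert float NaN to integer', u'occurred at index <column_name>')` (2 upvotes)
@enri: You can try the following code: `df['C'] = df['C'].dropna().apply(np.int64)` (2 upvotes)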

Answer 4

tdy*_*tdy 27

Use `'Int64'` for NaN support

astype(int) and astype('int64') cannot handle missing values (numpy int)
astype('Int64') (note the capital I) can handle missing values (pandas int)

df['A'] = df['A'].astype('Int64') # capital I


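This assumes you want to keep missing values as NaN. If you plan to impute them, you can fillna first, as Ryan suggested.

Examples of `'Int64'` (capital `I`)

If the floats are already rounded, just use astype: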

df = pd.DataFrame({'A': [99.0, np.nan, 42.0]})

df['A'] = df['A'].astype('Int64')
#       A
# 0    99
# 1  <NA>
# 2    42


If the floats are not rounded yet, round before astype:

df = pd.DataFrame({'A': [3.14159, np.nan, 1.61803]})

df['A'] = df['A'].round().astype('Int64')
#       A
# 0     3
# 1  <NA>
# 2     2
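The round() step matters because, as far as I know, casting floats that still have a fractional part straight to 'Int64' is refused rather than silently truncated. A small sketch with the same values; the exact exception type is an assumption and may differ between pandas versions, so both likely candidates are caught:

```python
import numpy as np
import pandas as pd

s = pd.Series([3.14159, np.nan, 1.61803])

try:
    s.astype("Int64")
    refused = False
except (TypeError, ValueError) as exc:
    refused = True
    print(f"direct cast refused: {exc}")

rounded = s.round().astype("Int64")  # round first, then the cast is lossless
print(rounded)
```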


To read int+NaN data from a file, use dtype='Int64' to avoid the need for conversion altogether:

import io
import pandas as pd

csv = io.StringIO('''
id,rating
foo,5
bar,
baz,2
''')

df = pd.read_csv(csv, dtype={'rating': 'Int64'})
#     id  rating
# 0  foo       5
# 1  bar    <NA>
# 2  baz       2
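If the frame is already loaded with float columns, `DataFrame.convert_dtypes()` (not mentioned in this answer) may be another option: it infers nullable dtypes and, by default, prefers an integer dtype for float columns whose values are all whole numbers. A sketch with illustrative data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [99.0, np.nan, 42.0], "B": [1.5, 2.5, np.nan]})

# A holds only whole numbers, so it can become a nullable integer column;
# B has fractional values and stays a floating-point column.
converted = df.convert_dtypes()
print(converted.dtypes)
```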


Notes

'Int64' is an alias for Int64Dtype:
```
df['A'] = df['A'].astype(pd.Int64Dtype()) # same as astype('Int64')
```

Sized/signed aliases are available:

Alias	Lower bound	Upper bound
`'Int8'`	-128	127
`'Int16'`	-32,768	32,767
`'Int32'`	-2,147,483,648	2,147,483,647
`'Int64'`	-9,223,372,036,854,775,808	9,223,372,036,854,775,807
`'UInt8'`	0	255
`'UInt16'`	0	65,535
`'UInt32'`	0	4,294,967,295
`'UInt64'`	0	18,446,744,073,709,551,615
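The bounds in the table match what NumPy reports for the integer type of the same width, which is a quick way to check whether a smaller alias is safe for your data. A sketch using illustrative values that fit in 8 bits:

```python
import numpy as np
import pandas as pd

# NumPy reports the same limits as the table row for 'Int8'.
info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127

s = pd.Series([1.0, np.nan, 127.0]).astype("Int8")
print(s.dtype)  # Int8
```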

Answer 5

Suh*_*ote 17

Convert all float columns to int

>>> df = pd.DataFrame(np.random.rand(5, 4) * 10, columns=list('PQRS'))
>>> print(df)
...     P           Q           R           S
... 0   4.395994    0.844292    8.543430    1.933934
... 1   0.311974    9.519054    6.171577    3.859993
... 2   2.056797    0.836150    5.270513    3.224497
... 3   3.919300    8.562298    6.852941    1.415992
... 4   9.958550    9.013425    8.703142    3.588733

>>> float_col = df.select_dtypes(include=['float64']) # This will select float columns only
>>> # list(float_col.columns.values)

>>> for col in float_col.columns.values:
...     df[col] = df[col].astype('int64')

>>> print(df)
...     P   Q   R   S
... 0   4   0   8   1
... 1   0   9   6   3
... 2   2   0   5   3
... 3   3   8   6   1
... 4   9   9   8   3
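The loop can also be written as a single astype call with a dictionary built from the selected columns. A sketch on a small made-up frame that also contains a non-float column, to show it is left alone:

```python
import pandas as pd

df = pd.DataFrame({"P": [4.39, 0.31], "Q": [0.84, 9.51], "label": ["x", "y"]})

# Pick out the float64 columns, then cast only those.
float_cols = df.select_dtypes(include=["float64"]).columns
df = df.astype({col: "int64" for col in float_cols})

print(df.dtypes)
```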


Answer 6

enr*_*nri 10

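Here is a quick solution if you want to convert several columns of your Pandas DataFrame df from float to integer, also taking into account that they may contain NaN values.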

cols = ['col_1', 'col_2', 'col_3', 'col_4']
for col in cols:
    # x == x is False only for NaN, so NaN becomes an empty string
    df[col] = df[col].apply(lambda x: int(x) if x == x else "")


I tried with:

 else x)
 else None)


but the result was still float, so I used else ""
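One consequence worth knowing: mixing integers and empty strings makes the column an object column, so numeric operations on it no longer work as usual. A sketch comparing it with the nullable 'Int64' dtype (an alternative, not what this answer uses), on made-up values:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# Same trick as above: x == x is False only for NaN.
mixed = s.apply(lambda x: int(x) if x == x else "")
nullable = s.astype("Int64")  # keeps the gap as <NA> in an integer column

print(mixed.dtype)     # object
print(nullable.dtype)  # Int64
```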

Answer 7

小智 8

>>> import pandas as pd
>>> right = pd.DataFrame({'C': [1.002, 2.003], 'D': [1.009, 4.55], 'key': ['K0', 'K1']})
>>> print(right)
           C      D key
    0  1.002  1.009  K0
    1  2.003  4.550  K1
>>> right['C'] = right.C.astype(int)
>>> print(right)
       C      D key
    0  1  1.009  K0
    1  2  4.550  K1


Answer 8

aeb*_*mad 7

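Expanding on the .astype(<type>) usage mentioned by @Ryan G, you can pass the errors='ignore' argument to convert only those columns that do not produce an error, which simplifies the syntax considerably. Obviously, take care when ignoring errors, but for this task it is very handy.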

df = pd.DataFrame(np.random.rand(3,4), columns=list("ABCD"))
df *= 10
df

    A       B       C       D
0   2.16861 8.34139 1.83434 6.91706
1   5.85938 9.71712 5.53371 4.26542
2   0.50112 4.06725 1.99795 4.75698

df['E'] = list("XYZ")
df.astype(int, errors='ignore')

    A   B   C   D   E
0   2   8   1   6   X
1   5   9   5   4   Y
2   0   4   1   4   Z


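From the astype docs:

errors : {'raise', 'ignore'}, default 'raise'

Control raising of exceptions on invalid data for provided dtype.

raise : allow exceptions to be raised

ignore : suppress exceptions. On error return original object

New in version 0.20.0.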

Answer 9

pra*_*nth 5

The columns that need to be converted to int can also be given in a dictionary, as shown below

df = df.astype({'col1': 'int', 'col2': 'int', 'col3': 'int'})
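The dictionary form also lets each column get a different target dtype, and columns not listed are left untouched. A sketch on a hypothetical frame where `col2` has a missing value and therefore gets the nullable 'Int64':

```python
import numpy as np
import pandas as pd

# Hypothetical frame: col2 contains a missing value.
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, np.nan], "col3": [0.5, 1.5]})

df = df.astype({"col1": "int64", "col2": "Int64"})  # col3 is left as float
print(df.dtypes)
```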


Archived:	12 years ago
Views:	380634
Last activity:	6 years, 8 months ago
