Python pandas Series: convert floats to strings while preserving nulls

mef*_*ons 6 python numpy pandas

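How can I preserve nulls after converting to string? I'm working with social security numbers, where it's necessary to go back and forth between float and string.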

import pandas as pd
import numpy as np
x = pd.Series([np.nan, 123., np.nan, 456.], dtype=float)
x.isnull()

...has nulls

y = x.astype(str)
y.isnull()

...has no nulls

So ideally x.isnull() and y.isnull() would be the same.

I think it's dangerous to use a Series of mixed dtypes, but I'm thinking this is the best solution for the time being:

z = y.copy()
z[z == 'nan'] = np.nan
z.isnull() # works as desired
type(z[0]) # but has floats for nulls
type(z[1]) # and strings for values

小智 13

I ran into this problem too, but with DataFrames. An approach that works for both pandas Series and DataFrames is to use mask():

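data = pd.Series([np.nan, 10, 30, np.nan]) # Also works for pd.DataFrame
null_cells = data.isnull()  # remember where the nulls are before converting
data = data.astype(str).mask(null_cells, np.nan)  # np.nan rather than np.NaN, which NumPy 2.0 removed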
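For the DataFrame case mentioned above, here is a minimal sketch of the same mask() idea. The frame and its column names (ssn, zip) are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names are assumptions for illustration.
df = pd.DataFrame({
    "ssn": [np.nan, 123.0, 456.0],
    "zip": [10001.0, np.nan, 94105.0],
})

null_cells = df.isnull()  # boolean frame marking the original nulls
as_text = df.astype(str).mask(null_cells, np.nan)  # stringify, then put the nulls back

# The null pattern should be identical before and after the conversion.
same_nulls = as_text.isnull().equals(null_cells)
```

Computing null_cells before the conversion matters, because astype(str) turns NaN into text and isnull() would no longer find it afterwards.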


小智 11

In pandas >= 1.0 you can also use the "string" dtype instead of str:

y = x.astype("string")

This should preserve the missing values.

This is described in the pandas documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/text.html
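A small sketch of the "string" dtype approach applied to the Series from the question. The intermediate cast through the nullable "Int64" dtype is my own addition, not part of the answer above; it assumes the floats are whole numbers (as social security numbers would be) and is meant to avoid a trailing ".0" in the text.

```python
import numpy as np
import pandas as pd

x = pd.Series([np.nan, 123.0, np.nan, 456.0], dtype=float)

# Direct conversion: missing values stay missing instead of becoming text.
y = x.astype("string")

# Assumption: values are integral, so go through nullable integers first
# to get "123" rather than a float-style rendering.
y_int = x.astype("Int64").astype("string")

# Both conversions should keep the same null pattern as the original.
nulls_match = y.isnull().equals(x.isnull())
```

Here x.isnull() and y.isnull() should agree, which is what the question asked for, and the result is a single string dtype rather than a mix of floats and strings.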


chr*_*isb 6

You can convert to string conditionally, only where the value is not null.

x = x.astype(object)  # assumption: recent pandas versions may reject strings in a float64 Series
x[x.notnull()] = x.astype(str)

x
Out[32]:
0      NaN
1    123.0
2      NaN
3    456.0
dtype: object

x.values
Out[33]: array([nan, '123.0', nan, '456.0'], dtype=object)

x.isnull()
Out[34]:
0     True
1    False
2     True
3    False
dtype: bool