My table:
In [15]: csv=u"""a,a,,a
....: b,b,,b
....: c,c,,c
....: """
In [18]: df = pd.read_csv(io.StringIO(csv), header=None)
I want to fill the empty column with "UNKNOWN":
In [19]: df
Out[19]:
0 1 2 3
0 a a NaN a
1 b b NaN b
2 c c NaN c
In [20]: df.fillna({2:'UNKNOWN'})
but I get an error:
ValueError: could not convert string to float: UNKNOWN
Your column 2 probably has a float dtype:
>>> df
0 1 2 3
0 a a NaN a
1 b b NaN b
2 c c NaN c
>>> df.dtypes
0 object
1 object
2 float64
3 object
dtype: object
Hence the problem: a float64 column cannot hold the string "UNKNOWN". If you don't mind converting the whole frame to object, you can do:
>>> df.astype(object).fillna("UNKNOWN")
0 1 2 3
0 a a UNKNOWN a
1 b b UNKNOWN b
2 c c UNKNOWN c
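Depending on whether there is non-string data, you may want to convert column dtypes more selectively and/or specify the dtypes when reading, but in any case the above should work.
Update: if you have dtype information you want to preserve, rather than switching it back afterwards, I'd go the other way and fill only the columns you want, either with a loop and fillna: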
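Both of those more selective routes can be sketched as follows. This is only an illustration under the assumption that the data is the same four-column CSV as in the question; the variable names (`csv_text`, `df_one`, `df_two`) are made up for the example.

```python
import io

import pandas as pd

# Same data as in the question (assumed here for illustration).
csv_text = "a,a,,a\nb,b,,b\nc,c,,c\n"

# Option 1: read only the empty column as object, so it can hold strings.
# With header=None the column labels are the integers 0..3.
df_one = pd.read_csv(io.StringIO(csv_text), header=None, dtype={2: object})
filled_one = df_one.fillna({2: "UNKNOWN"})

# Option 2: read as usual, then convert just that column before filling.
df_two = pd.read_csv(io.StringIO(csv_text), header=None)
df_two[2] = df_two[2].astype(object).fillna("UNKNOWN")

print(filled_one)
print(df_two)
```

Either way only column 2 is touched, so any numeric columns elsewhere in the frame keep their dtypes.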
>>> df
0 1 2 3 4 5
0 0 a a NaN a NaN
1 1 b b NaN b NaN
2 2 c c NaN c NaN
>>> df.dtypes
0 int64
1 object
2 object
3 float64
4 object
5 float64
dtype: object
>>> for col in df.columns[pd.isnull(df).all()]:
... df[col] = df[col].astype(object).fillna("UNKNOWN")
...
>>> df
0 1 2 3 4 5
0 0 a a UNKNOWN a UNKNOWN
1 1 b b UNKNOWN b UNKNOWN
2 2 c c UNKNOWN c UNKNOWN
>>> df.dtypes
0 int64
1 object
2 object
3 object
4 object
5 object
dtype: object
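If you need this in more than one place, the loop could be wrapped in a small helper. The sketch below is just one way to do it: `fill_all_null_columns` is a hypothetical name, not a pandas function, and the sample frame is invented for the example.

```python
import numpy as np
import pandas as pd


def fill_all_null_columns(frame, value="UNKNOWN"):
    """Return a copy in which every all-NaN column is filled with `value`."""
    result = frame.copy()
    all_null = result.isna().all()  # boolean Series indexed by column label
    for col in all_null[all_null].index:
        # Convert to object first so the column can hold a string.
        result[col] = result[col].astype(object).fillna(value)
    return result


df = pd.DataFrame({
    0: [0, 1, 2],
    1: ["a", "b", "c"],
    2: [np.nan, np.nan, np.nan],
    3: ["a", None, "c"],  # only partly missing, so it is left untouched
})
filled = fill_all_null_columns(df)
print(filled)
print(filled.dtypes)
```

Columns that are only partly missing are deliberately skipped, which is why the integer column keeps its dtype and column 3 still contains its one missing value.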
or (if you're using all anyway) maybe not even use fillna at all:
>>> df
0 1 2 3 4 5
0 0 a a NaN a NaN
1 1 b b NaN b NaN
2 2 c c NaN c NaN
>>> df.loc[:, pd.isnull(df).all()] = "UNKNOWN"
>>> df
0 1 2 3 4 5
0 0 a a UNKNOWN a UNKNOWN
1 1 b b UNKNOWN b UNKNOWN
2 2 c c UNKNOWN c UNKNOWN
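Depending on the pandas version, writing strings into a float64 column through an indexer may emit a warning or be rejected. A variant that sidesteps this is to replace each all-NaN column outright with plain column assignment. This is a sketch on an invented frame, and `empty_cols` is just an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    0: [0, 1, 2],
    1: ["a", "b", "c"],
    2: [np.nan, np.nan, np.nan],
    3: ["a", "b", "c"],
    4: [np.nan, np.nan, np.nan],
})

# Labels of the columns in which every value is missing.
empty_cols = df.columns[df.isna().all().to_numpy()]

# df[col] = ... replaces the whole column rather than writing into the
# existing float64 values, so no dtype conversion is needed beforehand.
for col in empty_cols:
    df[col] = "UNKNOWN"

print(df)
```

As with the approach above, this only makes sense when the selected columns are entirely empty, since the assignment overwrites every row.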