pandas.to_dict returns a mix of None and nan

Pio*_*oda 8 python-3.x pandas

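I stumbled upon a small problem with pandas and its to_dict method. I have a table in which I am sure every row has the same set of columns. Let's say it looks like this: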

+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1|    |VAL3|
|    |VAL2|VAL3|
|VAL1|VAL2|    |
+----|----|----+

When I call df.to_dict(orient='records'), I get:

[{
     "COL1":"VAL1"
     ,"COL2":nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":nan
}]

Note that nan appears in some columns and None in others (always the same ones; nan and None never seem to occur in the same column).
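One plausible way to end up with such a per-column split is sketched below. This is an assumption, since the question does not show how df was built: a float64 column can only store a missing value as NaN, while an object column keeps whatever Python object was put into it, including None. The column names A and B are made up for illustration.

```python
import math

import pandas as pd

# Hypothetical two-column frame: "A" is forced to object dtype, "B" is numeric.
df = pd.DataFrame({
    "A": pd.Series(["x", None], dtype=object),  # object column keeps None as-is
    "B": pd.Series([1.0, None]),                # float64 column stores NaN instead
})

print(df.dtypes)
records = df.to_dict(orient="records")
print(records)  # the missing value in "A" stays None, the one in "B" is nan
```

to_dict hands back whatever each column holds, so checking df.dtypes and the raw column values is a reasonable first step when diagnosing the mix.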

When I run json.loads(df.to_json(orient='records')) instead, I get only None and no nan (which is the desired output), like this:

[{
     "COL1":"VAL1"
     ,"COL2":None
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":None
}]
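For reference, the JSON round trip described above can be sketched as follows. The sample frame is a stand-in for the real table, which is not shown in full. JSON has no NaN literal, so to_json writes every missing value as null, and json.loads turns each null into None.

```python
import json

import numpy as np
import pandas as pd

# Stand-in data shaped like the example table; missing cells are a mix of None and NaN.
df = pd.DataFrame({
    "COL1": ["VAL1", None, "VAL1"],
    "COL2": [np.nan, "VAL2", "VAL2"],
    "COL3": ["VAL3", "VAL3", np.nan],
})

# Serialise to a JSON string, then parse it back into plain Python objects.
records = json.loads(df.to_json(orient="records"))
print(records)
```

The cost of this approach is a full serialise-and-parse pass, and values such as timestamps do not come back in their original types.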

I would appreciate an explanation of why this happens and whether it can be controlled somehow.

==== EDIT ====

According to the comments, it would be best to first replace those nan values with None, but those nan values are not np.nan:

>>> a = df.head().ix[0,60]
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
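Both results are expected and do not mean the value is something other than NaN. `is` compares object identity, and a numpy.float64 NaN is a different object from the np.nan constant. `==` follows IEEE 754, where NaN never compares equal to anything, itself included. A minimal sketch of a more reliable check, using pd.isna (the variable a here is built by hand rather than read from the asker's frame):

```python
import numpy as np
import pandas as pd

a = np.float64("nan")   # a NaN value that is not the same object as np.nan

print(a is np.nan)      # identity check: different objects
print(a == np.nan)      # equality check: NaN is never equal to NaN
print(pd.isna(a))       # pd.isna recognises any float NaN
print(pd.isna(None))    # ...and also treats None as missing
```

Because pd.isna treats NaN and None alike, it is the safer test when deciding which values to replace.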

jez*_*ael 7

I think you can only use replace; it is not possible to control this in to_dict:

import numpy as np
import pandas as pd

L = [{
     "COL1":"VAL1"
     ,"COL2":np.nan
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":None
     ,"COL2":"VAL2"
     ,"COL3":"VAL3"
 }
 ,{
     "COL1":"VAL1"
     ,"COL2":"VAL2"
     ,"COL3":np.nan
}]

df = pd.DataFrame(L).replace({np.nan:None})
print (df)
   COL1  COL2  COL3
0  VAL1  None  VAL3
1  None  VAL2  VAL3
2  VAL1  VAL2  None

print (df.to_dict(orient='records'))
[{'COL3': 'VAL3', 'COL2': None, 'COL1': 'VAL1'}, 
 {'COL3': 'VAL3', 'COL2': 'VAL2', 'COL1': None}, 
 {'COL3': None, 'COL2': 'VAL2', 'COL1': 'VAL1'}]
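If changing the DataFrame itself is undesirable, another option is to normalise the records after to_dict has run. The helper below is a sketch, not part of pandas: the name records_with_none is made up, and it assumes every cell holds a scalar (pd.isna returns an array for list-like cells, which would break the condition).

```python
import numpy as np
import pandas as pd


def records_with_none(df):
    """Return df as a list of dicts, replacing every missing scalar with None."""
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in df.to_dict(orient="records")
    ]


# Stand-in data: one text column with None, one numeric column with NaN.
df = pd.DataFrame({
    "COL1": ["VAL1", None, "VAL1"],
    "COL2": [1.5, np.nan, 2.5],
})

records = records_with_none(df)
print(records)
```

This leaves df untouched and handles numeric columns as well, at the price of a Python-level loop over every cell.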