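I stumbled on a small issue with the pandas `to_dict` method. I have a table in which I am certain every row has the same number of columns with the same names. Let's say it looks like this: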
+----|----|----+
|COL1|COL2|COL3|
+----|----|----+
|VAL1| |VAL3|
| |VAL2|VAL3|
|VAL1|VAL2| |
+----|----|----+
When I call `df.to_dict(orient='records')` I get:
[{
"COL1":"VAL1"
,"COL2":nan
,"COL3":"VAL3"
}
,{
"COL1":None
,"COL2":"VAL2"
,"COL3":"VAL3"
}
,{
"COL1":"VAL1"
,"COL2":"VAL2"
,"COL3":nan
}]
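Note that `nan` appears in some columns and `None` in others (always consistently; `nan` and `None` never seem to appear in the same column).
When I call `json.loads(df.to_json(orient='records'))` instead, I get only `None` and no `nan`, which is the desired output.
Like this: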
[{
"COL1":"VAL1"
,"COL2":None
,"COL3":"VAL3"
}
,{
"COL1":None
,"COL2":"VAL2"
,"COL3":"VAL3"
}
,{
"COL1":"VAL1"
,"COL2":"VAL2"
,"COL3":None
}]
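I would appreciate an explanation of why this happens and whether it can be controlled somehow.
==EDIT==
According to the comments, it would be best to replace those `nan` values with `None` first, but those `nan` values are not `np.nan`: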
>>> a = df.head().iloc[0, 60]  # .ix was removed in pandas 1.0
>>> a
nan
>>> type(a)
<class 'numpy.float64'>
>>> a is np.nan
False
>>> a == np.nan
False
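The two `False` results above are expected for any NaN, not a sign that the value is something other than NaN: `is` compares object identity, and NaN compares unequal to everything, including itself. A minimal sketch of checks that do detect it, assuming a stand-in value `a` built by hand rather than read from the original DataFrame:

```python
import math

import numpy as np
import pandas as pd

# Stand-in (assumption) for the value pulled out of the DataFrame.
a = np.float64("nan")

# `is` tests object identity; `a` is a numpy.float64, np.nan is a Python float.
identity_check = a is np.nan
# NaN never compares equal to anything, not even to itself.
equality_check = a == np.nan
self_equal = a == a

# Dedicated predicates are the reliable way to detect NaN.
detected = (pd.isna(a), np.isnan(a), math.isnan(a))

print(identity_check, bool(equality_check), bool(self_equal), detected)
```

So the value is still a regular floating-point NaN, and `pd.isna` (or `np.isnan`) is the test to use when deciding what to replace.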
import numpy as np
import pandas as pd

L = [{
"COL1":"VAL1"
,"COL2":np.nan
,"COL3":"VAL3"
}
,{
"COL1":None
,"COL2":"VAL2"
,"COL3":"VAL3"
}
,{
"COL1":"VAL1"
,"COL2":"VAL2"
,"COL3":np.nan
}]
df = pd.DataFrame(L).replace({np.nan:None})
print (df)
COL1 COL2 COL3
0 VAL1 None VAL3
1 None VAL2 VAL3
2 VAL1 VAL2 None
print (df.to_dict(orient='records'))
[{'COL3': 'VAL3', 'COL2': None, 'COL1': 'VAL1'},
{'COL3': 'VAL3', 'COL2': 'VAL2', 'COL1': None},
{'COL3': None, 'COL2': 'VAL2', 'COL1': 'VAL1'}]
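If changing the DataFrame itself is undesirable, another option is to normalise the records after calling `to_dict`. The following is only a sketch under assumptions: the sample frame is rebuilt by hand from the table in the question, every cell is assumed to be a scalar, and the variable names `raw_records` and `records` are illustrative.

```python
import numpy as np
import pandas as pd

# Hand-built sample (assumption) mirroring the table in the question.
df = pd.DataFrame({
    "COL1": ["VAL1", None, "VAL1"],
    "COL2": [np.nan, "VAL2", "VAL2"],
    "COL3": ["VAL3", "VAL3", np.nan],
})

raw_records = df.to_dict(orient="records")

# Map every missing marker (None, NaN) to None; leave other values alone.
# pd.isna on a scalar returns a plain bool, so this assumes scalar cells.
records = [
    {key: (None if pd.isna(value) else value) for key, value in row.items()}
    for row in raw_records
]

print(records)
```

This leaves `df` untouched and treats `None` and `nan` uniformly, at the cost of an extra pass over the records in pure Python.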