ba_*_*_ul 3 python dictionary dataframe pandas
My data looks like this:
{ outer_key1 : [ {key1: some_value},
                 {key2: some_value},
                 {key3: some_value} ],
  outer_key2 : [ {key1: some_value},
                 {key2: some_value},
                 {key3: some_value} ] }
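The inner arrays always have the same length, and key1, key2, key3 are always the same.
I want to convert this into a Pandas DataFrame where outer_key1, outer_key2, ... are the index and key1, key2, key3 are the columns.
Edit:
There is a problem in the data, which I think is why the given solutions don't work. In a few cases the inner array contains three Nones instead of three dicts. Like this: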
outer_key3: [ None, None, None ]
Here is one way:
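import pandas as pd

d = { 'O1' : [ {'K1': 1},
               {'K2': 2},
               {'K3': 3} ],
      'O2' : [ {'K1': 4},
               {'K2': 5},
               {'K3': 6} ] }

# Merge each list of single-key dicts into one dict per outer key.
d = {outer: {k: v for inner in L for k, v in inner.items()} for outer, L in d.items()}

# Outer keys become the index, inner keys become the columns.
df = pd.DataFrame.from_dict(d, orient='index')

#     K1  K2  K3
# O1   1   2   3
# O2   4   5   6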
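The nested comprehension is dense, so here is a minimal sketch of the same merge step written with explicit loops. The helper name merge_records and the variable names data and flat are made up for illustration; they are not part of the original answer.

```python
import pandas as pd

def merge_records(records):
    """Merge a list of small dicts into one dict (hypothetical helper).

    If the same key appears twice, the later value wins.
    """
    merged = {}
    for record in records:
        merged.update(record)
    return merged

data = {
    'O1': [{'K1': 1}, {'K2': 2}, {'K3': 3}],
    'O2': [{'K1': 4}, {'K2': 5}, {'K3': 6}],
}

# One merged dict per outer key: {'O1': {'K1': 1, 'K2': 2, 'K3': 3}, ...}
flat = {outer_key: merge_records(records) for outer_key, records in data.items()}

# orient='index' turns the outer keys into the row index.
df = pd.DataFrame.from_dict(flat, orient='index')
print(df)
```

This does the same thing as the one-liner, only with the inner merge pulled out into a function so each step can be inspected separately.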
Alternative solution, applied to the merged dict d from above (not to the original dict of lists):
df = pd.DataFrame(d).T
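To see why the transpose works, here is a small sketch comparing it with from_dict. It assumes the dict has already been merged into one dict per outer key; the names flat, by_columns, df_t and df_idx are illustrative only.

```python
import pandas as pd

# Already-merged data: one dict per outer key.
flat = {
    'O1': {'K1': 1, 'K2': 2, 'K3': 3},
    'O2': {'K1': 4, 'K2': 5, 'K3': 6},
}

# The plain constructor treats the outer keys as columns
# and the inner keys as the row index.
by_columns = pd.DataFrame(flat)

# Transposing swaps the axes, so the outer keys become the index.
df_t = by_columns.T

# from_dict with orient='index' builds that layout directly.
df_idx = pd.DataFrame.from_dict(flat, orient='index')

print(df_t)
print(df_idx)
```

One thing to keep in mind: transposing a frame whose columns have mixed dtypes can upcast the result to object dtype, so from_dict with orient='index' may be the safer choice when the values are not all of one type.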
A more cumbersome approach that handles the None data:
import pandas as pd

d = { 'O1' : [ {'K1': 1},
               {'K2': 2},
               {'K3': 3} ],
      'O2' : [ {'K1': 4},
               {'K2': 5},
               {'K3': 6} ],
      'O3' : [ None, None, None ] }

# Replace any list that does not hold dicts with explicit {key: None} dicts.
d = {k: v if isinstance(v[0], dict) else [{c: None} for c in ('K1', 'K2', 'K3')] for k, v in d.items()}

# Merge each list of single-key dicts into one dict per outer key.
d = {outer: {k: v for inner in L for k, v in inner.items()} for outer, L in d.items()}
df = pd.DataFrame.from_dict(d, orient='index')

#      K1   K2   K3
# O1  1.0  2.0  3.0
# O2  4.0  5.0  6.0
# O3  NaN  NaN  NaN
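If hardcoding ('K1', 'K2', 'K3') is undesirable, a possible variation is to skip the None entries and take the column names from the rows that do have data. This is only a sketch: the function name rows_to_frame is made up, and it assumes at least one outer key holds real dicts.

```python
import pandas as pd

def rows_to_frame(data):
    """Build a DataFrame from {outer_key: [dict or None, ...]} (hypothetical helper)."""
    flat = {}
    for outer_key, records in data.items():
        merged = {}
        for record in records:
            if record is not None:  # skip None placeholders
                merged.update(record)
        flat[outer_key] = merged

    # Column names in first-seen order, collected from rows that have data.
    columns = list(dict.fromkeys(k for row in flat.values() for k in row))

    # A list of dicts becomes one row per dict; keys a row lacks are left missing.
    return pd.DataFrame(list(flat.values()), index=list(flat.keys()), columns=columns)

data = {
    'O1': [{'K1': 1}, {'K2': 2}, {'K3': 3}],
    'O2': [{'K1': 4}, {'K2': 5}, {'K3': 6}],
    'O3': [None, None, None],
}

df = rows_to_frame(data)
print(df)
```

Rows made up entirely of Nones end up as empty dicts, so every column in those rows comes out as a missing value.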