Pla*_*nor 5 python dataframe pandas
Hi, I have a pandas df similar to the one below:
information record
name apple
size {'weight':{'gram':300,'oz':10.5},'description':{'height':10,'width':15}}
country America
partiesrelated [{'nameOfFarmer':'John Smith'},{'farmerID':'A0001'}]
I would like to convert this df into another df like this:
information record
name apple
size_weight_gram 300
size_weight_oz 10.5
size_description_height 10
size_description_width 15
country America
partiesrelated_nameOfFarmer John Smith
partiesrelated_farmerID A0001
In this case, each nested dictionary is parsed into individual rows, so that a row such as size_weight_gram contains the value.
Code for the df:
import pandas as pd

df = pd.DataFrame({'information': ['name', 'size', 'country', 'partiesrealated'],
                   'record': ['apple', {'weight':{'gram':300,'oz':10.5},'description':{'height':10,'width':15}}, 'America', [{'nameOfFarmer':'John Smith'},{'farmerID':'A0001'}]]})
df = df.set_index('information')
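IIUC, you can define a recursive function that unnests sequences and dicts until you are left with a list of key, value pairs. That list is valid input for the pd.DataFrame constructor and is already formatted the way you described.
Take a look at this solution: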
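import collections.abc
import itertools

# Flatten one level of nesting: [[a, b], [c]] -> [a, b, c]
ch = lambda ite: list(itertools.chain.from_iterable(ite))

def isseq(obj):
    # Strings are sequences too, but are treated as scalar values here
    if isinstance(obj, str):
        return False
    return isinstance(obj, collections.abc.Sequence)

def unnest(k, v):
    # Sequence: unnest every element under the same key
    if isseq(v):
        return ch([unnest(k, v_) for v_ in v])
    # Dict: append each sub-key to the key and recurse
    if isinstance(v, dict):
        return ch([unnest("_".join([k, k_]), v_) for k_, v_ in v.items()])
    return k, v  # scalar: a single (key, value) pair

def pairwise(i):
    # Regroup the flat [k1, v1, k2, v2, ...] list into (key, value) tuples
    _a = iter(i)
    return list(zip(_a, _a))

# d holds the question's data as a dict of lists: {'information': [...], 'record': [...]}
d = df.reset_index().to_dict(orient='list')
a = ch([unnest(k, v) for k, v in zip(d['information'], d['record'])])
pd.DataFrame(pairwise(a))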
0 1
0 name apple
1 size_weight_gram 300
2 size_weight_oz 10.5
3 size_description_height 10
4 size_description_width 15
5 country America
6 partiesrealated_nameOfFarmer John Smith
7 partiesrealated_farmerID A0001
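The same recursion can also be written as a generator that yields (key, value) pairs directly, which avoids flattening everything and then re-pairing it. This is only a minimal sketch, not the code from the answer above: the names `flatten_pairs`, `records` and `rows` are made up for illustration, keys are assumed to be strings, and `bytes` is treated as a scalar in addition to `str`.

```python
from collections.abc import Mapping, Sequence

def flatten_pairs(key, value, sep="_"):
    """Yield (flat_key, scalar) pairs for a nested dict/list value."""
    if isinstance(value, Mapping):
        # Dict: extend the key with each sub-key and recurse
        for sub_key, sub_value in value.items():
            yield from flatten_pairs(f"{key}{sep}{sub_key}", sub_value, sep)
    elif isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        # List/tuple: recurse into every element under the same key
        for item in value:
            yield from flatten_pairs(key, item, sep)
    else:
        # Scalar: emit one pair
        yield key, value

# Hypothetical input mirroring the data from the question
records = {
    "name": "apple",
    "size": {"weight": {"gram": 300, "oz": 10.5},
             "description": {"height": 10, "width": 15}},
    "country": "America",
    "partiesrelated": [{"nameOfFarmer": "John Smith"}, {"farmerID": "A0001"}],
}

rows = [pair for k, v in records.items() for pair in flatten_pairs(k, v)]
```

Passing `rows` to `pd.DataFrame(rows, columns=['information', 'record']).set_index('information')` should then give the layout asked for in the question.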
Because the solution is recursive, the algorithm unnests to whatever depth you may have. For example:
d = {
    'information': ['row1', 'row2', 'row3', 'row4'],
    'record': [
        'val1',
        {
            'val2': {'a': 300, 'b': [{'b1': 10.5}, {'b2': 2}]},
            'val3': {'a': 10, 'b': 15},
        },
        'val4',
        [
            {'val5': [{'a': {'c': [{'d': {'e': [{'f': 1}, {'g': 3}]}}]}}]},
            {'b': 'bar'},
        ],
    ],
}
a = ch([unnest(k, v) for k, v in zip(d['information'], d['record'])])
pd.DataFrame(pairwise(a))
0 1
0 row1 val1
1 row2_val2_a 300
2 row2_val2_b_b1 10.5
3 row2_val2_b_b2 2
4 row2_val3_a 10
5 row2_val3_b 15
6 row3 val4
7 row4_val5_a_c_d_e_f 1
8 row4_val5_a_c_d_e_g 3
9 row4_b bar
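One caveat on "any depth": a recursive function is still bounded by Python's recursion limit, so extremely deep nesting (thousands of levels) can raise RecursionError. If that ever matters, the same traversal can be done with an explicit stack. The following is only a sketch under that assumption; `flatten_iterative` is a hypothetical name and keys are again assumed to be strings.

```python
from collections.abc import Mapping, Sequence

def flatten_iterative(key, value, sep="_"):
    """Return (flat_key, scalar) pairs without using recursion."""
    pairs = []
    stack = [(key, value)]
    while stack:
        k, v = stack.pop()
        if isinstance(v, Mapping):
            # Push in reverse so items are popped in their original order
            for sub_key, sub_value in reversed(list(v.items())):
                stack.append((f"{k}{sep}{sub_key}", sub_value))
        elif isinstance(v, Sequence) and not isinstance(v, (str, bytes)):
            for item in reversed(v):
                stack.append((k, item))
        else:
            pairs.append((k, v))
    return pairs

# A value nested 5000 dicts deep, well past the default recursion limit
deep = "leaf"
for _ in range(5000):
    deep = {"k": deep}

deep_pairs = flatten_iterative("root", deep)
```

For ordinary data such as the examples above, the recursive version is simpler and produces the same pairs in the same order.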