Rec*_*tan 5 dataframe python-3.x pandas json-normalize
I am trying to convert a dataframe that contains other dataframes, for example:
{
    'id': 3241234,
    'data': {
        'name': 'carol',
        'lastname': 'netflik',
        'office': {
            'num': 3543,
            'department': 'trigy'
        }
    }
}
I tried using:
pd.DataFrame.from_dict(data)
But the resulting dataframe looks like this:
               id                                  data
lastname  3241234                               netflik
name      3241234                                 carol
office    3241234  {'num': 3543, 'department': 'trigy'}
Any ideas?
Tre*_*ney 11
Use pandas.json_normalize to expand the dict.
import pandas as pd
data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
df = pd.json_normalize(data)
# display(df)
        id data.name data.lastname  data.office.num data.office.department
0  3241234     carol       netflik             3543                  trigy
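As a small sketch of how the flattened column names can be controlled, the example below uses the `sep` and `max_level` parameters of `pd.json_normalize`. The variable names `df_sep` and `df_lvl` are only illustrative and are not part of the original answer.

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# sep changes the separator used when building the flattened column names
df_sep = pd.json_normalize(data, sep='_')
print(list(df_sep.columns))

# max_level limits how many levels of nesting are expanded;
# anything deeper is left as a dict inside a single column
df_lvl = pd.json_normalize(data, max_level=1)
print(list(df_lvl.columns))
```

With `max_level=1`, the `office` dict should stay unexpanded in a `data.office` column, which can be handy when only the outer levels are needed as columns.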
If the dataframe has a column of dicts:
# dataframe with column of dicts
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})
# display(df)
   col2  col
0     1  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
1     2  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
2     3  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
# normalize the column of dicts
normalized = pd.json_normalize(df['col'])
# join the normalized column to df
df = df.join(normalized).drop(columns=['col'])
# display(df)
   col2       id data.name data.lastname  data.office.num data.office.department
0     1  3241234     carol       netflik             3543                  trigy
1     2  3241234     carol       netflik             3543                  trigy
2     3  3241234     carol       netflik             3543                  trigy
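The join above lines rows up by index label, which works as shown because the dataframe has the default 0..n-1 index. A minimal sketch for a dataframe with some other index follows. The index values `[10, 20, 30]` and the name `result` are hypothetical, and the sketch passes a plain list to `pd.json_normalize` so that it does not rely on how a given pandas version treats the index of a Series input.

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# hypothetical non-default index, for illustration only
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]}, index=[10, 20, 30])

# normalizing a plain list of dicts gives a frame with a fresh 0..n-1 index
normalized = pd.json_normalize(df['col'].tolist())

# reuse the original index so the join matches rows one to one
normalized.index = df.index

result = df.drop(columns=['col']).join(normalized)
print(result)
```

Without the index assignment, the labels 10, 20, 30 would find no match among 0, 1, 2 and the joined columns would come back as NaN.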
If the column contains lists of dicts, use .explode to remove the dicts from the lists:
data = [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})
# display(df)
   col2  col
0     1  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
1     2  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
2     3  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
# explode the lists
df = df.explode('col', ignore_index=True)
# remove and normalize the column of dicts
normalized = pd.json_normalize(df.pop('col'))
# join the normalized column to df
df = df.join(normalized)
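To illustrate what the explode step does when a list holds more than one dict, here is a small sketch with made-up records. `rec_a` and `rec_b` are hypothetical and not from the original question. Each dict becomes its own row and the value in `col2` is repeated for every element of its list.

```python
import pandas as pd

# hypothetical records, for illustration only
rec_a = {'id': 1, 'data': {'name': 'carol', 'office': {'num': 3543}}}
rec_b = {'id': 2, 'data': {'name': 'dave', 'office': {'num': 1200}}}

# the first row holds two dicts, the second row holds one
df = pd.DataFrame({'col2': [1, 2], 'col': [[rec_a, rec_b], [rec_a]]})

# one row per dict; ignore_index=True resets the index to 0..n-1
df = df.explode('col', ignore_index=True)

# remove the column of dicts and normalize it as a plain list
normalized = pd.json_normalize(df.pop('col').tolist())

# both frames now share the same 0..n-1 index, so the join aligns row by row
df = df.join(normalized)
print(df)
```

Resetting the index in the explode step matters here: without it, the exploded rows would keep duplicate index labels and the join would no longer be one to one.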