How to convert a nested dictionary to a pandas DataFrame

Rec*_*tan 5 dataframe python-3.x pandas json-normalize

I am trying to convert a dictionary that contains other dictionaries (a nested dictionary) into a DataFrame, for example:

data = {
    'id': 3241234,
    'data': {
        'name': 'carol',
        'lastname': 'netflik',
        'office': {
            'num': 3543,
            'department': 'trigy'
        }
    }
}

I tried using:

pd.DataFrame.from_dict(data)

But the resulting DataFrame looks like this:

               id                                  data
lastname  3241234                               netflik
name      3241234                                 carol
office    3241234  {'num': 3543, 'department': 'trigy'}

Any ideas?

Tre*_*ney 11

Load the JSON/dict with pd.json_normalize:

import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

df = pd.json_normalize(data)

# display(df)
        id data.name data.lastname  data.office.num data.office.department
0  3241234     carol       netflik             3543                  trigy
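The dotted column names above come from the default separator. As a minimal sketch, assuming the same `data` dict, the `sep` and `max_level` parameters of `pd.json_normalize` can change the column names and limit how deep the flattening goes. The variable names `df_underscore` and `df_shallow` are only illustrative.

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# sep sets the string placed between nested key names (default is '.')
df_underscore = pd.json_normalize(data, sep='_')
print(df_underscore.columns.tolist())

# max_level limits how many levels are flattened; deeper dicts stay as dict values
df_shallow = pd.json_normalize(data, max_level=1)
print(df_shallow.columns.tolist())
```

With `max_level=1` the `office` dict is not expanded, so it ends up as a single `data.office` column that still holds a dict.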

If the DataFrame has a column of dicts

# dataframe with column of dicts
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                col
0     1  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
1     2  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
2     3  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# normalize the column of dicts
normalized = pd.json_normalize(df['col'])

# join the normalized column to df
df = df.join(normalized).drop(columns=['col'])

# display(df)
   col2       id data.name data.lastname  data.office.num data.office.department
0     1  3241234     carol       netflik             3543                  trigy
1     2  3241234     carol       netflik             3543                  trigy
2     3  3241234     carol       netflik             3543                  trigy
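The join above matches rows by index, which works because `df` has the default 0..n-1 index. The following is a sketch for the case where the DataFrame has some other index. The index values 10, 20, 30 are made up for illustration, and the column is converted to a plain list so the normalized frame starts from a fresh index that can then be overwritten.

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# hypothetical non-default index
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]}, index=[10, 20, 30])

# normalize from a plain list of dicts, which yields a 0..n-1 index
normalized = pd.json_normalize(df['col'].tolist())

# copy the original index over so that join aligns row for row
normalized.index = df.index

result = df.drop(columns=['col']).join(normalized)
print(result)
```

Without the index assignment, the join would look for labels 10, 20 and 30 in a frame indexed 0, 1, 2 and fill the normalized columns with NaN.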

If the DataFrame has a column of lists with dicts

  • The lists of dicts need to be removed with .explode
data = [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                  col
0     1  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
1     2  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
2     3  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

# explode the lists
df = df.explode('col', ignore_index=True)

# remove and normalize the column of dicts
normalized = pd.json_normalize(df.pop('col'))

# join the normalized column to df
df = df.join(normalized)
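To make the effect of `.explode` easier to see, here is a small sketch where the lists have different lengths. The records (ids 1 to 3 and the names `dave` and `erin`) are invented for illustration and are not from the question.

```python
import pandas as pd

# hypothetical data: the first row holds two dicts, the second row holds one
rows = pd.DataFrame({
    'col2': [1, 2],
    'col': [
        [{'id': 1, 'data': {'name': 'carol'}}, {'id': 2, 'data': {'name': 'dave'}}],
        [{'id': 3, 'data': {'name': 'erin'}}],
    ],
})

# one output row per list element; the other columns are repeated
exploded = rows.explode('col', ignore_index=True)

# remove the column of dicts and flatten it
normalized = pd.json_normalize(exploded.pop('col').tolist())

# both frames share the same 0..n-1 index, so join lines up row for row
result = exploded.join(normalized)
print(result)
```

The first input row becomes two output rows, both with `col2` equal to 1. Rows whose list is empty become NaN after `.explode`, so they may need to be dropped or filled before normalizing.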