Convert a list of dictionaries containing another list of dictionaries to a DataFrame

Question

rau*_*002 5 python dictionary dataframe pandas

I tried to find a solution but could not get one. I have the following output from an API in Python.

insights = [ <Insights> {
    "account_id": "1234",
    "actions": [
        {
            "action_type": "add_to_cart",
            "value": "8"
        },
        {
            "action_type": "purchase",
            "value": "2"
        }
    ],
    "cust_id": "xyz123",
    "cust_name": "xyz",
}, <Insights> {
    "account_id": "1234",
    "cust_id": "pqr123",
    "cust_name": "pqr",
},  <Insights> {
    "account_id": "1234",
    "actions": [
        {
            "action_type": "purchase",
            "value": "45"
        }
    ],
    "cust_id": "abc123",
    "cust_name": "abc",
 }
 ]

I want the DataFrame to look like this:

account_id  add_to_cart  purchase  cust_id  cust_name
1234        8            2         xyz123   xyz
1234                               pqr123   pqr
1234                     45        abc123   abc

When I use the following:

> insights_1 = [x for x in insights]

> df = pd.DataFrame(insights_1)

I get the following:

account_id  actions                                                                                    cust_id  cust_name
1234        [{'value': '8', 'action_type': 'add_to_cart'}, {'value': '2', 'action_type': 'purchase'}]  xyz123   xyz
1234        NaN                                                                                        pqr123   pqr
1234        [{'value': '45', 'action_type': 'purchase'}]                                               abc123   abc

How do I proceed from here?

Answer 1

jpp*_*jpp 4

Here is one solution.

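import numpy as np
import pandas as pd

df = pd.DataFrame(insights)

# One single-row frame per record; x == x is False only for NaN (missing 'actions').
parts = [pd.DataFrame({d['action_type']: d['value'] for d in x}, index=[0])
         if x == x else pd.DataFrame({'add_to_cart': [np.nan], 'purchase': [np.nan]})
         for x in df['actions']]

# Use columns=...: a positional axis argument is no longer accepted as of pandas 2.0.
df = df.drop(columns='actions')\
       .join(pd.concat(parts, axis=0, ignore_index=True))

print(df)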

  account_id cust_id cust_name add_to_cart purchase
0       1234  xyz123       xyz           8        2
1       1234  pqr123       pqr         NaN      NaN
2       1234  abc123       abc         NaN       45

Explanation

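Use pandas to read the outer list of dictionaries into a DataFrame.
For the inner dictionaries, use a list comprehension together with a dictionary comprehension.
Account for nan values by testing for equality inside the list comprehension.
Concatenate the parts and join them to the original DataFrame.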
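
The same result can also be reached without building one mini DataFrame per row. The sketch below is an alternative to the code above, not the answerer's method: it flattens each record into a single flat dict and lets pandas fill the gaps with NaN. The `flatten` helper is a hypothetical name, and the sample data assumes the API's Insights objects behave like plain dicts.

```python
import pandas as pd

# Sample data: assumes the API's Insights objects behave like plain dicts.
insights = [
    {"account_id": "1234",
     "actions": [{"action_type": "add_to_cart", "value": "8"},
                 {"action_type": "purchase", "value": "2"}],
     "cust_id": "xyz123", "cust_name": "xyz"},
    {"account_id": "1234", "cust_id": "pqr123", "cust_name": "pqr"},
    {"account_id": "1234",
     "actions": [{"action_type": "purchase", "value": "45"}],
     "cust_id": "abc123", "cust_name": "abc"},
]


def flatten(record):
    """Hypothetical helper: lift each action into its own top-level key."""
    flat = {k: v for k, v in record.items() if k != "actions"}
    # Records without an 'actions' key simply contribute no action columns.
    for action in record.get("actions", []):
        flat[action["action_type"]] = action["value"]
    return flat


df = pd.DataFrame([flatten(r) for r in insights])
print(df)
```

Because missing keys become NaN when pandas builds the frame, no explicit nan test is needed in this variant.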

Explanation - detail

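This details the construction and use of parts:

Take each entry in df['actions']; each entry is either a list of dictionaries or NaN.
Iterate over them one by one (i.e. row by row) in a for loop.
The else part says "if it is np.nan [i.e. null], return a DataFrame of NaNs". The if part takes the list of dictionaries and creates a mini DataFrame for that row.
We then use the next part to concatenate these mini DataFrames, one per row, and join them to the original DataFrame.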
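
To see why the `x == x` test separates the two cases, here is a minimal sketch: NaN is not equal to itself, whereas a list always compares equal to itself. The `actions` Series below is a made-up stand-in for the `df['actions']` column, with one populated entry and one missing entry.

```python
import numpy as np
import pandas as pd

# Stand-in for df['actions']: one list of dicts and one missing entry.
actions = pd.Series(
    [[{"action_type": "purchase", "value": "45"}], np.nan],
    dtype=object,
)

# NaN != NaN, so x == x is False only for the missing entry.
is_present = [x == x for x in actions]
print(is_present)

# Same construction as in the answer: one single-row frame per entry.
parts = [
    pd.DataFrame({d["action_type"]: d["value"] for d in x}, index=[0])
    if x == x
    else pd.DataFrame({"add_to_cart": [np.nan], "purchase": [np.nan]})
    for x in actions
]

# Stack the single-row frames; columns absent from a part are filled with NaN.
combined = pd.concat(parts, axis=0, ignore_index=True)
print(combined)
```

The placeholder frame in the else branch hard-codes the two action types from the question, so other action types would need to be added there by hand.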
