Bla*_*ven 4 python dictionary list dataframe pandas
我有一个“字典的字典”列表,如下所示:
lis = [{'Health and Welfare Plan + Change Notification': {'evidence_capture': 'null',
'test_result_justification': 'null',
'latest_test_result_date': 'null',
'last_updated_by': 'null',
'test_execution_status': 'Not Started',
'test_result': 'null'}},
{'Health and Welfare Plan + Computations': {'evidence_capture': 'null',
'test_result_justification': 'null',
'latest_test_result_date': 'null',
'last_updated_by': 'null',
'test_execution_status': 'Not Started',
'test_result': 'null'}},
{'Health and Welfare Plan + Data Agreements': {'evidence_capture': 'null',
'test_result_justification': 'Due to the Policy',
'latest_test_result_date': '2019-10-02',
'last_updated_by': 'null',
'test_execution_status': 'In Progress',
'test_result': 'null'}},
{'Health and Welfare Plan + Data Elements': {'evidence_capture': 'null',
'test_result_justification': 'xxx',
'latest_test_result_date': '2019-10-02',
'last_updated_by': 'null',
'test_execution_status': 'In Progress',
'test_result': 'null'}},
{'Health and Welfare Plan + Data Quality Monitoring': {'evidence_capture': 'null',
'test_result_justification': 'xxx',
'latest_test_result_date': '2019-08-09',
'last_updated_by': 'null',
'test_execution_status': 'Completed',
'test_result': 'xxx'}},
{'Health and Welfare Plan + HPU Source Reliability': {'evidence_capture': 'null',
'test_result_justification': 'xxx.',
'latest_test_result_date': '2019-10-02',
'last_updated_by': 'null',
'test_execution_status': 'In Progress',
'test_result': 'null'}},
{'Health and Welfare Plan + Lineage': {'evidence_capture': 'null',
'test_result_justification': 'null',
'latest_test_result_date': 'null',
'last_updated_by': 'null',
'test_execution_status': 'Not Started',
'test_result': 'null'}},
{'Health and Welfare Plan + Metadata': {'evidence_capture': 'null',
'test_result_justification': 'Valid',
'latest_test_result_date': '2020-07-02',
'last_updated_by': 'null',
'test_execution_status': 'Completed',
'test_result': 'xxx'}},
{'Health and Welfare Plan + Usage Reconciliation': {'evidence_capture': 'null',
'test_result_justification': 'Test out of scope',
'latest_test_result_date': '2019-10-02',
'last_updated_by': 'null',
'test_execution_status': 'In Progress',
'test_result': 'null'}}]
Run Code Online (Sandbox Code Playgroud)
我想将列表转换为如下所示的数据框:
evidence_capture last_updated_by latest_test_result_date test_execution_status test_result test_result_justification test_category
Change Notification null null null Not Started null null Health and Welfare Plan
Computations null null null Not Started null null Health and Welfare Plan
Data Agreements null null 2019-10-02 In Progress null Due to the Policy Health and Welfare Plan
Data Elements null null 2019-10-02 In Progress null xxx Health and Welfare Plan
Data Quality Monitoring null null 2019-08-09 Completed xxx xxx Health and Welfare Plan
HPU Source Reliability null null 2019-10-02 In Progress null xxx. Health and Welfare Plan
Lineage null null null Not Started null null Health and Welfare Plan
Metadata null null 2020-07-02 Completed xxx Valid Health and Welfare Plan
Usage Reconciliation null null 2019-10-02 In Progress null Test out of scope Health and Welfare Plan
Run Code Online (Sandbox Code Playgroud)
我构建数据框的代码使用 for 循环逐列连接记录。之后处理列名,然后转置它。最终输出将从每行索引中删除重复的字符串“Health and Welfare Plan”,但附加为新列。
df3 = pd.DataFrame(lis[0])
for i in range(1, len(lis)):
df3 = pd.concat([df3, pd.DataFrame(lis[i])], axis=1)
df3.columns = [col.split(' + ')[1] for col in df3.columns]
df3 = df3.T
df3['test_category'] = 'Health and Welfare Plan'
print(df3)
Run Code Online (Sandbox Code Playgroud)
该代码能够生成最终输出,但使用 for 循环和数据帧连接的“昂贵”函数。所以我想知道是否有更好的方法来输出相同的结果?
让我们做 dict comp 来压平字典列表
pd.DataFrame({k.split(' + ')[1]: v for d in lis for k, v in d.items()}).T
Run Code Online (Sandbox Code Playgroud)
evidence_capture test_result_justification latest_test_result_date last_updated_by test_execution_status test_result
Change Notification null null null null Not Started null
Computations null null null null Not Started null
Data Agreements null Due to the Policy 2019-10-02 null In Progress null
Data Elements null xxx 2019-10-02 null In Progress null
Data Quality Monitoring null xxx 2019-08-09 null Completed xxx
HPU Source Reliability null xxx. 2019-10-02 null In Progress null
Lineage null null null null Not Started null
Metadata null Valid 2020-07-02 null Completed xxx
Usage Reconciliation null Test out of scope 2019-10-02 null In Progress null
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
655 次 |
| 最近记录: |