use*_*009 2 python dictionary dataframe python-3.x pandas
I have a DataFrame in which one column contains lists of dictionaries. Here is what a sample column value looks like:
[{'score': 0.09248554706573486, 'category': 'soccer', 'threshold': 0.13000713288784027}, {'score': 0.09267200529575348, 'category': 'soccer', 'threshold': 0.11795613169670105}, {'score': 0.1703065186738968, 'category': 'soccer', 'threshold': 0.2004493921995163}, {'score': 0.08060390502214432, 'category': 'basketball', 'threshold': 0.09613725543022156}, {'score': 0.16494056582450867, 'category': 'basketball', 'threshold': 0.2284235805273056}, {'score': 0.008428425528109074, 'category': 'basketball', 'threshold': 0.018201233819127083}, {'score': 0.0761604905128479, 'category': 'hockey', 'threshold': 0.0924532413482666}, {'score': 0.10853488743305206, 'category': 'basketball', 'threshold': 0.1252049058675766}, {'score': 0.0012563085183501244, 'category': 'soccer', 'threshold': 0.008611497469246387}, {'score': 0.058744996786117554, 'category': 'soccer', 'threshold': 0.08366610109806061}, {'score': 0.20794744789600372, 'category': 'rugby', 'threshold': 0.26308900117874146}, {'score': 0.1463163197040558, 'category': 'hockey', 'threshold': 0.18053030967712402}, {'score': 0.12938784062862396, 'category': 'hockey', 'threshold': 0.13267497718334198}, {'score': 0.09140244871377945, 'category': 'basketball', 'threshold': 0.13820350170135498}, {'score': 0.06976936012506485, 'category': 'hockey', 'threshold': 0.0989123210310936}, {'score': 0.05813559517264366, 'category': 'basketball', 'threshold': 0.06885409355163574}, {'score': 0.09365707635879517, 'category': 'hockey', 'threshold': 0.12393374741077423},]
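I want to create a separate DataFrame that, for each row, takes the column value shown above and expands it so that "category" is one column and the score and threshold are its values.
For example: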
category | score | threshold
soccer | 0.09248554706573486 | 0.13000713288784027
soccer | 0.09267200529575348 | 0.13000713288784027
soccer | 0.1703065186738968 | 0.13000713288784027
basketball | 0.16494056582450867 | 0.018201233819127083
basketball | 0.08060390502214432 | 0.018201233819127083
basketball | 0.10853488743305206 | 0.018201233819127083
Assuming lst is the input list, just pass it to the DataFrame constructor:
import pandas as pd
df = pd.DataFrame(lst)
Output:
score category threshold
0 0.092486 soccer 0.130007
1 0.092672 soccer 0.117956
2 0.170307 soccer 0.200449
3 0.080604 basketball 0.096137
4 0.164941 basketball 0.228424
5 0.008428 basketball 0.018201
6 0.076160 hockey 0.092453
7 0.108535 basketball 0.125205
8 0.001256 soccer 0.008611
9 0.058745 soccer 0.083666
10 0.207947 rugby 0.263089
11 0.146316 hockey 0.180530
12 0.129388 hockey 0.132675
13 0.091402 basketball 0.138204
14 0.069769 hockey 0.098912
15 0.058136 basketball 0.068854
16 0.093657 hockey 0.123934
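If you want "category" to come first, as in the layout the question asks for, the constructor's columns argument can set the column order. The following is a minimal sketch, not a definitive solution: it assumes every dictionary has the same three keys, and lst here is a shortened stand-in for the full list from the question.

```python
import pandas as pd

# Shortened stand-in for the list in the question (first entries only).
lst = [
    {'score': 0.09248554706573486, 'category': 'soccer', 'threshold': 0.13000713288784027},
    {'score': 0.09267200529575348, 'category': 'soccer', 'threshold': 0.11795613169670105},
    {'score': 0.08060390502214432, 'category': 'basketball', 'threshold': 0.09613725543022156},
]

# Each dict becomes one row; `columns` selects and orders the keys.
df = pd.DataFrame(lst, columns=['category', 'score', 'threshold'])
print(df)
```

Each row keeps its own score and threshold pair from the original dictionary.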
If every item in the Series holds a list like this, flatten them with itertools.chain:
from itertools import chain
df2 = pd.DataFrame(chain.from_iterable(df['col']))
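To see the flattening end to end, here is a small self-contained sketch. It assumes the column is named 'col' as in the snippet above; the two-row frame and its values are made up for illustration. The chained iterator is wrapped in list() only to make the input to the constructor explicit.

```python
from itertools import chain

import pandas as pd

# Hypothetical two-row frame; each cell of 'col' holds a list of dicts.
df = pd.DataFrame({
    'col': [
        [
            {'score': 0.09, 'category': 'soccer', 'threshold': 0.13},
            {'score': 0.08, 'category': 'basketball', 'threshold': 0.10},
        ],
        [
            {'score': 0.07, 'category': 'hockey', 'threshold': 0.09},
        ],
    ]
})

# chain.from_iterable concatenates the per-row lists into one stream of dicts,
# so every dict becomes a row of the new frame.
df2 = pd.DataFrame(list(chain.from_iterable(df['col'])))
print(df2)
```

Note that the result gets a fresh 0..n-1 index, so it no longer records which original row each dictionary came from.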