Duplicate index rows when building a pandas DataFrame from a list of dicts

j s*_*sad 4 python pandas

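I have a list of dicts, each with two keys. The first key is a shared index, and the second is a column name. I want to convert this list into a pandas DataFrame. When I do, I get duplicate index rows, and each row has one column left empty.

With this code: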

import pandas as pd
l = [{'col_a': 0, 'idx': 0},
     {'col_b': 5, 'idx': 0},
     {'col_a': 1, 'idx': 1},
     {'col_b': 6, 'idx': 1},
     {'col_a': 2, 'idx': 2},
     {'col_b': 7, 'idx': 2},
     {'col_a': 3, 'idx': 3},
     {'col_b': 8, 'idx': 3},
     {'col_a': 4, 'idx': 4},
     {'col_b': 9, 'idx': 4}]

df = pd.DataFrame(l)
df = df.set_index('idx')

I get

     col_a  col_b
idx              
0      0.0    NaN
0      NaN    5.0
1      1.0    NaN
1      NaN    6.0
2      2.0    NaN
2      NaN    7.0
3      3.0    NaN
3      NaN    8.0
4      4.0    NaN
4      NaN    9.0

but I want this

         col_a  col_b
    idx              
    0      0.0    5.0
    1      1.0    6.0
    2      2.0    7.0
    3      3.0    8.0
    4      4.0    9.0   

Any ideas? Thanks!

DSM*_*DSM 5

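You can group by idx (while it is still a regular column, before set_index) and take .first(), which keeps the first non-null value of each column within every group: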

In [10]: df
Out[10]: 
   col_a  col_b  idx
0    0.0    NaN    0
1    NaN    5.0    0
2    1.0    NaN    1
3    NaN    6.0    1
4    2.0    NaN    2
5    NaN    7.0    2
6    3.0    NaN    3
7    NaN    8.0    3
8    4.0    NaN    4
9    NaN    9.0    4

In [11]: df.groupby("idx").first()
Out[11]: 
     col_a  col_b
idx              
0      0.0    5.0
1      1.0    6.0
2      2.0    7.0
3      3.0    8.0
4      4.0    9.0
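The session above can be reproduced as a standalone script. This is a minimal sketch using a shortened version of the list from the question; the names `records` and `result` are only illustrative. It relies on `first()` skipping NaN, so the two half-filled rows for each idx collapse into one complete row.

```python
import pandas as pd

# Shortened version of the list from the question: every idx appears
# twice, once carrying col_a and once carrying col_b.
records = [
    {"col_a": 0, "idx": 0},
    {"col_b": 5, "idx": 0},
    {"col_a": 1, "idx": 1},
    {"col_b": 6, "idx": 1},
]

# Keep idx as a regular column so it can be used as the grouping key.
df = pd.DataFrame(records)

# first() returns the first non-null value of each column per group,
# which merges the two partial rows of every idx into one.
result = df.groupby("idx").first()
print(result)
```

The values stay floats because the intermediate frame contained NaN; append `.astype(int)` if integer columns are wanted and no gaps remain.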

Or call pivot_table:

In [36]: df.pivot_table(index="idx")
Out[36]: 
     col_a  col_b
idx              
0      0.0    5.0
1      1.0    6.0
2      2.0    7.0
3      3.0    8.0
4      4.0    9.0
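The two approaches agree here only because each idx has exactly one non-null value per column. `pivot_table` aggregates with the mean by default, so they diverge as soon as an idx carries two values for the same column. The sketch below illustrates this with a hypothetical extra record (`col_a = 10` for idx 0) that is not in the original data.

```python
import pandas as pd

records = [
    {"col_a": 0, "idx": 0},
    {"col_b": 5, "idx": 0},
    {"col_a": 10, "idx": 0},  # hypothetical second col_a value for idx 0
]
df = pd.DataFrame(records)

# pivot_table defaults to aggfunc="mean": col_a becomes (0 + 10) / 2.
by_mean = df.pivot_table(index="idx")

# groupby().first() keeps the first non-null value instead: col_a stays 0.
by_first = df.groupby("idx").first()

print(by_mean)
print(by_first)
```

If duplicates like this can occur, pick the method whose aggregation matches what you want, or pass an explicit `aggfunc` to `pivot_table`.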