Convert a list of sets to a DataFrame

Mat*_*ido 3 python dataframe pandas

How can I convert a list of sets of categories into a DataFrame of 0/1 indicators?

For example, from:

A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]

to:

   a  b  c  d  e
1  1  0  1  0  0
2  1  1  0  0  0
3  0  1  0  1  0
4  0  0  0  0  1


Qua*_*ang 5

Let's try explode, then crosstab:

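import pandas as pd

s = pd.Series(A).explode()  # one row per set element; the index repeats the list position
pd.crosstab(s.index, s)     # count each (position, element) pair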

Output:

col_0  a  b  c  d  e
row_0               
0      1  0  1  0  0
1      1  1  0  0  0
2      0  1  0  1  0
3      0  0  0  0  1
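
The crosstab result carries the axis labels row_0 and col_0 and a 0-based index, while the layout asked for in the question is unlabeled and numbered from 1. A minimal sketch of that cleanup, assuming the same A as in the question (the name table is only illustrative):

```python
import pandas as pd

A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]

# One row per set element; the index repeats the position of the source set.
s = pd.Series(A).explode()

# Count each (position, element) pair.
table = pd.crosstab(s.index, s)

# Optional cleanup: drop the "row_0"/"col_0" labels and number rows from 1.
table.index.name = None
table.columns.name = None
table.index = table.index + 1

print(table)
```

Note that explode does not guarantee the order of elements taken from a set, but the counts in the final table do not depend on that order.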

Option 2: get_dummies on the exploded Series:

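# sum(level=0) was removed in pandas 2.0; grouping by the index level works across versions
pd.get_dummies(pd.Series(A).explode()).groupby(level=0).sum()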

Output:

   a  b  c  d  e
0  1  0  1  0  0
1  1  1  0  0  0
2  0  1  0  1  0
3  0  0  0  0  1
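
As a quick sanity check, the two options can be run side by side on the example data. This is only a sketch: the names via_crosstab and via_dummies are made up for illustration, and the get_dummies variant aggregates with groupby(level=0).sum(), which I believe behaves the same on older and current pandas releases.

```python
import pandas as pd

A = [{'a', 'c'}, {'a', 'b'}, {'b', 'd'}, {'e'}]

# Shared first step: one row per set element, index = position in A.
s = pd.Series(A).explode()

# Option 1: count (position, element) pairs.
via_crosstab = pd.crosstab(s.index, s)

# Option 2: one indicator column per element, then add up the rows
# that came from the same set. astype(int) normalizes the dtype,
# since get_dummies returns bool or uint8 depending on the pandas version.
via_dummies = pd.get_dummies(s).groupby(level=0).sum().astype(int)

# Both should hold the same 0/1 values with the same column order.
same_columns = list(via_crosstab.columns) == list(via_dummies.columns)
same_values = via_crosstab.values.tolist() == via_dummies.values.tolist()
print(same_columns, same_values)
```

The only visible difference between the two results is the axis labels (row_0 and col_0) that crosstab adds.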