pandas - Count occurrences of values in a DataFrame for each unique value in another column

Question

Sco*_*tin 3 python pivot-table dataframe pandas

Suppose I have a DataFrame:

    term      score
0   this          0
1   that          1
2   the other     3
3   something     2
4   anything      1
5   the other     2
6   that          2
7   this          0
8   something     1


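How can I count the occurrences of each value in the term column, broken down by the unique values in the score column? The result should look like this: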

    term      score 0     score 1     score 2     score 3
0   this            2           0           0           0
1   that            0           1           1           0
2   the other       0           0           1           1
3   something       0           1           1           0
4   anything        0           1           0           0


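Related questions I have read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to do what I want. pivot_table, as mentioned in this question, looks like it might be relevant, but I am held back by my lack of experience and the terseness of the pandas documentation. Thanks for any suggestions.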

Answer 1

jez*_*ael 6

Use groupby with size, reshape with unstack, and finish with add_prefix:

df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')

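To make the chained call easier to follow, here is a minimal sketch that rebuilds the question's sample data and runs the groupby version one step at a time. The intermediate names pair_counts, wide, and counts are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# Step 1: count rows for each (term, score) pair.
# The result is a Series indexed by a (term, score) MultiIndex.
pair_counts = df.groupby(["term", "score"]).size()

# Step 2: move the 'score' level into the columns.
# Pairs that never occur are filled with 0 instead of NaN.
wide = pair_counts.unstack(fill_value=0)

# Step 3: rename the columns 0, 1, 2, 3 to 'score 0', 'score 1', ...
counts = wide.add_prefix("score ")

print(counts)
```

The result is indexed by term, so a single count can be read with counts.loc["this", "score 0"].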

Or use crosstab:

df = pd.crosstab(df['term'],df['score']).add_prefix('score ')


Or pivot_table:

df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
        .add_prefix('score '))

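As a sanity check, the sketch below (assuming the same sample data as in the question) builds the table with groupby, crosstab, and pivot_table and compares the results. It compares labels and values rather than dtypes, since the integer dtype may differ between pandas versions. The helper same_table and the variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

by_groupby = df.groupby(["term", "score"]).size().unstack(fill_value=0)
by_crosstab = pd.crosstab(df["term"], df["score"])
by_pivot = df.pivot_table(index="term", columns="score",
                          aggfunc="size", fill_value=0)


def same_table(a, b):
    """Return True if two frames share row labels, column labels, and values."""
    return (
        list(a.index) == list(b.index)
        and list(a.columns) == list(b.columns)
        and bool((a.to_numpy() == b.to_numpy()).all())
    )


same_as_crosstab = same_table(by_groupby, by_crosstab)
same_as_pivot = same_table(by_groupby, by_pivot)
print(same_as_crosstab, same_as_pivot)
```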

print (df)
score      score 0  score 1  score 2  score 3
term                                         
anything         0        1        0        0
something        0        1        1        0
that             0        1        1        0
the other        0        0        1        1
this             2        0        0        0

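The printed result is sorted alphabetically by term and keeps term as the index, whereas the table requested in the question lists the terms in order of first appearance with term as an ordinary column. A minimal sketch of one way to get that layout, assuming the question's sample data (the reindex and rename_axis steps are an addition here, not part of the original answer):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

counts = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

result = (
    counts
    .reindex(df["term"].unique())            # rows in order of first appearance
    .rename_axis(index="term", columns=None)  # keep 'term', drop the 'score' axis label
    .reset_index()                            # turn the 'term' index into a column
)

print(result)
```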

I was about to post crosstab, and then you edited. Then I thought, pivot! And you edited again :D. Great answer. (3 upvotes)

Answer 2

Sco*_*ton 6

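You can also use get_dummies and set_index, then sum within each index level. On pandas 2.0 and later, DataFrame.sum no longer accepts the level parameter, so the per-level sum is written as groupby(level=0).sum():

(pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
   .groupby(level=0, sort=False)  # per-term sum; sort=False keeps first-appearance order
   .sum()
   .reset_index())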


Output:

        term  score 0  score 1  score 2  score 3
0       this        2        0        0        0
1       that        0        1        1        0
2  the other        0        0        1        1
3  something        0        1        1        0
4   anything        0        1        0        0

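It can help to look at the intermediate one-hot frame that get_dummies produces before the rows are summed per term. The sketch below assumes the question's sample data. Passing dtype=int is an assumption added here so that the indicator columns are integers on every pandas version, and the per-term sum uses groupby(level=0, sort=False).sum(), which also runs on pandas 2.0 and later.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One row per original row, one 0/1 indicator column per distinct score.
one_hot = pd.get_dummies(df.set_index("term"), columns=["score"],
                         prefix_sep=" ", dtype=int)
print(one_hot.head(3))

# Summing the indicator rows that share a term gives the counts.
result = one_hot.groupby(level=0, sort=False).sum().reset_index()
print(result)
```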
