Get the percentage of rows (strings) that meet a specific condition in a pandas DataFrame

Question

Bad*_*ing 5 python pandas pandas-groupby

I have this DataFrame:

import pandas as pd

df = pd.DataFrame({"A": ["Used", "Not used", "Not used", "Not used", "Used",
                         "Not used", "Used", "Used", "Used", "Not used"],
                   "B": ["Used", "Used", "Used", "Not used", "Not used",
                         "Used", "Not used", "Not used", "Used", "Not used"]})

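I want to find the fastest, cleanest way to work out the following:

The percentage of all rows in which A is used.
The percentage of all rows in which B is used.
The percentage of all rows in which both A and B are used.

I am new to Python and pandas (and to coding in general), so I am sure this is simple, but any guidance would be appreciated. I have tried groupby().aggregate(sum), but I don't get the result I need (I assume because these are strings rather than integers).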

Answer 1

jez*_*ael 9

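If you need the percentage of every value, use value_counts with normalize=True. For multiple columns, use groupby with size to get the count of each pair of values, then divide by the length of df (the same as the length of its index):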

print (100 * df['A'].value_counts(normalize=True))
Not used    50.0
Used        50.0
Name: A, dtype: float64

print (100 * df['B'].value_counts(normalize=True))
Not used    50.0
Used        50.0
Name: B, dtype: float64

print (100 * df.groupby(['A','B']).size() / len(df.index))
A         B       
Not used  Not used    20.0
          Used        30.0
Used      Not used    30.0
          Used        20.0
dtype: float64
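
As a possible alternative to the groupby/size approach above, the same joint percentages can be laid out as a two-way table with pd.crosstab and normalize="all". This is only a sketch on the question's data; the variable name joint is an assumption made for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"A": ["Used", "Not used", "Not used", "Not used", "Used",
                         "Not used", "Used", "Used", "Used", "Not used"],
                   "B": ["Used", "Used", "Used", "Not used", "Not used",
                         "Used", "Not used", "Not used", "Used", "Not used"]})

# normalize="all" divides every cell count by the total number of rows,
# so multiplying by 100 gives each (A, B) pair as a percentage of all rows.
joint = 100 * pd.crosstab(df["A"], df["B"], normalize="all")
print(joint)

# A single combination can then be looked up by label.
print(joint.loc["Used", "Used"])
```

The table form may be easier to read than the MultiIndex Series when there are only two columns; for more columns the groupby version generalizes more directly.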

If you need to filter on a specific value, create a boolean mask and take its mean. True values are treated as 1:

print (100 * df['A'].eq('Used').mean())
#alternative
#print (100 * (df['A'] == 'Used').mean())
50.0

print (100 * df['B'].eq('Used').mean())
#alternative
#print (100 * (df['B'] == 'Used').mean())
50.0

print (100 * (df['A'].eq('Used') & df['B'].eq('Used')).mean())
20.0
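
The mask-and-mean idea above can also be applied to the whole DataFrame at once. The following is a minimal sketch, assuming every column of interest holds the same "Used" / "Not used" labels; the names used, pct_each and pct_both are assumptions introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["Used", "Not used", "Not used", "Not used", "Used",
                         "Not used", "Used", "Used", "Used", "Not used"],
                   "B": ["Used", "Used", "Used", "Not used", "Not used",
                         "Used", "Not used", "Not used", "Used", "Not used"]})

# Boolean DataFrame: True wherever a cell equals "Used".
used = df.eq("Used")

# Column-wise mean of booleans = fraction of True values per column.
pct_each = 100 * used.mean()

# all(axis=1) is True only for rows where every column is "Used".
pct_both = 100 * used.all(axis=1).mean()

print(pct_each)
print(pct_both)
```

Building the mask once avoids repeating the comparison for each column, which may matter if more columns are added later.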

Answer 2

Zer*_*ero 5

Use

1) A used

In [4929]: 100.*df.A.eq('Used').sum()/df.shape[0]
Out[4929]: 50.0

2) B used

In [4930]: 100.*df.B.eq('Used').sum()/df.shape[0]
Out[4930]: 50.0

3) Both A and B used

In [4931]: 100.*(df.B.eq('Used') & df.A.eq('Used')).sum()/df.shape[0]
Out[4931]: 20.0

1) is the same as

In [4933]: 100.*(df['A'] == 'Used').sum()/len(df.index)
Out[4933]: 50.0
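
Dividing the sum of a boolean mask by the number of rows gives the same figure as taking the mask's mean. A small sketch of that equivalence follows; percent_true is a hypothetical helper named here for illustration only and does not come from pandas or from the answers.

```python
import pandas as pd

df = pd.DataFrame({"A": ["Used", "Not used", "Not used", "Not used", "Used",
                         "Not used", "Used", "Used", "Used", "Not used"],
                   "B": ["Used", "Used", "Used", "Not used", "Not used",
                         "Used", "Not used", "Not used", "Used", "Not used"]})


def percent_true(mask: pd.Series) -> float:
    """Hypothetical helper: percentage of True values in a boolean Series."""
    # sum() counts the True values; len() is the total number of rows.
    return 100.0 * mask.sum() / len(mask)


both = df["A"].eq("Used") & df["B"].eq("Used")

via_sum = percent_true(both)     # count of True rows divided by row count
via_mean = 100.0 * both.mean()   # mean of booleans, the same quantity

print(via_sum, via_mean)
```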

Archived:	8 years, 1 month ago
Views:	2262
Last activity:	8 years, 1 month ago