aid*_*att 3 python replace pandas
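I have a DataFrame with many rows, and some values occur only rarely. I need to count values column by column and then replace any value whose frequency is less than 3.
DF-Input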
Col1 Col2 Col3 Col4
1 apple tomato apple
1 apple potato nan
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 grape tomato banana
1 pear tomato banana
1 lemon tomato burger
DF-Output
Col1 Col2 Col3 Col4
1 apple tomato Other
1 apple Other nan
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 Other tomato banana
1 Other tomato banana
1 Other tomato Other
You can use where with value_counts:
df.where(df.apply(lambda x: x.groupby(x).transform('count')>2), 'Other')
Output:
Col2 Col3 Col4
Col1
1 apple tomato Other
1 apple Other banana
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 Other tomato banana
1 Other tomato banana
1 Other tomato Other
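
The text above names value_counts, while the snippet counts with groupby and transform. As a rough sketch of the value_counts route, assuming the question's sample data rebuilt by hand (the names counts and result are illustrative and do not come from the answer):

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (Col1 kept as an ordinary column).
df = pd.DataFrame({
    "Col1": [1] * 9,
    "Col2": ["apple"] * 6 + ["grape", "pear", "lemon"],
    "Col3": ["tomato", "potato"] + ["tomato"] * 7,
    "Col4": ["apple", np.nan] + ["banana"] * 6 + ["burger"],
})

# For every cell, look up how often its value occurs in its own column.
# value_counts() skips NaN by default, so a NaN cell maps to a NaN count.
counts = df.apply(lambda col: col.map(col.value_counts()))

# Keep cells whose value occurs at least 3 times; the rest become 'Other'.
# A NaN count compares as False, so missing cells are replaced here as well.
result = df.where(counts >= 3, "Other")
print(result)
```

Note that in this sketch a missing value is treated like any other rare value and ends up as 'Other'.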
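# Per-column count of each value; NaN entries are expected to get a NaN count.
d = df.apply(lambda x: x.groupby(x).transform('count'))
# Keep values seen more than twice. NaN counts turn into True after astype(bool), so NaN cells are left as NaN.
df.where(d.gt(2.0).where(d.notnull()).astype(bool), 'Other')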
Output:
Col2 Col3 Col4
Col1
1 apple tomato Other
1 apple Other NaN
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 apple tomato banana
1 Other tomato banana
1 Other tomato banana
1 Other tomato Other
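
The NaN-preserving idea can also be wrapped in a small helper. This is only a sketch under assumptions: replace_rare, min_count and other are hypothetical names introduced for illustration, and it counts with value_counts and map rather than the groupby call used above.

```python
import numpy as np
import pandas as pd


def replace_rare(df, min_count=3, other="Other"):
    """Replace values seen fewer than min_count times in their column; keep NaN."""
    # Per-column frequency of each cell's value (NaN cells get a NaN count).
    counts = df.apply(lambda col: col.map(col.value_counts()))
    # Keep frequent values, and explicitly keep missing cells untouched.
    keep = counts.ge(min_count) | df.isna()
    return df.where(keep, other)


# Two columns from the question's sample data, including the NaN cell.
sample = pd.DataFrame({
    "Col2": ["apple"] * 6 + ["grape", "pear", "lemon"],
    "Col4": ["apple", np.nan] + ["banana"] * 6 + ["burger"],
})

out = replace_rare(sample)
print(out)
```

Making the NaN rule explicit with isna() avoids relying on how a boolean cast treats missing counts, at the cost of one extra pass over the frame.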