I have a df:
population plot1 plot2 plot3 plot4
0 Population1 Species1 Species1 Species2 Species2
1 Population2 Species4 Species2 Species3 Species4
2 Population3 Species1 Species2 Species1 Species2
3 Population4 Species4 Species4 Species4 Species4
4 Population5 Species2 Species2 Species4 Species2
5 Population6 Species4 Species3 Species3 Species4
6 Population7 Species3 Species4 Species1 Species3
7 Population8 Species4 Species4 Species4 Species4
8 Population9 Species3 Species4 Species2 Species3
9 Population10 Species1 Species3 Species2 Species4
10 Population11 Species2 Species4 Species2 Species4
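I want to create a new DataFrame in which every row (population) where Species4 appears more than once is removed. I have tried several approaches using .value_counts(), but I can't find a way to apply it to the whole DataFrame at once rather than simply looping over all the rows (which takes a long time on the large dataset I have).
So, I tried: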
dat.drop(dat.value_counts()['Species4'] > 1)
But .value_counts() cannot be applied to the whole df.
Use pandas.DataFrame.eq:
new_df = df[df.eq('Species4').sum(1).le(1)]
# or
new_df = df[~df.eq('Species4').sum(1).gt(1)]
print(new_df)
Output:
population plot1 plot2 plot3 plot4
0 Population1 Species1 Species1 Species2 Species2
2 Population3 Species1 Species2 Species1 Species2
4 Population5 Species2 Species2 Species4 Species2
6 Population7 Species3 Species4 Species1 Species3
8 Population9 Species3 Species4 Species2 Species3
9 Population10 Species1 Species3 Species2 Species4
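
To make the one-liner easier to follow, here is a minimal sketch that breaks it into its intermediate steps. It rebuilds only the first five rows of the question's df as sample data, and it assumes that only the columns whose names start with "plot" should be searched; the names plot_cols, is_sp4, counts and keep are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data: the first five rows of the df from the question.
df = pd.DataFrame({
    "population": ["Population1", "Population2", "Population3",
                   "Population4", "Population5"],
    "plot1": ["Species1", "Species4", "Species1", "Species4", "Species2"],
    "plot2": ["Species1", "Species2", "Species2", "Species4", "Species2"],
    "plot3": ["Species2", "Species3", "Species1", "Species4", "Species4"],
    "plot4": ["Species2", "Species4", "Species2", "Species4", "Species2"],
})

# Assumption: restrict the search to the plot* columns so that a value in
# "population" can never be counted by accident.
plot_cols = df.columns[df.columns.str.startswith("plot")]

# Step 1: element-wise comparison gives a boolean frame (True where Species4).
is_sp4 = df[plot_cols].eq("Species4")

# Step 2: summing booleans across columns counts Species4 per row.
counts = is_sp4.sum(axis=1)

# Step 3: keep rows where Species4 appears at most once.
keep = counts.le(1)
new_df = df[keep]

print(new_df)
```

All three steps are vectorized, so no Python-level loop over rows is needed, which is what the question was trying to avoid on a large dataset.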