Joh*_*eer 2 python duplicates dataframe pandas
I have the following DataFrame in Python pandas:
source target value type
0 10 1200 0.500 Undirected
1 13 3333 0.600 Undirected
2 10 1200 0.500 Undirected
3 15 2300 0.350 Undirected
4 18 5300 0.250 Undirected
5 17 2300 0.100 Undirected
6 13 3333 0.600 Undirected
from io import StringIO  # Python 3; the Python 2 `StringIO` module no longer exists
import pandas as pd
text=""" source target value type
0 10 1200 0.500 Undirected
1 13 3333 0.600 Undirected
2 10 1200 0.500 Undirected
3 15 2300 0.350 Undirected
4 18 5300 0.250 Undirected
5 17 2300 0.100 Undirected
6 13 3333 0.600 Undirected"""
# sep=r"\s+" splits on runs of whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv(StringIO(text), sep=r"\s+", index_col=[0])
print(df[df.duplicated()])
source target value type
2 10 1200 0.5 Undirected
6 13 3333 0.6 Undirected
print(df.drop_duplicates(keep=False))
source target value type
3 15 2300 0.35 Undirected
4 18 5300 0.25 Undirected
5 17 2300 0.10 Undirected
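df.duplicated() returns a boolean mask that marks rows duplicating an earlier row
df.drop_duplicates() removes duplicate rows
keep=False drops every row that has a duplicate, rather than keeping the first or last occurrence. pandas drop duplicates: documentation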
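To make the effect of `keep` concrete, here is a minimal sketch that compares its three settings on the same data as above. The frame is built directly with `pd.DataFrame` instead of being parsed from text, and the variable names (`first_kept`, `last_kept`, `none_kept`, `all_copies`) are illustrative only.

```python
import pandas as pd

# Same rows as the question's DataFrame, built directly (default index 0..6).
df = pd.DataFrame(
    {
        "source": [10, 13, 10, 15, 18, 17, 13],
        "target": [1200, 3333, 1200, 2300, 5300, 2300, 3333],
        "value": [0.5, 0.6, 0.5, 0.35, 0.25, 0.1, 0.6],
        "type": ["Undirected"] * 7,
    }
)

# keep="first" is the default: the first occurrence of each duplicated row survives.
first_kept = df.drop_duplicates()

# keep="last": the last occurrence of each duplicated row survives.
last_kept = df.drop_duplicates(keep="last")

# keep=False: every row that has a duplicate is dropped.
none_kept = df.drop_duplicates(keep=False)

# duplicated(keep=False) flags all copies, which is handy for inspecting them.
all_copies = df[df.duplicated(keep=False)]

print(first_kept.index.tolist())  # [0, 1, 3, 4, 5]
print(last_kept.index.tolist())   # [2, 3, 4, 5, 6]
print(none_kept.index.tolist())   # [3, 4, 5]
print(all_copies.index.tolist())  # [0, 1, 2, 6]
```

Note that `duplicated()` takes the same `keep` argument, so `df[~df.duplicated(keep=False)]` should give the same rows as `df.drop_duplicates(keep=False)`.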