How to remove pair duplicates in pandas?

Question

Nab*_*zir 5 python duplicates dataframe pandas

I have a dataset that contains paired duplicates. Here is my data:

Id    antecedent           descendant
1     one                  two
2     two                  one
3     two                  three
4     one                  three
5     three                two

This is the result I need. Because the pair one, two is equivalent to two, one, I want to drop the duplicated pairs:

Id    antecedent           descendant
1     one                  two
3     two                  three
4     one                  three

Answer 1

jez*_*ael 5

Sort the values within each row with numpy.sort, then use duplicated to build a boolean mask:

# requires: import numpy as np; import pandas as pd
df1 = pd.DataFrame(np.sort(df[['antecedent','descendant']], axis=1))

Or:

# slower solution: treat each pair as an unordered set
# df1 = df[['antecedent','descendant']].apply(frozenset, axis=1)

# keep only the first occurrence of each unordered pair
df = df[~df1.duplicated()]
print(df)
   Id antecedent descendant
0   1        one        two
2   3        two      three
3   4        one      three
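
For reference, the np.sort approach can be put together as one self-contained script. This is only a sketch built on the sample data from the question. The helper name drop_pair_duplicates and its cols parameter are made up for illustration, and passing index=frame.index is an added precaution (not part of the answer) so the mask still lines up when the frame does not have a default index.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "antecedent": ["one", "two", "two", "one", "three"],
    "descendant": ["two", "one", "three", "three", "two"],
})


def drop_pair_duplicates(frame, cols):
    """Hypothetical helper: drop rows whose unordered values in `cols` repeat."""
    # Sort the values within each row so (a, b) and (b, a) become identical.
    sorted_pairs = pd.DataFrame(
        np.sort(frame[cols].to_numpy(), axis=1),
        index=frame.index,  # keep the original index so the mask aligns
    )
    # duplicated() flags every repeat after the first occurrence.
    return frame[~sorted_pairs.duplicated()]


result = drop_pair_duplicates(df, ["antecedent", "descendant"])
print(result)
```

The rows with Id 1, 3 and 4 remain, and their original index labels (0, 2, 3) are preserved. Chain .reset_index(drop=True) if a fresh index is wanted.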
