Removing duplicate values within pandas records

Bob*_*itz 2 python duplicates distinct-values dataframe

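I want to remove the duplicates within each row of the column animals.

I need something like this post, but in Python. For some reason I can't work this out right now and have hit a roadblock.

Removing duplicate records in a dataframe

I've tried drop_duplicates, unique, nunique, etc. No luck.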

df.drop_duplicates(subset=None, keep="first", inplace=False)
df


import pandas as pd

df = pd.DataFrame({'animals':['pink pig, pink pig, pink pig','brown cow, brown cow','pink pig, black cow','brown horse, pink pig, brown cow, black cow, brown cow']})

#input:
    animals
0   pink pig, pink pig, pink pig
1   brown cow, brown cow
2   pink pig, black cow
3   brown horse, pink pig, brown cow, black cow, brown cow

#I would like the output to look like this:
    animals
0   pink pig
1   brown cow
2   pink pig, black cow
3   brown horse, pink pig, brown cow, black cow


Jua*_*n C 7

Do this:

import pandas as pd

df = pd.DataFrame({'animals':['pink pig, pink pig, pink pig','brown cow, brown cow','pink pig, black cow','brown horse, pink pig, brown cow, black cow, brown cow']})


df['animals2'] = df.animals.apply(lambda x: ', '.join(list(set(x.split(', ')))))

Output:

0                                       pink pig
1                                      brown cow
2                            pink pig, black cow
3    brown cow, brown horse, pink pig, black cow

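Explanation:

I turn your string into a list. Then I turn the list into a set, which removes the duplicates. Then I turn the set back into a list and join the list into a string. Let me know if anything is unclear!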
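One caveat: a set does not keep the original order, which is why row 3 in the output above comes out in a different order than the input. If the order matters, here is a minimal sketch of an order-preserving variant. The helper name `dedupe_keep_order` and the column name `animals2` are just illustrative choices, and it assumes Python 3.7+ (where dicts keep insertion order) and that items are always separated by `', '`.

```python
import pandas as pd

df = pd.DataFrame({'animals': [
    'pink pig, pink pig, pink pig',
    'brown cow, brown cow',
    'pink pig, black cow',
    'brown horse, pink pig, brown cow, black cow, brown cow',
]})

def dedupe_keep_order(cell, sep=', '):
    # Hypothetical helper (not part of pandas): split the cell into items,
    # then use dict.fromkeys to keep only the first occurrence of each item
    # in its original position, and join the items back into one string.
    return sep.join(dict.fromkeys(cell.split(sep)))

df['animals2'] = df['animals'].apply(dedupe_keep_order)
print(df['animals2'].tolist())
# ['pink pig', 'brown cow', 'pink pig, black cow',
#  'brown horse, pink pig, brown cow, black cow']
```

This should match the desired output in the question, including the order in row 3. If the cells may contain inconsistent spacing around the commas, splitting on `','` and calling `strip()` on each item first would probably be safer.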