Bob*_*itz 2 python duplicates distinct-values dataframe
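I want to remove the duplicate items within each row of the animals column.
I need something like this post, but in Python. For some reason I can't work this out right now and I've hit a wall.
I've tried drop_duplicates, unique, nunique, and so on. No luck.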
df.drop_duplicates(subset=None, keep="first", inplace=False)

import pandas as pd
df = pd.DataFrame({'animals': ['pink pig, pink pig, pink pig', 'brown cow, brown cow', 'pink pig, black cow', 'brown horse, pink pig, brown cow, black cow, brown cow']})
#input:
animals
0 pink pig, pink pig, pink pig
1 brown cow, brown cow
2 pink pig, black cow
3 brown horse, pink pig, brown cow, black cow, brown cow
#I would like the output to look like this:
animals
0 pink pig
1 brown cow
2 pink pig, black cow
3 brown horse, pink pig, brown cow, black cow
Do this:
import pandas as pd
df = pd.DataFrame({'animals': ['pink pig, pink pig, pink pig', 'brown cow, brown cow', 'pink pig, black cow', 'brown horse, pink pig, brown cow, black cow, brown cow']})
# split each string into items, drop repeats with a set, then join back into one string
df['animals2'] = df.animals.apply(lambda x: ', '.join(list(set(x.split(', ')))))
Output (the order of items within a row may vary, because sets are unordered):
0 pink pig
1 brown cow
2 pink pig, black cow
3 brown cow, brown horse, pink pig, black cow
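Explanation:
I split each of your strings into a list. Then I turn the list into a set to remove the duplicates. Then I turn the set back into a list and join the list into a string. Let me know if that's unclear!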
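Because a set does not keep insertion order, row 3 above comes back in a different order than the question asked for. A minimal order-preserving sketch follows; it swaps the set for dict.fromkeys, which is not what the answer above used, and the helper name dedupe_items and its sep parameter are hypothetical names chosen for illustration. It assumes the items are always separated by exactly ', '.

```python
def dedupe_items(text, sep=", "):
    """Drop repeated items from a delimited string, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so the first occurrence of each item decides its position.
    return sep.join(dict.fromkeys(text.split(sep)))


# Same strings as the 'animals' column in the question.
rows = [
    'pink pig, pink pig, pink pig',
    'brown cow, brown cow',
    'pink pig, black cow',
    'brown horse, pink pig, brown cow, black cow, brown cow',
]

cleaned = [dedupe_items(row) for row in rows]
for row in cleaned:
    print(row)
```

With the DataFrame from the question, the same helper could be applied as df['animals'] = df['animals'].apply(dedupe_items).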