Ava*_*jee 1 python dataframe pandas
This is what I have:
import pandas as pd

list1_ = [("1", "a", "a1"), ("1", "b", "b1"), ("1", "c", "c"), ("2", "a", "a2")]
df1 = pd.DataFrame(list1_, columns=["user", "col1", "col2"])
list2_ = [("1", "b", "b2"), ("1", "a", "a2"), ("2", "a", "a3"), ("1", "c", "c2")]
df2 = pd.DataFrame(list2_, columns=["user", "col1", "col3"])
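What I want to do is match each (user, col1) pair in df2 against df1 and add col3 to df1, so that rows with the same key values end up in a single frame with columns (user, col1, col2, col3). The final result should look like this: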
list3_ = [("1","a","a1","a2"),("1","b","b1","b2"),("1","c","c","c2"),
("2","a","a2","a3")]
df3 = pd.DataFrame(list3_,columns = ["user","col1","col2","col3"])
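Please note: I read df1 from a CSV file and then create df2 from list2_. So I have some data in the form of list2_, but nothing in the form of list1_. I would therefore like to use only df1, list2_, and/or df2.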
Use pd.merge:
df1.merge(df2, on = ['user','col1'])
  user col1 col2 col3
0    1    a   a1   a2
1    1    b   b1   b2
2    1    c    c   c2
3    2    a   a2   a3
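Since df1 is read from a CSV file in the real setup, the user column may come back as integers while list2_ holds strings, and the merge keys would then not line up. The sketch below assumes that situation and casts both key columns to str before merging. It also uses how="left", which is a swapped-in variation on the merge above: it keeps every row of df1 even when df2 has no match. The name result is only illustrative.

```python
import pandas as pd

list1_ = [("1", "a", "a1"), ("1", "b", "b1"), ("1", "c", "c"), ("2", "a", "a2")]
df1 = pd.DataFrame(list1_, columns=["user", "col1", "col2"])
list2_ = [("1", "b", "b2"), ("1", "a", "a2"), ("2", "a", "a3"), ("1", "c", "c2")]
df2 = pd.DataFrame(list2_, columns=["user", "col1", "col3"])

# Assumption: df1 may have been parsed from CSV with an integer "user" column.
# Casting both key columns to str keeps the merge keys comparable.
df1["user"] = df1["user"].astype(str)
df2["user"] = df2["user"].astype(str)

# how="left" keeps all rows of df1 in their original order; unmatched rows
# would get NaN in col3. validate="many_to_one" raises an error if df2
# contains duplicate (user, col1) pairs.
result = df1.merge(df2, on=["user", "col1"], how="left", validate="many_to_one")
print(result)
```

With the sample data every key in df1 has a match in df2, so the left merge gives the same rows as the default inner merge. The difference only shows up when df2 is missing some (user, col1) pairs.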