Mar*_*k K 2 python dataframe pandas
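I have a table like the one built below, and I want to create a new table from it, using the values in the "Color" column as new columns.
Here is what I tried: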
import pandas as pd
import functools
data = {'Seller': ["Mike","Mike","Mike","Mike","David","David","Pete","Pete","Pete"],
'Code' : ["9QBR1","9QBR1","9QBW2","9QBW2","9QD1X","9QD1X","9QEBO","9QEBO","9QEBO"],
'From': ["2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03"],
'Color_date' : ["2020-02-14","2020-02-14","2020-05-18","2020-05-18","2020-01-04","2020-01-04","2020-03-04","2020-03-13","2020-01-28"],
'Color' : ["Blue","Red","Red","Grey","Red","Grey","Blue","Orange","Red"],
'Delivery' : ["Nancy","Nancy","Kate","Kate","Lilly","Lilly","John","John","John"]}
df = pd.DataFrame(data)
df_1 = df.set_index([df.index, 'Color'])['Color_date'].unstack()
df_1['Code'] = df['Code']
final_df = functools.reduce(lambda left,right: pd.merge(left,right,on='Code'), [df, df_1])
"df_1" looks fine, but "final_df" has far more rows than expected.
What went wrong, and how can I correct it? Thank you.
Use DataFrame.join, and pass append=True to DataFrame.set_index so that the 'Color' column is added to the existing index as a new level:
df_1 = df.join(df.set_index('Color', append=True)['Color_date'].unstack())
print (df_1)
Seller Code From Color_date Color Delivery Blue \
0 Mike 9QBR1 2020-01-03 2020-02-14 Blue Nancy 2020-02-14
1 Mike 9QBR1 2020-01-03 2020-02-14 Red Nancy NaN
2 Mike 9QBW2 2020-01-03 2020-05-18 Red Kate NaN
3 Mike 9QBW2 2020-01-03 2020-05-18 Grey Kate NaN
4 David 9QD1X 2020-01-03 2020-01-04 Red Lilly NaN
5 David 9QD1X 2020-01-03 2020-01-04 Grey Lilly NaN
6 Pete 9QEBO 2020-01-03 2020-03-04 Blue John 2020-03-04
7 Pete 9QEBO 2020-01-03 2020-03-13 Orange John NaN
8 Pete 9QEBO 2020-01-03 2020-01-28 Red John NaN
Grey Orange Red
0 NaN NaN NaN
1 NaN NaN 2020-02-14
2 NaN NaN 2020-05-18
3 2020-05-18 NaN NaN
4 NaN NaN 2020-01-04
5 2020-01-04 NaN NaN
6 NaN NaN NaN
7 NaN 2020-03-13 NaN
8 NaN NaN 2020-01-28
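As for why "final_df" grows: 'Code' is not unique in either frame, so merging on it pairs every left row with every right row that shares the same code, while join aligns on the row index and keeps one row per original row. Here is a minimal sketch of that difference, assuming a trimmed copy of the question's data with only the three columns needed; the names wide, right, merged and joined are illustrative and not from the original post.

```python
import pandas as pd

# Trimmed copy of the question's data: only the columns needed for the demo.
df = pd.DataFrame({
    'Code': ["9QBR1", "9QBR1", "9QBW2", "9QBW2", "9QD1X", "9QD1X",
             "9QEBO", "9QEBO", "9QEBO"],
    'Color_date': ["2020-02-14", "2020-02-14", "2020-05-18", "2020-05-18",
                   "2020-01-04", "2020-01-04", "2020-03-04", "2020-03-13",
                   "2020-01-28"],
    'Color': ["Blue", "Red", "Red", "Grey", "Red", "Grey",
              "Blue", "Orange", "Red"],
})

# One column per color, still indexed by the original row number.
wide = df.set_index('Color', append=True)['Color_date'].unstack()

# The question's approach: copy 'Code' over and merge on it.
# Each code occurs 2 or 3 times on both sides, so rows multiply per code.
right = wide.assign(Code=df['Code'])
merged = pd.merge(df, right, on='Code')

# The answer's approach: align on the row index instead.
joined = df.join(wide)

print(len(df), len(merged), len(joined))  # 9 21 9
```

The 21 comes from 2*2 + 2*2 + 2*2 + 3*3, one product per code. If a merge on a key is really what is wanted, the key has to identify rows uniquely on at least one side.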