Rog*_*ger 5 python merge dataframe pandas
I want to merge two DataFrames:
DF1:
                                              cik0        cik1  cik2
'MKTG, INC.'                            0001019056        None  None
1 800 FLOWERS COM INC                   0001104659  0001437749  None
11 GOOD ENERGY INC                      0000930413        None  None
1347 CAPITAL CORP                       0001144204        None  None
1347 PROPERTY INSURANCE HOLDINGS, INC.  0001387131        None  None
DF2:
          cik  Ticker
0  0001144204    AABB
1  0001019056       A
2  0001387131    AABC
3  0001437749      AA
4  0000930413   AAACU
Expected result:
                                              cik0        cik1  cik2  ticker
'MKTG, INC.'                            0001019056        None  None       A
1 800 FLOWERS COM INC                   0001104659  0001437749  None      AA
11 GOOD ENERGY INC                      0000930413        None  None   AAACU
1347 CAPITAL CORP                       0001144204        None  None    AABB
1347 PROPERTY INSURANCE HOLDINGS, INC.  0001387131        None  None    AABC
I want to match cik0 against df2['cik']; if that finds no match, I want to try cik1, and so on.
Thanks for your help!
You can use pd.Series.map with fillna a few times:
ticker_map = df2.set_index('cik')['Ticker']
df1['ticker'] = df1['cik0'].map(ticker_map)\
                           .fillna(df1['cik1'].map(ticker_map))\
                           .fillna(df1['cik2'].map(ticker_map))
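To see why the chain works, here is a small sketch on two rows from the question. The way the frames are built is an assumption (CIKs stored as strings, company names as the index), and the names first, second and result are only for illustration. map gives NaN wherever a CIK has no entry in ticker_map, and fillna then fills only those gaps from the next column:

```python
import pandas as pd

# Assumed reconstruction of two rows of the question's data; CIKs kept as strings.
df1 = pd.DataFrame(
    {"cik0": ["0001019056", "0001104659"],
     "cik1": [None, "0001437749"]},
    index=["'MKTG, INC.'", "1 800 FLOWERS COM INC"],
)
df2 = pd.DataFrame({"cik": ["0001019056", "0001437749"],
                    "Ticker": ["A", "AA"]})

# Series indexed by cik, so it can be used as a lookup table by map
ticker_map = df2.set_index("cik")["Ticker"]

first = df1["cik0"].map(ticker_map)    # NaN where cik0 has no match in df2
second = df1["cik1"].map(ticker_map)   # candidate tickers looked up via cik1
result = first.fillna(second)          # only the gaps in `first` are filled
```

Note that ticker_map needs unique cik values for map to work; if df2 could contain duplicates, they would have to be dropped first.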
However, this gets a bit tedious. You can define a function that does the same thing iteratively:
def apply_map_on_cols(df, cols, mapper):
    # Start with the matches found via the first column
    s = df[cols[0]].map(mapper)
    # Fill the remaining gaps from each later column in turn
    for col in cols[1:]:
        s = s.fillna(df[col].map(mapper))
    return s

df1['ticker'] = df1.pipe(apply_map_on_cols,
                         cols=[f'cik{i}' for i in range(3)],
                         mapper=df2.set_index('cik')['Ticker'])
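For reference, here is an end-to-end sketch that runs the helper on the question's sample data. How the frames are constructed is an assumption (string CIKs, company names as the index of df1); the helper calls pipe on df1, the frame being filled.

```python
import pandas as pd

def apply_map_on_cols(df, cols, mapper):
    # Look up the first column, then fill gaps from each later column in order
    s = df[cols[0]].map(mapper)
    for col in cols[1:]:
        s = s.fillna(df[col].map(mapper))
    return s

# Assumed reconstruction of the question's data; CIKs kept as strings.
df1 = pd.DataFrame(
    {
        "cik0": ["0001019056", "0001104659", "0000930413",
                 "0001144204", "0001387131"],
        "cik1": [None, "0001437749", None, None, None],
        "cik2": [None, None, None, None, None],
    },
    index=["'MKTG, INC.'", "1 800 FLOWERS COM INC", "11 GOOD ENERGY INC",
           "1347 CAPITAL CORP", "1347 PROPERTY INSURANCE HOLDINGS, INC."],
)
df2 = pd.DataFrame({
    "cik": ["0001144204", "0001019056", "0001387131",
            "0001437749", "0000930413"],
    "Ticker": ["AABB", "A", "AABC", "AA", "AAACU"],
})

df1["ticker"] = df1.pipe(apply_map_on_cols,
                         cols=[f"cik{i}" for i in range(3)],
                         mapper=df2.set_index("cik")["Ticker"])
```

The second row is the interesting one: its cik0 is not in df2, so its ticker comes from cik1.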