How do I merge two DataFrames with a "wildcard"?

Mat*_*hew 6 python pandas

I have a simple DataFrame like this:

   p     b
0  a   buy
1  b   buy
2  a  sell
3  b  sell

And a lookup table like this:

   p     b    v
0  a   buy  123
1  a  sell  456
2  a     *  888
4  b     *  789

How can I merge (join) the two DataFrames while respecting the "wildcard" in column b? That is, the expected result is:

   p     b    v
0  a   buy  123
1  b   buy  789
2  a  sell  456
3  b  sell  789

The best I could come up with is this, but it is ugly and verbose:

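import pandas as pd

data = pd.DataFrame([
        ['a', 'buy'],
        ['b', 'buy'],
        ['a', 'sell'],
        ['b', 'sell'],
    ], columns=['p', 'b'])
lookup = pd.DataFrame([
        ['a', 'buy', 123],
        ['a', 'sell', 456],
        ['a', '*', 888],
        ['b', '*', 789],
    ], columns=['p', 'b', 'v'])

# Keep the original row labels in an 'index' column so they survive the merges
x = data.reset_index()
# First pass: exact match on both p and b
y1 = pd.merge(x, lookup, on=['p', 'b'], how='left').set_index('index')
# Second pass: for rows still without a value, match on p alone
y2 = pd.merge(x[y1['v'].isnull()], lookup, on=['p'], how='left').set_index('index')
# Fill the gaps left by the first pass with values from the second
data['v'] = y1['v'].fillna(y2['v'])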

Is there a smarter way?

And*_*den 6

I think it's a little cleaner to pull out the wildcards first:

In [11]: wildcards = lookup[lookup["b"] == "*"]

In [12]: wildcards.pop("b")  # ditch the * column, it'll confuse the later merge

Now you can combine the two merges (no set_index needed) with update. Here df is the DataFrame called data in the question:

In [13]: res = df.merge(lookup, how="left")

In [14]: res
Out[14]:
   p     b      v
0  a   buy  123.0
1  b   buy    NaN
2  a  sell  456.0
3  b  sell    NaN

In [15]: res.update(df.merge(wildcards, how="left"), overwrite=False)

In [16]: res
Out[16]:
   p     b      v
0  a   buy  123.0
1  b   buy  789.0
2  a  sell  456.0
3  b  sell  789.0
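
For reference, the two steps above can be put together in one self-contained script. This is only a sketch of the same idea, not the exact code from the answer: the helper name merge_with_wildcard and its wildcard parameter are made up for illustration, and it fills the gaps with fillna instead of update. It assumes that lookup has at most one exact row per (p, b) pair and at most one wildcard row per p, which validate="many_to_one" checks.

```python
import pandas as pd

data = pd.DataFrame(
    [["a", "buy"], ["b", "buy"], ["a", "sell"], ["b", "sell"]],
    columns=["p", "b"],
)
lookup = pd.DataFrame(
    [["a", "buy", 123], ["a", "sell", 456], ["a", "*", 888], ["b", "*", 789]],
    columns=["p", "b", "v"],
)


def merge_with_wildcard(data, lookup, wildcard="*"):
    """Hypothetical helper: exact match on (p, b), falling back to the wildcard row for p."""
    is_wild = lookup["b"] == wildcard
    exact = lookup[~is_wild]
    # Drop the '*' column so the fallback merge joins on p only
    fallback = lookup[is_wild].drop(columns="b")

    # Both left merges keep the rows of `data` in order, so the results line up by position
    res = data.merge(exact, on=["p", "b"], how="left", validate="many_to_one")
    wild = data.merge(fallback, on="p", how="left", validate="many_to_one")

    # Only rows without an exact match take the wildcard value
    res["v"] = res["v"].fillna(wild["v"])
    return res


result = merge_with_wildcard(data, lookup)
print(result)
```

Because the exact match is tried first, the wildcard row for a (888) is never used here: both a rows already have an exact match. The v column comes out as float because the intermediate result contains NaN.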