jon*_*nas 5 python dataframe pandas
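I have a DataFrame containing a trade log. My problem is that I have no ID to match a stock's buys with its sells. A stock can be traded multiple times, and I would like an ID that matches each completed trade. My original DataFrame is a sequential time series with timestamps. The example below illustrates the problem: I need to match and identify the traded stocks in order. A very simple example: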
import pandas as pd

df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A', 'C', 'A', 'A'],
                    'deal': ['buy', 'buy', 'buy', 'sell', 'sell', 'buy', 'sell']})
df1
Out[84]:
  stock  deal
0     A   buy
1     B   buy
2     C   buy
3     A  sell
4     C  sell
5     A   buy
6     A  sell
This is my desired output:
df1 = pd.DataFrame({'stock': ['A', 'B', 'C', 'A', 'C', 'A', 'A'],
                    'deal': ['buy', 'buy', 'buy', 'sell', 'sell', 'buy', 'sell'],
                    'ID': [1, 2, 3, 1, 3, 4, 4]})
df1
Out[82]:
  stock  deal  ID
0     A   buy   1
1     B   buy   2
2     C   buy   3
3     A  sell   1
4     C  sell   3
5     A   buy   4
6     A  sell   4
Any ideas?
Try this:
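# Mark the buy rows
m = df1['deal'] == 'buy'
# Give each buy a running number; sell rows are left as NaN
df1['ID'] = m.cumsum().where(m)
# Within each stock, forward-fill the buy's number onto the following sell
df1['ID'] = df1.groupby('stock')['ID'].ffill()
df1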
Output:
  stock  deal   ID
0     A   buy  1.0
1     B   buy  2.0
2     C   buy  3.0
3     A  sell  1.0
4     C  sell  3.0
5     A   buy  4.0
6     A  sell  4.0
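The IDs come out as floats (1.0, 2.0, ...) because the sell rows hold NaN until they are forward-filled, whereas the desired output shows integers. Here is a minimal sketch of one way to get integer IDs, assuming pandas' nullable `Int64` dtype is acceptable for your data. The names `is_buy` and `buy_id` are only illustrative and not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({
    'stock': ['A', 'B', 'C', 'A', 'C', 'A', 'A'],
    'deal': ['buy', 'buy', 'buy', 'sell', 'sell', 'buy', 'sell'],
})

# True on buy rows, False on sell rows
is_buy = df1['deal'] == 'buy'

# Running count of buys, kept only on the buy rows (sell rows become NaN)
buy_id = is_buy.cumsum().where(is_buy)

# Within each stock, carry the latest buy's number forward onto later sells,
# then cast to the nullable integer dtype so the IDs are not shown as floats
df1['ID'] = buy_id.groupby(df1['stock']).ffill().astype('Int64')

print(df1)
```

With `Int64`, a sell that has no earlier buy for the same stock stays missing (`<NA>`) instead of forcing the whole column back to float. Whether such rows should be dropped or flagged depends on your data.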
Details: