Kyl*_*yle 3 python python-3.x pandas
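I have a DataFrame like the one shown below. Assume these are sales figures for a list of salespeople.
In addition, I have a lookup table that gives the commission rate by sales amount. It looks like the following: $0-$50,000 = 5%, $50,001-$250,000 = 4%, and so on.
What I want to do is apply the lookup table to the sales table to produce the DataFrame below.
Attempt 1: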
In [66]: a
Out[66]:
Sales_1 Sales_2 Sales_3
0 200000 300000 100000
1 100000 500000 500000
2 400000 1000000 200000
In [67]: b
Out[67]:
Commission
Sales
50000 0.05
250000 0.04
750000 0.03
9999999999 0.02
In [68]: c = b['Commission'][a <= b.index.values]
Traceback (most recent call last):
File "<ipython-input-68-d229bce29f01>", line 1, in <module>
c = b['Commission'][a <= b.index.values]
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\ops.py", line 1184, in f
res = self._combine_const(other, func, raise_on_error=False)
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 3555, in _combine_const
raise_on_error=raise_on_error)
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\internals.py", line 2911, in eval
return self.apply('eval', **kwargs)
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\internals.py", line 2890, in apply
applied = getattr(b, f)(**kwargs)
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\internals.py", line 1132, in eval
result = get_result(other)
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\internals.py", line 1103, in get_result
result = func(values, other)
ValueError: operands could not be broadcast together with shapes (3,3) (4,)
Attempt 2:
In [59]: a
Out[59]:
Sales_1 Sales_2 Sales_3
0 200000 300000 100000
1 100000 500000 500000
2 400000 1000000 200000
In [60]: b
Out[60]:
Commission
Sales
50000 0.05
250000 0.04
750000 0.03
9999999999 0.02
In [61]: c = b.lookup(a['Sales_1'],['Commission'])
Traceback (most recent call last):
File "<ipython-input-61-99e8134e826c>", line 1, in <module>
c = b.lookup(a['Sales_1'],['Commission'])
File "C:\WinPython64bit\python-3.5.2.amd64\lib\site-packages\pandas\core\frame.py", line 2649, in lookup
raise ValueError('Row labels must have same size as column labels')
ValueError: Row labels must have same size as column labels
Can anyone help me apply the lookup table to the DataFrame? It does not have to be done exactly this way, but this illustrates my general need.
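To work with ranges, pd.cut is your friend. Starting from your current b DataFrame, you only need to adjust the list of bins passed as the argument so that it also defines the lowest range. Here I used 0, since negative sales do not exist, but you can use any negative number if needed, or even use -np.inf and np.inf as your lower and upper bounds instead of a large sentinel such as 1E14. Note that the snippet accesses b.Sales, so it assumes Sales is a regular column (for example after b.reset_index()) rather than the index shown in the question: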
pd.cut(a.stack(), [0] + b.Sales.tolist(), labels=b.Commission).unstack()
Out[39]:
Sales_1 Sales_2 Sales_3
0 0.04 0.03 0.04
1 0.04 0.03 0.03
2 0.03 0.02 0.04
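For readers who want to run the call above end to end, here is a minimal sketch that rebuilds a and b from the question. The reset_index() step, the .tolist() on the labels, and the cast to float are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sales table from the question.
a = pd.DataFrame({
    "Sales_1": [200000, 100000, 400000],
    "Sales_2": [300000, 500000, 1000000],
    "Sales_3": [100000, 500000, 200000],
})

# Lookup table from the question, with Sales as the index.
b = pd.DataFrame(
    {"Commission": [0.05, 0.04, 0.03, 0.02]},
    index=pd.Index([50000, 250000, 750000, 9999999999], name="Sales"),
)

# Assumption: move Sales out of the index so that b.Sales is a column.
b = b.reset_index()

# Five bin edges define four right-closed intervals: (0, 50000], (50000, 250000], ...
bins = [0] + b.Sales.tolist()
rates = pd.cut(a.stack(), bins, labels=b.Commission.tolist())

# pd.cut returns a categorical Series; cast to float before reshaping back.
c = rates.astype(float).unstack()
print(c)
```

Because the intervals are closed on the right by default, a sale of exactly 50000 falls in the 5% band, which matches the "$0-$50,000 = 5%" rule in the question. A sale of exactly 0 would fall outside the first interval and come back as NaN.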
I find the following form of b clearer to use with cut:
Sales Commission
0 -inf NaN
1 50000 0.05
2 250000 0.04
3 750000 0.03
4 inf 0.02
The call then becomes:
pd.cut(a.stack(), b.Sales, labels=b.Commission[1:]).unstack()
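As a sketch of this variant, the following rebuilds the infinite-bounds version of b and applies the same call. The construction of b, the .tolist() conversions, and the float cast are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Sales table from the question.
a = pd.DataFrame({
    "Sales_1": [200000, 100000, 400000],
    "Sales_2": [300000, 500000, 1000000],
    "Sales_3": [100000, 500000, 200000],
})

# Lookup table with open-ended bounds; the first row only supplies the lower edge.
b = pd.DataFrame({
    "Sales": [-np.inf, 50000, 250000, 750000, np.inf],
    "Commission": [np.nan, 0.05, 0.04, 0.03, 0.02],
})

# Five edges, four labels: skip the NaN placeholder in the first row.
rates = pd.cut(a.stack(), b.Sales.tolist(), labels=b.Commission[1:].tolist())

# Cast the categorical result to float and reshape to the original layout.
c = rates.astype(float).unstack()
print(c)
```

With -np.inf as the lowest edge, zero or negative amounts land in the first band instead of becoming NaN, and np.inf removes the need for an arbitrary upper sentinel.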