Multiplying two Python arrays using a 'lookup'

Car*_*arl 2 python dataframe pandas

import numpy as np
import pandas as pd

columns = ['id', 'A', 'B', 'C']
index = np.arange(3)

df = pd.DataFrame(np.random.randn(3,4), columns=columns, index=index)

weights = {'A': 0.10, 'B': 1.00, 'C': 1.50}

I need to multiply the value in each "cell" by its corresponding weight (excluding the first column). For example:

df.at[0,'A'] * weights['A']
df.at[0,'B'] * weights['B']

What is the most efficient way to do this and get the result in a new DataFrame?

All*_*len 5

Setup

df
Out[1013]: 
         id         A         B         C
0 -0.641314 -0.526509  0.225116 -1.131141
1  0.018321 -0.944734 -0.123334 -0.853356
2  0.703119  0.468857  1.038572 -1.529723

weights
Out[1026]: {'A': 0.1, 'B': 1.0, 'C': 1.5}

W = np.asarray([weights[e] for e in sorted(weights.keys())])

# broadcast W across the rows so each column is scaled by its weight
# (this is element-wise multiplication, not a matrix product)
df.loc[:,['A','B','C']] *= W
df
Out[1016]: 
         id         A         B         C
0 -0.641314 -0.052651  0.225116 -1.696712
1  0.018321 -0.094473 -0.123334 -1.280034
2  0.703119  0.046886  1.038572 -2.294584
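Note that `*=` overwrites `df` in place, whereas the question asks for the result in a new DataFrame. A minimal sketch of a non-mutating variant follows, assuming the same `weights` dict; the names `cols` and `weighted` and the seeded random generator are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption; the
# question uses np.random.randn without a seed).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((3, 4)), columns=["id", "A", "B", "C"])
weights = {"A": 0.10, "B": 1.00, "C": 1.50}

# Same ordering trick as in the answer: sorted keys give the column order.
cols = sorted(weights)
W = np.asarray([weights[c] for c in cols])

# Work on a copy so the original DataFrame stays untouched.
weighted = df.copy()
weighted[cols] = df[cols] * W
```

Here `df` keeps its original values and `weighted` holds the scaled columns, with `id` carried over unchanged.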

Update

If you need to keep the column names flexible, I think a better approach is to store the column names and the weights in two lists:

columns = sorted(weights.keys())
Out[1072]: ['A', 'B', 'C']

weights = [weights[e] for e in columns]
Out[1074]: [0.1, 1.0, 1.5]

Then you can do:

df.loc[:, columns] *= weights

Out[1067]: 
         id         A         B         C
0 -0.641314 -0.052651  0.225116 -1.696712
1  0.018321 -0.094473 -0.123334 -1.280034
2  0.703119  0.046886  1.038572 -2.294584

A one-liner solution:

# here `weights` must still be the original dict, not the list built above
df.loc[:, sorted(weights.keys())] *= [weights[e] for e in sorted(weights.keys())]

df
Out[1089]: 
         id         A         B         C
0 -0.641314 -0.052651  0.225116 -1.696712
1  0.018321 -0.094473 -0.123334 -1.280034
2  0.703119  0.046886  1.038572 -2.294584
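As an alternative to sorting the keys, pandas can align the weights to the columns by label. The sketch below is one possible approach and not part of the original answer: it turns the dict into a Series, fills in a neutral weight of 1.0 for any column without an entry (here `id`), and returns a new DataFrame. The names `w` and `weighted` and the seeded generator are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((3, 4)), columns=["id", "A", "B", "C"])
weights = {"A": 0.10, "B": 1.00, "C": 1.50}

# Align the weights to the DataFrame's columns by label; columns that have
# no weight (such as 'id') get 1.0 so they pass through unchanged.
w = pd.Series(weights).reindex(df.columns, fill_value=1.0)

# Multiply column-wise; this returns a new DataFrame and leaves df as it was.
weighted = df.mul(w, axis="columns")
```

Because the alignment is by column label, the order of the keys in `weights` does not matter with this approach.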