I have two pandas DataFrames. weight has a simple index on the Land Use column. concentration has a MultiIndex on Land Use and Parameter.
import pandas
from io import StringIO
conc_string = StringIO("""\
Land Use,Parameter,1E,1N,1S,2
Airfield,BOD5 (mg/l),0.418,0.118,0.226,1.063
Airfield,Ortho P (mg/l),0.002,0.001,0.001,0.002
Airfield,TSS (mg/l),1.773,11.47,0.862,0.183
Airfield,Zn (mg/l),0.001,0.001,4.95E-05,0.001
"Commercial",BOD5 (mg/l),0.036,0.0419,,0.315
"Commercial",Cu (mg/l),4.37E-05,7.34E-05,,0.00039
"Commercial",O&G (mg/l),0.0385,0.127,,0.263
Open Space,TSS (mg/l),0.371,3.01,1.209,0.147
Open Space,Zn (mg/l),0.0127,0.0069,0.0132,0.007
"Parking Lot",BOD5 (mg/l),0.924,0.0668,2.603,3.19
"Parking Lot",O&G (mg/l),1.02,0.149,1.347,1.88
"Rooftops",BOD5 (mg/l),0.135,1.00,0.0562,0.310""")
weight_string = StringIO("""\
Land Use,1E,1N,1S,2
Airfield,0.511,0.0227,0.0616,0.394
Commercial,0.0005,0.1704,0,0.1065
Open Space,0.0008,0.005,0.0002,0.0004
"Parking Lot",0.33,0.514,0.252,0.171
Rooftops,0.081,0.028,8.50E-05,0.003""")
concentration = pandas.read_csv(conc_string, index_col=[0,1])
weight = pandas.read_csv(weight_string, index_col=0)
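In this case, the columns (1E, 1N, 1S, and 2) are drainage basins.
What I want to do is divide each Parameter's concentrations by the weight for the matching basin (column name) and Land Use.
I haven't had much luck here. concentration / weight certainly doesn't work. I also haven't had any luck stacking the DataFrames and joining either one to the other: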
wstk = pandas.DataFrame(weight.stack())
wstk.index.names = ['Land Use', 'Basin']
wstk.rename(columns={0:'weight'}, inplace=True)
cstk = pandas.DataFrame(concentration.stack())
cstk.index.names = ['Land Use', 'Parameter', 'Basin']
cstk.rename(columns={0:'concentration'}, inplace=True)
wstk.join(cstk, on=['Land Use', 'Basin']) # fails
cstk.join(wstk, on=['Land Use', 'Basin']) # fails
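When I leave off the on kwarg, the last two lines don't raise an error, but they return NaN for the joined column. The joins also fail if I drop the indexes on both stacked DataFrames first (e.g., by calling wstk.reset_index(inplace=True) before joining).
Any suggestions?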
Use the DataFrame div method and pass the level of the MultiIndex to match on when broadcasting:
From the docs for div:
level : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
In [39]: concentration.div(weight, level='Land Use')
Out[39]:
                                    1E          1N           1S           2
Land Use    Parameter
Airfield    BOD5 (mg/l)       0.818004    5.198238     3.668831    2.697970
            Ortho P (mg/l)    0.003914    0.044053     0.016234    0.005076
            TSS (mg/l)        3.469667  505.286344    13.993506    0.464467
            Zn (mg/l)         0.001957    0.044053     0.000804    0.002538
Commercial  BOD5 (mg/l)      72.000000    0.245892          NaN    2.957746
            Cu (mg/l)         0.087400    0.000431          NaN    0.003662
            O&G (mg/l)       77.000000    0.745305          NaN    2.469484
Open Space  TSS (mg/l)      463.750000  602.000000  6045.000000  367.500000
            Zn (mg/l)        15.875000    1.380000    66.000000   17.500000
Parking Lot BOD5 (mg/l)       2.800000    0.129961    10.329365   18.654971
            O&G (mg/l)        3.090909    0.289883     5.345238   10.994152
Rooftops    BOD5 (mg/l)       1.666667   35.714286   661.176471  103.333333
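For comparison, the stack-and-join route attempted in the question can probably be made to work too, by turning the index levels into ordinary columns and merging on them. The following is only a sketch on a cut-down copy of the data; the toy frames conc and wt and the names ratio and via_merge are illustrative and do not appear in the original post.

```python
import pandas as pd

# Cut-down stand-ins for `concentration` and `weight` (assumed toy data).
conc = pd.DataFrame(
    {"1E": [0.418, 0.002, 0.036], "1N": [0.118, 0.001, 0.0419]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Airfield", "BOD5 (mg/l)"),
            ("Airfield", "Ortho P (mg/l)"),
            ("Commercial", "BOD5 (mg/l)"),
        ],
        names=["Land Use", "Parameter"],
    ),
)
wt = pd.DataFrame(
    {"1E": [0.511, 0.0005], "1N": [0.0227, 0.1704]},
    index=pd.Index(["Airfield", "Commercial"], name="Land Use"),
)

# Approach from the answer: broadcast the division across the 'Land Use' level.
by_level = conc.div(wt, level="Land Use")

# Alternative: go to long form, merge on plain columns, then pivot back.
wstk = wt.stack().rename("weight").rename_axis(["Land Use", "Basin"]).reset_index()
cstk = (
    conc.stack()
    .rename("concentration")
    .rename_axis(["Land Use", "Parameter", "Basin"])
    .reset_index()
)
merged = cstk.merge(wstk, on=["Land Use", "Basin"], how="left")
merged["ratio"] = merged["concentration"] / merged["weight"]

# Back to the wide layout: one row per (Land Use, Parameter), one column per basin.
via_merge = merged.set_index(["Land Use", "Parameter", "Basin"])["ratio"].unstack("Basin")

print(by_level)
print(via_merge)
```

Both frames should hold the same ratios. The div call is shorter and keeps the original layout, while the long-form merge may be handier if the weights need filtering or inspection alongside the concentrations.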