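I am trying to convert a dataframe into a dictionary whose keys are built from four of its columns. The dataframe also has several value columns, and I want to look up their values using keys built from those four columns. I got it working with a loop, but I eventually run into a memory error. Is there a more efficient way to do this?
The dataframe looks like this: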
Service  Bill Weight  Zone  Resi  UPS    FedEx  USPS  DHL
1DEA     1            2     N     33.02  9999   9999  9999
1DEA     2            2     N     33.02  9999   9999  9999
1DEA     3            2     N     33.02  9999   9999  9999
For each carrier, I want a key like this:
price[('1DEA', '1', '2', 'N', 'UPS')]=33.02
price[('1DEA', '1', '2', 'N', 'FedEx')]=9999
This is what I have tried:
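price = {}
carriers = ['UPS', 'FedEx', 'USPS', 'DHL']
for carrier in carriers:
    # to_dict('records') yields one plain dict per dataframe row
    for row in rate_keys.to_dict('records'):
        key = (row['Service'], row['Bill Weight'], row['Zone'],
               row['Resi'], carrier)
        # Store the value in the price dict, not back into the rate_keys dataframe
        price[key] = row[carrier]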
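One possible way to avoid the nested Python loop is to reshape the dataframe to long form and let pandas build the dictionary. The following is only a sketch under assumptions: the sample frame is retyped from the table above, the four key columns are assumed to be named exactly `Service`, `Bill Weight`, `Zone` and `Resi`, and the `Carrier` and `Price` column names are made up for illustration.

```python
import pandas as pd

# Sample data retyped from the question; real data would come from elsewhere.
rate_keys = pd.DataFrame({
    "Service": ["1DEA", "1DEA", "1DEA"],
    "Bill Weight": [1, 2, 3],
    "Zone": [2, 2, 2],
    "Resi": ["N", "N", "N"],
    "UPS": [33.02, 33.02, 33.02],
    "FedEx": [9999, 9999, 9999],
    "USPS": [9999, 9999, 9999],
    "DHL": [9999, 9999, 9999],
})

key_cols = ["Service", "Bill Weight", "Zone", "Resi"]
carriers = ["UPS", "FedEx", "USPS", "DHL"]

# melt turns the four carrier columns into (Carrier, Price) rows.
# "Carrier" and "Price" are hypothetical names chosen for this sketch.
long_form = rate_keys.melt(
    id_vars=key_cols,
    value_vars=carriers,
    var_name="Carrier",
    value_name="Price",
)

# Indexing by all five key parts makes to_dict() return tuple keys.
price = long_form.set_index(key_cols + ["Carrier"])["Price"].to_dict()

print(price[("1DEA", 1, 2, "N", "UPS")])
```

The key parts keep whatever dtype the columns have, so if the keys should be strings such as `'1'` and `'2'`, the key columns would need an `astype(str)` first. A dictionary with one entry per row and carrier can still be large, so keeping the indexed Series and looking values up with `.loc` may be worth considering when memory is tight.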
I have several simple functions that need to be applied to every row of certain columns of my dataframe. The dataframe is very large, with more than 10 million rows. It looks like this:
Date       location  city         number  value
12/3/2018  NY        New York     2       500
12/1/2018  MN        Minneapolis  3       600
12/2/2018  NY        Rochester    1       800
12/3/2018  WA        Seattle      2       400
I have functions like this:
def normalized_location(row):
    # Map a row's city to a short location code; anything else is "Other"
    if row['city'] == "Minneapolis":
        return "FCM"
    elif row['city'] == "Seattle":
        return "FCS"
    else:
        return "Other"
Then I use:
df['Normalized Location'] = df.apply(lambda row: normalized_location(row), axis=1)
This is very slow. How can I make it more efficient?
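A row-wise `apply` calls a Python function once per row, which tends to dominate the runtime at this scale. One common alternative is a column-wise lookup with `Series.map`. This is a minimal sketch, not a benchmarked answer: the sample frame is retyped from the table above, `city_codes` is a hypothetical name, and the `str.strip()` call assumes city names may carry stray whitespace.

```python
import pandas as pd

# Sample data retyped from the question.
df = pd.DataFrame({
    "Date": ["12/3/2018", "12/1/2018", "12/2/2018", "12/3/2018"],
    "location": ["NY", "MN", "NY", "WA"],
    "city": ["New York", "Minneapolis", "Rochester", "Seattle"],
    "number": [2, 3, 1, 2],
    "value": [500, 600, 800, 400],
})

# Hypothetical lookup table mirroring the if/elif branches of the function.
city_codes = {"Minneapolis": "FCM", "Seattle": "FCS"}

# map() looks up each city in one pass over the column; cities missing
# from the table become NaN, which fillna() turns into the "Other" default.
df["Normalized Location"] = (
    df["city"].str.strip().map(city_codes).fillna("Other")
)

print(df[["city", "Normalized Location"]])
```

For rules that are not plain equality checks, `numpy.select` with a list of boolean conditions is a similar column-wise option.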