poP*_*lor 0 python performance for-loop python-3.x
I am running this code in Python 3.5 to compute the concordance (for a logistic regression).
conc = disc = ties = pairs_tested = 0
for i in ones2.index:
    for j in zeros2.index:
        pairs_tested = pairs_tested + 1
        if ones2.iloc[i, 1] > zeros2.iloc[j, 1]:
            conc = conc + 1
        elif ones2.iloc[i, 1] == zeros2.iloc[j, 1]:
            ties = ties + 1
        else:
            disc = disc + 1
# Calculate concordance, discordance and ties
concordance = conc / pairs_tested
discordance = disc / pairs_tested
ties_perc = ties / pairs_tested
print("Concordance = %r" % concordance)
print("Discordance = %r" % discordance)
print("Tied = %r" % ties_perc)
print("Pairs = %r" % pairs_tested)
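zeros2 (a pandas DataFrame) has 0.15 million rows and ones2 (a pandas DataFrame) has 36K rows. Both tables have two variables:
[i] Responder (Responder0 = 0 in zeros2, Responders1 = 1 in ones2).
[ii] Probability (prob0 in zeros2 and prob1 in ones2).
My problem: the for loop has been running for 12 hours and is still running as I write this question. I need help. How can I do this faster? I am running it on a 64-bit Windows machine with 8 GB of RAM.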
Because of the two nested for loops (0.15 million × 36K), your code performs 5.4 billion comparisons.
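To put that number in perspective, here is a rough back-of-the-envelope sketch comparing the nested loop with a sort-plus-binary-search approach. The row counts are the ones given in the question; the step counts for the binary search are only an order-of-magnitude estimate (constant factors and the cost of the sort itself are ignored), and the variable names are made up for illustration.

```python
import math

n_zeros = 150000  # rows in zeros2, as stated in the question
n_ones = 36000    # rows in ones2, as stated in the question

# Nested loops: one comparison per (one, zero) pair.
nested_pairs = n_zeros * n_ones

# Binary search: two lookups (left and right) per row of ones2,
# each taking roughly log2(n_zeros) steps on the sorted zeros.
steps_per_lookup = math.ceil(math.log2(n_zeros))
bisect_steps = 2 * n_ones * steps_per_lookup

print("Nested-loop comparisons:", nested_pairs)
print("Approximate binary-search steps:", bisect_steps)
print("Rough ratio:", nested_pairs // bisect_steps)
```

Even allowing for the extra sort, the second figure is several thousand times smaller, which is why the loop should not need hours.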
I would do something like this (thanks to @Leon for helping me improve this answer):
from bisect import bisect_left, bisect_right

# Sort the zeros2 probabilities once; each lookup is then a binary search.
zeros2_list = sorted(zeros2.iloc[:, 1].tolist())
zeros2_length = len(zeros2_list)
conc = disc = ties = 0
for prob1 in ones2.iloc[:, 1]:
    # Zeros strictly below prob1: pairs where ones2 > zeros2 (concordant).
    cur_conc = bisect_left(zeros2_list, prob1)
    # Zeros exactly equal to prob1.
    cur_ties = bisect_right(zeros2_list, prob1) - cur_conc
    conc += cur_conc
    ties += cur_ties
    # Everything else is above prob1 (discordant).
    disc += zeros2_length - cur_ties - cur_conc
pairs_tested = zeros2_length * len(ones2.index)
concordance = conc / pairs_tested
discordance = disc / pairs_tested
ties_perc = ties / pairs_tested
print("Concordance = %r" % concordance)
print("Discordance = %r" % discordance)
print("Tied = %r" % ties_perc)
print("Pairs = %r" % pairs_tested)
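If you want to convince yourself that the binary-search counts match the nested loop, a small self-check along these lines may help. This is only a sketch on made-up sample probabilities; the function names count_pairs_bisect and count_pairs_naive are hypothetical, and "concordant" follows the question's definition (the ones2 probability is greater than the zeros2 probability).

```python
from bisect import bisect_left, bisect_right


def count_pairs_bisect(ones_prob, zeros_prob):
    """Return (conc, disc, ties) using one sort plus binary searches."""
    zeros_sorted = sorted(zeros_prob)
    conc = disc = ties = 0
    for p in ones_prob:
        below = bisect_left(zeros_sorted, p)            # zeros strictly < p
        equal = bisect_right(zeros_sorted, p) - below   # zeros == p
        conc += below
        ties += equal
        disc += len(zeros_sorted) - below - equal       # zeros > p
    return conc, disc, ties


def count_pairs_naive(ones_prob, zeros_prob):
    """Return (conc, disc, ties) with the original nested-loop logic."""
    conc = disc = ties = 0
    for p in ones_prob:
        for q in zeros_prob:
            if p > q:
                conc += 1
            elif p == q:
                ties += 1
            else:
                disc += 1
    return conc, disc, ties


# Made-up sample data, including ties, purely for illustration.
ones_sample = [0.9, 0.5, 0.5]
zeros_sample = [0.5, 0.2, 0.7, 0.5]

assert count_pairs_bisect(ones_sample, zeros_sample) == count_pairs_naive(ones_sample, zeros_sample)
print(count_pairs_bisect(ones_sample, zeros_sample))  # (6, 2, 4)
```

The key point is that bisect_left returns how many sorted values are strictly smaller than the probe, and the gap between bisect_right and bisect_left is the number of exact ties.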
Or, the other way around, like this:
from bisect import bisect_left, bisect_right

zeros2_list = sorted(zeros2.iloc[:, 1].tolist())
ones2_list = sorted(ones2.iloc[:, 1].tolist())
zeros2_length = len(zeros2_list)
ones2_length = len(ones2_list)
conc = disc = ties = 0
# Iterating over zeros2_list directly is equivalent to reading
# zeros2.iloc[i, 1] for each i in zeros2.index.
for prob0 in zeros2_list:
    # Ones strictly below prob0: pairs where ones2 < zeros2 (discordant).
    cur_disc = bisect_left(ones2_list, prob0)
    # Ones exactly equal to prob0.
    cur_ties = bisect_right(ones2_list, prob0) - cur_disc
    disc += cur_disc
    ties += cur_ties
    # Everything else is above prob0 (concordant).
    conc += ones2_length - cur_ties - cur_disc
pairs_tested = zeros2_length * ones2_length
concordance = conc / pairs_tested
discordance = disc / pairs_tested
ties_perc = ties / pairs_tested
print("Concordance = %r" % concordance)
print("Discordance = %r" % discordance)
print("Tied = %r" % ties_perc)
print("Pairs = %r" % pairs_tested)
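Since zeros2 and ones2 are pandas DataFrames, the Python-level loop could probably be dropped altogether with NumPy's np.searchsorted, which performs the same binary search for a whole array at once. The following is a sketch rather than part of the answer above: the function name concordance_counts and the sample values are made up for illustration, and it assumes the probabilities are passed in as plain arrays (for example ones2.iloc[:, 1].values and zeros2.iloc[:, 1].values).

```python
import numpy as np


def concordance_counts(ones_prob, zeros_prob):
    """Return (conc, disc, ties, pairs) for all (one, zero) pairs.

    A pair is concordant when the ones probability is greater than
    the zeros probability, as in the question.
    """
    ones_prob = np.asarray(ones_prob, dtype=float)
    zeros_sorted = np.sort(np.asarray(zeros_prob, dtype=float))

    # For each ones probability: how many zeros are strictly smaller,
    # and how many are smaller or equal.
    lower = np.searchsorted(zeros_sorted, ones_prob, side="left")
    upper = np.searchsorted(zeros_sorted, ones_prob, side="right")

    conc = int(lower.sum())
    ties = int((upper - lower).sum())
    pairs = ones_prob.size * zeros_sorted.size
    disc = pairs - conc - ties
    return conc, disc, ties, pairs


# Made-up sample data, purely for illustration.
conc, disc, ties, pairs = concordance_counts([0.9, 0.5, 0.5],
                                             [0.5, 0.2, 0.7, 0.5])
print("Concordance = %r" % (conc / pairs))
print("Discordance = %r" % (disc / pairs))
print("Tied = %r" % (ties / pairs))
print("Pairs = %r" % pairs)
```

This keeps the same counting logic as the bisect versions; the only difference is that the per-row lookups happen inside NumPy instead of in a Python for loop.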