基于该答案,这里有两个版本的merge函数用于mergesort.你能帮我理解为什么第二个更快.我已经测试了它的50000列表,第二个测试速度提高了8倍(Gist).
def merge1(left, right):
i = j = inv = 0
merged = []
while i < len(left) and j < len(right):
if left[i] <= right[j]:
merged.append(left[i])
i += 1
else:
merged.append(right[j])
j += 1
inv += len(left[i:])
merged += left[i:]
merged += right[j:]
return merged, inv
Run Code Online (Sandbox Code Playgroud)
.
def merge2(array1, array2):
inv = 0
merged_array = []
while array1 or array2:
if not array1:
merged_array.append(array2.pop())
elif (not array2) or array1[-1] > array2[-1]:
merged_array.append(array1.pop())
inv += len(array2)
else:
merged_array.append(array2.pop())
merged_array.reverse()
return merged_array, inv
Run Code Online (Sandbox Code Playgroud)
这是sort函数:
def _merge_sort(list, merge):
len_list = len(list)
if len_list < 2:
return list, 0
middle = len_list / 2
left, left_inv = _merge_sort(list[:middle], merge)
right, right_inv = _merge_sort(list[middle:], merge)
l, merge_inv = merge(left, right)
inv = left_inv + right_inv + merge_inv
return l, inv
Run Code Online (Sandbox Code Playgroud)
.
import numpy.random as nprnd
test_list = nprnd.randint(1000, size=50000).tolist()
test_list_tmp = list(test_list)
merge_sort(test_list_tmp, merge1)
test_list_tmp = list(test_list)
merge_sort(test_list_tmp, merge2)
Run Code Online (Sandbox Code Playgroud)
wil*_*ill 10
类似于kreativitea上面的答案,但有更多的信息(我想!)
所以分析实际的合并函数,合并两个50K数组,
311748 function calls in 15.363 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 15.363 15.363 <string>:1(<module>)
1 15.322 15.322 15.362 15.362 merge.py:3(merge1)
221309 0.030 0.000 0.030 0.000 {len}
90436 0.010 0.000 0.010 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Run Code Online (Sandbox Code Playgroud)
250004 function calls in 0.104 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 0.104 0.104 <string>:1(<module>)
1 0.074 0.074 0.103 0.103 merge.py:20(merge2)
50000 0.005 0.000 0.005 0.000 {len}
100000 0.010 0.000 0.010 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
100000 0.014 0.000 0.014 0.000 {method 'pop' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'reverse' of 'list' objects}
Run Code Online (Sandbox Code Playgroud)
因此对于merge1,它是len221309,90436 append,并且需要15.363秒.因此,对于merge2,这是50000 len100000 append,和100000 pop,并采取0.104秒.
len并且append pop都是O(1)(更多信息在这里),所以这些配置都没有表现出什么实际花费时间,因为只是去,应该是快,但只有约20%左右.
在第一种方法中,有这一行:
inv += len(left[i:])
Run Code Online (Sandbox Code Playgroud)
因此,每次调用它时,都必须重建一个数组.如果你注释掉这一行(或者只是用它替换它inv += 1),那么它会比其他方法更快.这是负责增加时间的单一行.
注意到这是原因,可以通过改进代码来解决问题; 将其更改为此以加快速度.这样做之后,它会比merge2
inv += len(left) - i
Run Code Online (Sandbox Code Playgroud)
def merge3(left, right):
i = j = inv = 0
merged = []
while i < len(left) and j < len(right):
if left[i] <= right[j]:
merged.append(left[i])
i += 1
else:
merged.append(right[j])
j += 1
inv += len(left) - i
merged += left[i:]
merged += right[j:]
return merged, inv
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
284 次 |
| 最近记录: |