tay*_*ift 8 arrays performance generator swift
这与此问题有关,假设使用生成器(迭代器)遍历嵌套数组将是迭代元素的最佳选择,只要您不需要存储结果,同时使用重复数组如果你只想平整阵列,串联是最好的.
但是,我决定做一些测试,并实现这个函数(用懒惰和存储形式展平[Any]包含Ints或[Int]s 的数组),结果表明存储的形式更快,即使只是用于迭代元素!这意味着,不知何故,迭代生成器比在内存中构造新数组花费更多的时间,然后迭代它.
令人难以置信的是,它甚至比同一程序的python实现慢约5-70%,随着输入的减少而恶化.斯威夫特是用-O旗帜建造的.
这里有三个测试用例1.小输入,混合; 2.大输入,[Int]显性,3.大输入,Int显性:
let array1: [Any] = [Array(1...100), Array(101...105), 106,
Array(107...111), 112, 113, 114, Array(115...125)]
let array2: [Any] = Array(repeating: Array(1...5), count: 2000)
let array3: [Any] = Array(repeating: 31, count: 10000)
Run Code Online (Sandbox Code Playgroud)
A1 = [list(range(1, 101)), list(range(101, 106)), 106,
list(range(107, 112)), 112, 113, 114, list(range(115, 126))]
A2 = list(range(1, 6)) * 2000
A3 = [31] * 10000
Run Code Online (Sandbox Code Playgroud)
生成器和数组构建器:
func chain(_ segments: [Any]) -> AnyIterator<Int>{
var i = 0
var j = 0
return AnyIterator<Int> {
while i < segments.count {
switch segments[i] {
case let e as Int:
i += 1
return e
case let E as [Int]:
if j < E.count {
let val = E[j]
j += 1
return val
}
j = 0
i += 1
default:
return nil
}
}
return nil
}
}
func flatten_array(_ segments: [Any]) -> [Int] {
var result = [Int]()
for segment in segments {
switch segment {
case let segment as Int:
result.append(segment)
case let segment as [Int]:
result.append(contentsOf: segment)
default:
break
}
}
return result
}
Run Code Online (Sandbox Code Playgroud)
def chain(L):
for i in L:
if type(i) is int:
yield i
elif type(i) is list:
yield from i
def flatten_list(L):
result = []
for i in L:
if type(i) is int:
result.append(i)
elif type(i) is list:
result.extend(i)
return result
Run Code Online (Sandbox Code Playgroud)
基准测试结果(第一个测试用例为100000个循环,其他为1000个循环):
test case 1 (small mixed input)
Filling an array : 0.068221092224121094 s
Filling an array, and looping through it : 0.074559926986694336 s
Looping through a generator : 1.5902719497680664 s *
Materializing the generator to an array : 1.759943962097168 s *
test case 2 (large input, [Int] s)
Filling an array : 0.20634698867797852 s
Filling an array, and looping through it : 0.21031379699707031 s
Looping through a generator : 1.3505551815032959 s *
Materializing the generator to an array : 1.4733860492706299 s *
test case 3 (large input, Int s)
Filling an array : 0.27392101287841797 s
Filling an array, and looping through it : 0.27670192718505859 s
Looping through a generator : 0.85304021835327148 s
Materializing the generator to an array : 1.0027849674224854 s *
Run Code Online (Sandbox Code Playgroud)
test case 1 (small mixed input)
Filling an array : 0.1622014045715332 s
Filling an array, and looping through it : 0.4312894344329834 s
Looping through a generator : 0.6839139461517334 s
Materializing the generator to an array : 0.5300459861755371 s
test case 2 (large input, [int] s)
Filling an array : 1.029205083847046 s
Filling an array, and looping through it : 1.2195289134979248 s
Looping through a generator : 1.0876803398132324 s
Materializing the generator to an array : 0.8958714008331299 s
test case 3 (large input, int s)
Filling an array : 1.0181667804718018 s
Filling an array, and looping through it : 1.244570255279541 s
Looping through a generator : 1.1220412254333496 s
Materializing the generator to an array : 0.9486079216003418 s
Run Code Online (Sandbox Code Playgroud)
显然,Swift非常非常擅长构建数组.但是为什么它的生成器在某些情况下如此慢,甚至比Python慢?(用*表中的a 标记.)使用极大输入(> 100,000,000个元素,几乎崩溃Swift)的测试表明,即使在极限情况下,发生器也会比阵列填充更慢,至少为3.25.最好的情况.
如果这是语言的内在特征,它会产生一些有趣的含义.例如,常识(对我来说,无论如何都是python程序员)如果我们试图合成一个不可变对象(比如一个字符串),我们应该首先将源提供给一个生成函数来展开它,然后手将输出关闭到一个joined()适用于单个浅序列的方法.相反,看起来最有效的策略是通过数组进行序列化; 将源展开到中间数组,然后构造数组的输出.
正在构建一个完整的新数组,然后通过它迭代它比原始数组上的延迟迭代更快?为什么?
这是测试代码:
func time(test_array: [Any], cycles: Int = 1000000) -> (array_iterate: Double,
array_store : Double,
generate_iterate: Double,
generate_store: Double) {
func start() -> Double { return Date().timeIntervalSince1970 }
func lap(_ t0: Double) -> Double {
return Date().timeIntervalSince1970 - t0
}
var t0 = start()
for _ in 0..<cycles {
for e in flatten_array(test_array) { e + 1 }
}
let ?E1 = lap(t0)
t0 = start()
for _ in 0..<cycles {
let array: [Int] = flatten_array(test_array)
}
let ?E2 = lap(t0)
t0 = start()
for _ in 0..<cycles {
let G = chain(test_array)
while let g = G.next() { g + 1 }
}
let ?G1 = lap(t0)
t0 = start()
for _ in 0..<cycles {
let array: [Int] = Array(chain(test_array))
}
let ?G2 = lap(t0)
return (?E1, ?E2, ?G1, ?G2)
}
print(time(test_array: array1, cycles: 100000))
print(time(test_array: array2, cycles: 1000))
print(time(test_array: array3, cycles: 1000))
Run Code Online (Sandbox Code Playgroud)
def time_f(test_array, cycles = 1000000):
lap = lambda t0: time() - t0
t0 = time()
for _ in range(cycles):
for e in flatten_list(test_array):
e + 1
?E1 = lap(t0)
t0 = time()
for _ in range(cycles):
array = flatten_list(test_array)
?E2 = lap(t0)
t0 = time()
for _ in range(cycles):
for g in chain(test_array):
g + 1
?G1 = lap(t0)
t0 = time()
for _ in range(cycles):
array = list(chain(test_array))
?G2 = lap(t0)
return ?E1, ?E2, ?G1, ?G2
print(time_f(A1, cycles=100000))
print(time_f(A3, cycles=1000))
print(time_f(A2, cycles=1000))
Run Code Online (Sandbox Code Playgroud)
您问“为什么它的 [Swift] 生成器如此慢,在某些情况下甚至比 Python\xe2\x80\x99s 还慢?”
\n\n我的回答是,我认为它们并不像您的结果所显示的那么慢。特别是,我将尝试证明循环遍历迭代器应该比为所有测试用例构建数组更快。
\n\n在早期的工作中(参见相关博客文章http://lemire.me/blog/2016/09/22/swift-versus-java-the-bitset-performance-test/),我发现 Swift 迭代器大约有一半处理位集类时与 Java 中的同等速度一样快。这不太好,但 Java 在这方面非常高效。与此同时,Go 的表现更糟。我向您提出,Swift 迭代器的效率可能不理想,但它们的效率可能是原始 C 代码的两倍之内。性能差距可能与 Swift 中的函数内联不足有关。
\n\n我发现您正在使用AnyIterator. 我建议派生一个structfrom IteratorProtocol,它的好处是确保不必进行任何动态调度。这是一个相对有效的可能性:
public struct FastFlattenIterator: IteratorProtocol {\n let segments: [Any]\n var i = 0 // top-level index\n var j = 0 // second-level index\n var jmax = 0 // essentially, this is currentarray.count, but we buffer it\n var currentarray : [Int]! // quick reference to an int array to be flatten\n\n init(_ segments: [Any]) {\n self.segments = segments\n }\n\n public mutating func next() -> Int? {\n if j > 0 { // we handle the case where we iterate within an array separately\n let val = currentarray[j]\n j += 1\n if j == jmax {\n j = 0\n i += 1\n }\n return val\n }\n while i < segments.count {\n switch segments[i] {\n case let e as Int: // found an integer value\n i += 1\n return e\n case let E as [Int]: // first encounter with an array\n jmax = E.count\n currentarray = E\n if jmax > 0 {\n j = 1\n return E[0]\n }\n i += 1\n default:\n return nil\n }\n }\n return nil\n }\n}\nRun Code Online (Sandbox Code Playgroud)\n\n通过这堂课,我得到了以下数字。对于每个测试用例,前四种方法取自代码示例,而后两种(快速迭代器)是使用新结构构建的。请注意,“循环快速迭代器”始终是最快的。
\n\ntest case 1 (small mixed input)\nFilling an array : 0.0073099999999999997 ms\nFilling an array, and looping through it : 0.0069870000000000002 ms\nLooping through a generator : 0.18385799999999999 ms \nMaterializing the generator to an array : 0.18745700000000001 ms \nLooping through a fast iterator : 0.005372 ms \nMaterializing the fast iterator : 0.015883999999999999 ms\n\ntest case 2 (large input, [Int] s)\nFilling an array : 2.125931 ms\nFilling an array, and looping through it : 2.1169820000000001 ms\nLooping through a generator : 15.064767 ms \nMaterializing the generator to an array : 15.45152 ms \nLooping through a fast iterator : 1.572919 ms\nMaterializing the fast iterator : 1.964912 ms \n\ntest case 3 (large input, Int s)\nFilling an array : 2.9140269999999999 ms\nFilling an array, and looping through it : 2.9064290000000002 ms\nLooping through a generator : 9.8297640000000008 ms\nMaterializing the generator to an array : 9.8297640000000008 ms \nLooping through a fast iterator : 1.978038 ms \nMaterializing the fast iterator : 2.2565339999999998 ms \nRun Code Online (Sandbox Code Playgroud)\n\n您可以在 GitHub 上找到我的完整代码示例: https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/iterators
\n