daw*_*awg 4 python arrays performance swift swift2
In Python, you can have a list (similar to an array in Swift):
>>> li=[0,1,2,3,4,5]
and perform slice assignment on any part, or all, of that list:
>>> li[2:]=[99] # note the end index is not needed if you mean 'to the end'
>>> li
[0, 1, 99]
Swift has a similar slice assignment (this is in the Swift interactive shell):
1> var arr=[0,1,2,3,4,5]
arr: [Int] = 6 values {
  [0] = 0
  [1] = 1
  [2] = 2
  [3] = 3
  [4] = 4
  [5] = 5
}
2> arr[2...arr.endIndex-1]=[99]
3> arr
$R0: [Int] = 3 values {
  [0] = 0
  [1] = 1
  [2] = 99
}
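So far, so good. But there are a couple of issues.
First, the Swift version does not work on an empty array, or when the start index is past endIndex. Python appends if the slice index is past the end of the list: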
>>> li=[] # empty
>>> li[2:]=[6,7,8]
>>> li
[6, 7, 8]
>>> li=[0,1,2]
>>> li[999:]=[999]
>>> li
[0, 1, 2, 999]
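To make the Python behavior above concrete, here is a minimal sketch using the built-in `slice.indices` method, which reports the bounds a slice resolves to for a sequence of a given length. The variable names are only illustrative.

```python
li = [0, 1, 2]

# slice.indices(length) returns the (start, stop, step) that the slice
# resolves to for a sequence of that length; out-of-range bounds are clamped.
start, stop, step = slice(999, None).indices(len(li))
print(start, stop, step)  # 3 3 1

li[999:] = [999]  # resolves to li[3:3] = [999], which is an append
print(li)         # [0, 1, 2, 999]

empty = []
print(slice(2, None).indices(len(empty)))  # (0, 0, 1)
empty[2:] = [6, 7, 8]
print(empty)      # [6, 7, 8]
```

Because the start index is clamped to the length of the list, assigning to a slice that starts past the end never raises; it simply appends.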
The equivalent in Swift is an error:
4> var arr=[Int]()
arr: [Int] = 0 values
5> arr[2...arr.endIndex-1]=[99]
fatal error: Can't form Range with end < start
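That is easy to test for and code around.
The second issue is the killer: it is really slow. Consider this Python code to perform exact summation of a list of floats: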
def msum(iterable):
    "Full precision summation using multiple floats for intermediate values"
    # Rounded x+y stored in hi with the round-off stored in lo. Together
    # hi+lo are exactly equal to x+y. The inner loop applies hi/lo summation
    # to each partial so that the list of partial sums remains exact.
    # Depends on IEEE-754 arithmetic guarantees. See proof of correctness at:
    # www-2.cs.cmu.edu/afs/cs/project/quake/public/papers/robust-arithmetic.ps
    partials = []  # sorted, non-overlapping partial sums
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return sum(partials, 0.0)
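It works by maintaining a list of hi/lo partial sums, so that msum([.1]*10) produces exactly 1.0 rather than 0.9999999999999999. The C equivalent of msum is part of Python's math library (the fsum function).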
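As a quick check of that claim, the following short sketch compares the built-in `sum` with the library's `math.fsum` on the same input. The variable names are chosen only for illustration.

```python
import math

values = [0.1] * 10

naive = sum(values)        # plain left-to-right float addition accumulates rounding error
exact = math.fsum(values)  # keeps exact partial sums and rounds once at the end

print(naive)  # 0.9999999999999999
print(exact)  # 1.0
```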
I have attempted to replicate this in Swift:
func msum(it:[Double])->Double {
    // Full precision summation using multiple floats for intermediate values
    var partials=[Double]()
    for var x in it {
        var i=0
        for var y in partials{
            if abs(x) < abs(y){
                (x, y)=(y, x)
            }
            let hi=x+y
            let lo=y-(hi-x)
            if abs(lo)>0.0 {
                partials[i]=lo
                i+=1
            }
            x=hi
        }
        // slow part trying to replicate Python's slice assignment partials[i:]=[x]
        if partials.endIndex>i {
            partials[i...partials.endIndex-1]=[x]
        }
        else {
            partials.append(x)
        }
    }
    return partials.reduce(0.0, combine: +)
}
Test the function and its speed:
import Foundation
var arr=[Double]()
for _ in 1...1000000 {
    arr+=[10, 1e100, 10, -1e100]
}
print(arr.reduce(0, combine: +)) // will be 0.0
var startTime: CFAbsoluteTime!
startTime = CFAbsoluteTimeGetCurrent()
print(msum(arr), arr.count*5) // should be arr.count * 5
print(CFAbsoluteTimeGetCurrent() - startTime)
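On my machine, that takes 7 seconds to complete. The native Python msum takes 2.2 seconds (roughly 3x faster) and the library fsum function takes 0.09 seconds (nearly 80x faster).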
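For reference, the Python side of that comparison can be sketched as below. This is not the exact script behind the timings above: the size `n`, the repeat count, and the use of `timeit` are assumptions, and absolute times will differ from machine to machine.

```python
import math
import timeit

def msum(iterable):
    """Pure-Python exact summation, same algorithm as in the question."""
    partials = []  # sorted, non-overlapping partial sums
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return sum(partials, 0.0)

# Same pattern as the Swift test data, but smaller (n is an arbitrary choice).
n = 10_000
data = [10, 1e100, 10, -1e100] * n

print(sum(data))        # naive summation loses every 10: prints 0.0
print(msum(data))       # exact result, equal to len(data) * 5
print(math.fsum(data))  # exact result from the C implementation

# Time a few runs of each exact summation on the same data.
for name, fn in (("msum", msum), ("math.fsum", math.fsum)):
    seconds = timeit.timeit(lambda: fn(data), number=5)
    print(name, seconds)
```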
I have tried replacing partials[i...partials.endIndex-1]=[x] with partials.removeRange(i..<partials.endIndex) followed by an append. That is a little faster, but not by much.
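Question:
partials[i...partials.endIndex-1]=[x]
First (as already said in the comments), there is a huge difference between non-optimized and optimized code in Swift (the "-Onone" vs. "-O" compiler options, or the Debug vs. Release configurations), so for performance tests make sure that the "Release" configuration is selected. ("Release" is also the default configuration if you profile the code with Instruments.)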
Using a half-open range has some advantages:
var arr = [0,1,2,3,4,5]
arr[2 ..< arr.endIndex] = [99]
print(arr) // [0, 1, 99]
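In fact, that is how a range is stored internally, and it allows you to insert a slice at the end of the array (but not beyond the end, as Python allows):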
var arr = [Int]()
arr[0 ..< arr.endIndex] = [99]
print(arr) // [99]
So
if partials.endIndex > i {
    partials[i...partials.endIndex-1]=[x]
}
else {
    partials.append(x)
}
is equivalent to
partials[i ..< partials.endIndex] = [x]
// Or: partials.replaceRange(i ..< partials.endIndex, with: [x])
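However, that is not a performance improvement. It seems that replacing a slice is slow in Swift. Truncating the array and then appending the new element with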
partials.replaceRange(i ..< partials.endIndex, with: [])
partials.append(x)
reduced the time for the test code from about 1.25 seconds to 0.75 seconds on my computer.
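The same equivalence between replacing the tail and truncating then appending can be checked on the Python side. This is only a Python analogue of the Swift change described above, with hypothetical helper names; it illustrates that the two forms produce the same result and says nothing about Swift's performance.

```python
def replace_tail(values, i, x):
    """Slice assignment, as in the original Python msum: values[i:] = [x]."""
    out = list(values)
    out[i:] = [x]
    return out

def truncate_then_append(values, i, x):
    """Drop everything from index i onward, then append the new element."""
    out = list(values)
    del out[i:]
    out.append(x)
    return out

values = [0, 1, 2, 3, 4, 5]

# Check every start index, including a few past the end of the list.
for i in range(len(values) + 3):
    assert replace_tail(values, i, 99) == truncate_then_append(values, i, 99)

print(replace_tail(values, 2, 99))          # [0, 1, 99]
print(truncate_then_append(values, 2, 99))  # [0, 1, 99]
```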