Ale*_*xey 4 arrays string performance objective-c swift
这是两个非常相似的Levenshtein Distance algorithms.
Swift实施:https:
//gist.github.com/bgreenlee/52d93a1d8fa1b8c1f38b
并Objective-C实施:https:
//gist.github.com/boratlibre/1593632
在swift一个是慢得多然后ObjC执行我已经派几个小时,使其速度更快,但......好像Swift阵列和Strings不一样快的操作objC.
2000年的random Strings计算Swift实施速度大约慢了100(!!!)倍ObjC.
老实说,我不知道什么可能是错的,因为即使这部分也很快
func levenshtein(aStr: String, bStr: String) -> Int {
// create character arrays
let a = Array(aStr)
let b = Array(bStr)
...
Run Code Online (Sandbox Code Playgroud)
比整个算法慢几倍 Objective C
有谁知道如何加速swift计算?
先感谢您!
附加
在所有建议的改进之后,swift代码看起来像这样.它在发布配置中比ObjC慢4倍.
import Foundation
class Array2D {
var cols:Int, rows:Int
var matrix:UnsafeMutablePointer<Int>
init(cols:Int, rows:Int) {
self.cols = cols
self.rows = rows
matrix = UnsafeMutablePointer<Int>(malloc(UInt(cols * rows) * UInt(sizeof(Int))))
for i in 0...cols*rows {
matrix[i] = 0
}
}
subscript(col:Int, row:Int) -> Int {
get {
return matrix[cols * row + col] as Int
}
set {
matrix[cols*row+col] = newValue
}
}
func colCount() -> Int {
return self.cols
}
func rowCount() -> Int {
return self.rows
}
}
extension String {
func levenshteinDistanceFromStringSwift(comparingString: NSString) -> Int {
let aStr = self
let bStr = comparingString
// let a = Array(aStr.unicodeScalars)
// let b = Array(bStr.unicodeScalars)
let a:NSString = aStr
let b:NSString = bStr
var dist = Array2D(cols: a.length + 1, rows: b.length + 1)
for i in 1...a.length {
dist[i, 0] = i
}
for j in 1...b.length {
dist[0, j] = j
}
for i in 1...a.length {
for j in 1...b.length {
if a.characterAtIndex(i-1) == b.characterAtIndex(j-1) {
dist[i, j] = dist[i-1, j-1] // noop
} else {
dist[i, j] = min(
dist[i-1, j] + 1, // deletion
dist[i, j-1] + 1, // insertion
dist[i-1, j-1] + 1 // substitution
)
}
}
}
return dist[a.length, b.length]
}
func levenshteinDistanceFromStringObjC(comparingString: String) -> Int {
let aStr = self
let bStr = comparingString
//It is really strange, but I should link Objective-C coz dramatic slow swift performance
return aStr.compareWithWord(bStr, matchGain: 0, missingCost: 1)
}
}
Run Code Online (Sandbox Code Playgroud)
malloc的?的NSString?最后4倍减速?有人需要迅速吗?
Swift代码比Objective-C代码慢的原因有很多.我通过比较两个固定字符串100次来做一个非常简单的测试用例.
第一个原因是Swift Character代表一个"扩展字形集群",它可以包含几个Unicode代码点(例如"标志").这使得字符串的分解变得缓慢.另一方面,Objective-C
NSString将字符串存储为UTF-16代码点序列.
如果你更换
let a = Array(aStr)
let b = Array(bStr)
Run Code Online (Sandbox Code Playgroud)
通过
let a = Array(aStr.utf16)
let b = Array(bStr.utf16)
Run Code Online (Sandbox Code Playgroud)
这样Swift代码也适用于UTF-16序列,然后时间下降到1.88秒.
二维数组的分配也很慢.分配单个一维数组更快.我在Array2D这里找到了一个简单的类:http:
//blog.trolieb.com/trouble-multidimensional-arrays-swift/
class Array2D {
var cols:Int, rows:Int
var matrix: [Int]
init(cols:Int, rows:Int) {
self.cols = cols
self.rows = rows
matrix = Array(count:cols*rows, repeatedValue:0)
}
subscript(col:Int, row:Int) -> Int {
get {
return matrix[cols * row + col]
}
set {
matrix[cols*row+col] = newValue
}
}
func colCount() -> Int {
return self.cols
}
func rowCount() -> Int {
return self.rows
}
}
Run Code Online (Sandbox Code Playgroud)
在代码中使用该类
func levenshtein(aStr: String, bStr: String) -> Int {
let a = Array(aStr.utf16)
let b = Array(bStr.utf16)
var dist = Array2D(cols: a.count + 1, rows: b.count + 1)
for i in 1...a.count {
dist[i, 0] = i
}
for j in 1...b.count {
dist[0, j] = j
}
for i in 1...a.count {
for j in 1...b.count {
if a[i-1] == b[j-1] {
dist[i, j] = dist[i-1, j-1] // noop
} else {
dist[i, j] = min(
dist[i-1, j] + 1, // deletion
dist[i, j-1] + 1, // insertion
dist[i-1, j-1] + 1 // substitution
)
}
}
}
return dist[a.count, b.count]
}
Run Code Online (Sandbox Code Playgroud)
测试用例中的时间下降到0.84秒.
我在Swift代码中找到的最后一个瓶颈是min()函数.Swift库具有内置min()函数,速度更快.因此,只需从Swift代码中删除自定义函数,就可以将测试用例的时间缩短到0.04秒,这几乎与Objective-C版本一样好.
附录:使用Unicode标量似乎甚至更快:
let a = Array(aStr.unicodeScalars)
let b = Array(bStr.unicodeScalars)
Run Code Online (Sandbox Code Playgroud)
并且具有与Emojis等代理对一起正常工作的优点.