Debugging a performance bottleneck in a longest common subsequence algorithm

Sal*_*Sal 12 performance haskell vector

I am writing a longest common subsequence algorithm in Haskell using the vector library and the state monad (to encapsulate the very imperative, mutable nature of the Miller O(NP) algorithm). I have already written it in C for a project that needed it, and am now writing it in Haskell as a way to explore how to write this kind of imperative grid-walk algorithm with performance that matches C. The version I wrote with unboxed vectors is about 4 times slower than the C version for the same inputs (compiled with the right optimization flags - I used both system clock time and Criterion to validate the relative timing between the Haskell and C versions, with the same data types, for both large and small inputs). I have been trying to figure out where the performance problem might be and would appreciate feedback - it is possible I have run into some well-known performance issue here, especially in the vector library, which I use heavily.

In my code there is one function called gridWalk that is called most often and does most of the work. The performance slowdown is most likely there, but I cannot figure out what it could be. The complete Haskell code is here. Snippets of the code below:

import Data.Vector.Unboxed.Mutable as MU
import Data.Vector.Unboxed as U hiding (mapM_)
import Control.Monad.ST as ST
import Control.Monad.Primitive (PrimState)
import Control.Monad (when) 
import Data.STRef (newSTRef, modifySTRef, readSTRef)
import Data.Int


type MVI1 s  = MVector (PrimState (ST s)) Int

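-- length of the matching run (the "snake") starting at index i in a and index j in b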
cmp :: U.Vector Int32 -> U.Vector Int32 -> Int -> Int -> Int
cmp a b i j = go 0 i j
               where
                 n = U.length a
                 m = U.length b
                 go !len !i !j
                   | (i<n) && (j<m) && ((unsafeIndex a i) == (unsafeIndex b j)) = go (len+1) (i+1) (j+1)
                   | otherwise = len

-- function to find previous y on diagonal k for furthest point 
findYP :: MVI1 s -> Int -> Int -> ST s (Int,Int)
findYP fp k offset = do
              let k0 = k+offset-1
                  k1 = k+offset+1
              y0 <- MU.unsafeRead fp k0 >>= \x -> return $ 1+x
              y1 <- MU.unsafeRead fp k1
              if y0 > y1 then return (k0,y0)
              else return (k1,y1)
{-#INLINE findYP #-}

gridWalk :: Vector Int32 -> Vector Int32 -> MVI1 s -> Int -> (Vector Int32 -> Vector Int32 -> Int -> Int -> Int) -> ST s ()
gridWalk a b fp !k cmp = {-#SCC gridWalk #-} do
   let !offset = 1+U.length a
   (!kp,!yp) <- {-#SCC findYP #-} findYP fp k offset                          
   let xp = yp-k
       len = {-#SCC cmp #-} cmp a b xp yp
       x = xp+len
       y = yp+len

   {-#SCC "updateFP" #-} MU.unsafeWrite fp (k+offset) y  
   return ()
{-#INLINE gridWalk #-}

-- The function below executes ct times, and updates furthest point as they are found during furthest point search
findSnakes :: Vector Int32 -> Vector Int32 -> MVI1 s ->  Int -> Int -> (Vector Int32 -> Vector Int32 -> Int -> Int -> Int) -> (Int -> Int -> Int) -> ST s ()
findSnakes a b fp !k !ct cmp op = {-#SCC findSnakes #-} U.forM_ (U.fromList [0..ct-1]) (\x -> gridWalk a b fp (op k x) cmp)
{-#INLINE findSnakes #-}

I added some cost centre annotations and profiled it with a particular LCS input for testing. Here is what I get:

  total time  =        2.39 secs   (2394 ticks @ 1000 us, 1 processor)
  total alloc = 4,612,756,880 bytes  (excludes profiling overheads)

COST CENTRE MODULE    %time %alloc

gridWalk    Main       67.5   52.7
findSnakes  Main       23.2   27.8
cmp         Main        4.2    0.0
findYP      Main        3.5   19.4
updateFP    Main        1.6    0.0


                                                         individual     inherited
COST CENTRE    MODULE                  no.     entries  %time %alloc   %time %alloc

MAIN           MAIN                     64           0    0.0    0.0   100.0  100.0
 main          Main                    129           0    0.0    0.0     0.0    0.0
 CAF           Main                    127           0    0.0    0.0   100.0  100.0
  findSnakes   Main                    141           0    0.0    0.0     0.0    0.0
  main         Main                    128           1    0.0    0.0   100.0  100.0
   findSnakes  Main                    138           0    0.0    0.0     0.0    0.0
    gridWalk   Main                    139           0    0.0    0.0     0.0    0.0
     cmp       Main                    140           0    0.0    0.0     0.0    0.0
   while       Main                    132        4001    0.1    0.0   100.0  100.0
    findSnakes Main                    133       12000   23.2   27.8    99.9   99.9
     gridWalk  Main                    134    16004000   67.5   52.7    76.7   72.2
      cmp      Main                    137    16004000    4.2    0.0     4.2    0.0
      updateFP Main                    136    16004000    1.6    0.0     1.6    0.0
      findYP   Main                    135    16004000    3.5   19.4     3.5   19.4
   newVI1      Main                    130           1    0.0    0.0     0.0    0.0
   newVI1.\   Main                    131        8004    0.0    0.0     0.0    0.0
 CAF           GHC.Conc.Signal         112           0    0.0    0.0     0.0    0.0
 CAF           GHC.IO.Encoding         104           0    0.0    0.0     0.0    0.0
 CAF           GHC.IO.Encoding.Iconv   102           0    0.0    0.0     0.0    0.0
 CAF           GHC.IO.Handle.FD         95           0    0.0    0.0     0.0    0.0

If I am interpreting the profiling output correctly (and assuming there is not too much distortion due to profiling), gridWalk takes most of the time, but the functions cmp and findYP, which do the heavy lifting inside gridWalk, seem to take very little time in the profiling report. So perhaps the bottleneck is in the forM_ wrapper that findSnakes uses to call gridWalk? The heap profile also looks quite clean: (heap profile image)

Reading the core, nothing really jumps out. I thought some values in the inner loops might be boxed, but I don't spot them in the core. I hope the performance problem comes from something simple that I have missed.

Update:

Following @DanielFischer's suggestion, I replaced the forM_ of Data.Vector.Unboxed with the one from Control.Monad in the findSnakes function, which improved the performance from 4x to 2.5x of the C version. The Haskell and C versions are now posted here if you want to try them out.

I am still digging through the core to see where the bottleneck is. gridWalk is the most frequently called function, and for it to perform well, lcsh should reduce the whileM_ loop to a nice iterative inner loop of a condition check and the inlined findSnakes code. I suspect that is not what happens to the whileM_ loop in the assembly, but since I am not very good at translating core and locating name-mangled GHC functions in assembly, I guess it is just a matter of patiently working at the problem until I figure it out. Meanwhile, if there are any pointers on performance fixes, they will be appreciated.

Another possibility I can think of is the overhead of heap checks on function calls. As seen in the profiling report, gridWalk is called 16004000 times. Assuming 6 cycles per heap check (I guess it is less, but let us assume that anyway), on a 3.33GHz box those 96024000 cycles come to roughly ~0.02 secs.

Some more performance numbers:

Haskell code (GHC 7.6.1 x86_64): ~0.25 secs before the forM_ fix.

 time ./T
1

real    0m0.150s
user    0m0.145s
sys     0m0.003s

C code (gcc 4.7.2 x86_64):

time ./test
1

real    0m0.065s
user    0m0.063s
sys     0m0.000s

Update 2:

The updated code is here. Using STUArray does not change the numbers either. The performance is about 1.5x of C on Mac OS X (x86_64, ghc 7.6.1), very similar to what @DanielFischer reported on Linux.

Haskell code:

$ time ./Diff
1

real    0m0.087s
user    0m0.084s
sys 0m0.003s

C code:

$ time ./test
1

real    0m0.056s
user    0m0.053s
sys 0m0.002s

Glancing at the cmm, the call is tail-recursive and is turned into a loop by llvm. But each new iteration seems to allocate new values, which triggers the heap check, and that might explain the difference in performance. I have to think about how to write the tail recursion in such a way that no values are allocated across iterations, thereby avoiding the heap-check and allocation overhead.

Dan*_*her 10

You take a huge hit with

U.forM_ (U.fromList [0..ct-1])

in findSnakes. I am convinced that is not supposed to happen (ticket?), but it allocates a new Vector to traverse every time findSnakes is called. If you use

Control.Monad.forM_ [0 .. ct-1]

instead, the running time roughly halves, and the allocation drops by a factor of about 500 here. (GHC optimises C.M.forM_ [0 :: Int .. limit] well; the list is eliminated and what remains is essentially a loop.) You can do slightly better by writing the loop yourself.
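
For concreteness, here is a sketch of findSnakes with that change, keeping the question's signature and only swapping where forM_ comes from:

import qualified Control.Monad as M

-- Iterate over a plain Int list instead of building a fresh unboxed
-- Vector on every call; GHC eliminates the list, leaving essentially a loop.
findSnakes :: Vector Int32 -> Vector Int32 -> MVI1 s -> Int -> Int
           -> (Vector Int32 -> Vector Int32 -> Int -> Int -> Int)
           -> (Int -> Int -> Int) -> ST s ()
findSnakes a b fp !k !ct cmp op =
    M.forM_ [0 .. ct-1] (\x -> gridWalk a b fp (op k x) cmp)
{-# INLINE findSnakes #-}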

Some things that cause needless allocation and code-size bloat without hurting performance much are

  • the unused Bool argument of lcsh
  • the cmp arguments of findSnakes and gridWalk; if these are never called with a comparison other than the top-level cmp, those arguments only cause needless code duplication.
  • the general type of while; specialising it to the type at which it is used, ST s Bool -> ST s () -> ST s (), reduces allocation (considerably) and also the running time (slightly, but noticeably, here) - see the sketch after this list.
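
A minimal sketch of that specialisation, assuming the while in the linked code is the usual monadic while-loop, just with a polymorphic Monad type:

import Control.Monad (when)
import Control.Monad.ST (ST)

-- Pinning `while` to ST s avoids passing a Monad dictionary around;
-- this is only a sketch of what the specialised definition could look like.
while :: ST s Bool -> ST s () -> ST s ()
while cond body = loop
  where
    loop = do
      c <- cond
      when c (body >> loop)

(No INLINE pragma on it: as noted at the end of this answer for whileM_, inlining the loop combinator can actually make things slower.)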

A general remark on profiling: compiling a program for profiling inhibits many optimisations. Especially for libraries like vector, bytestring or text that rely heavily on fusion, profiling often produces misleading results.

For example, your original code here produces

    total time  =        3.42 secs   (3415 ticks @ 1000 us, 1 processor)
    total alloc = 4,612,756,880 bytes  (excludes profiling overheads)

COST CENTRE MODULE    %time %alloc  ticks     bytes

gridWalk    Main       63.7   52.7   2176 2432608000
findSnakes  Main       20.0   27.8    682 1281440080
cmp         Main        9.2    0.0    313        16
findYP      Main        4.2   19.4    144 896224000
updateFP    Main        2.7    0.0     91         0

Simply adding a bang on the binding of len in gridWalk changes nothing in the non-profiling version, but for the profiling version

    total time  =        2.98 secs   (2985 ticks @ 1000 us, 1 processor)
    total alloc = 3,204,404,880 bytes  (excludes profiling overheads)

COST CENTRE MODULE    %time %alloc  ticks     bytes

gridWalk    Main       63.0   32.0   1881 1024256000
findSnakes  Main       22.2   40.0    663 1281440080
cmp         Main        7.2    0.0    214        16
findYP      Main        4.7   28.0    140 896224000
updateFP    Main        2.7    0.0     82         0

it makes quite a difference. For the version that includes the changes listed above (and the bang on len in gridWalk), the profiling version says

total alloc = 1,923,412,776 bytes  (excludes profiling overheads)

but the non-profiling version

     1,814,424 bytes allocated in the heap
        10,808 bytes copied during GC
        49,064 bytes maximum residency (2 sample(s))
        25,912 bytes maximum slop
             1 MB total memory in use (0 MB lost due to fragmentation)

                                  Tot time (elapsed)  Avg pause  Max pause
Gen  0         2 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

INIT    time    0.00s  (  0.00s elapsed)
MUT     time    0.12s  (  0.12s elapsed)
GC      time    0.00s  (  0.00s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time    0.12s  (  0.12s elapsed)

says it allocated 1000-fold less than the profiling version.

For vector and friends code, more reliable for identifying bottlenecks than profiling (unfortunately also much much more time-consuming and difficult) is studying the generated core (or assembly, if you are proficient in reading that).
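
For example, one way to get the core and assembly dumped to files, assuming the source file is named hsmiller.hs and compiled directly with GHC:

ghc -O2 -ddump-simpl -ddump-asm -dsuppress-all -ddump-to-file hsmiller.hs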


Concerning the update, the C runs a little slower on my box (gcc-4.7.2, -O3)

$ time ./miltest1

real    0m0.074s
user    0m0.073s
sys     0m0.001s

but the Haskell about the same

$ time ./hsmiller
1

real    0m0.151s
user    0m0.149s
sys     0m0.001s

That is a little faster when compiling via the LLVM backend:

$ time ./hsmiller1

real    0m0.131s
user    0m0.129s
sys     0m0.001s

And when we replace the forM_ with a manual loop,

findSnakes a b fp !k !ct op = go 0
  where
    go x
        | x < ct    = gridWalk a b fp (op k x) >> go (x+1)
        | otherwise = return ()

it gets a bit faster,

$ time ./hsmiller
1

real    0m0.124s
user    0m0.121s
sys     0m0.002s

resp. via LLVM:

$ time ./hsmiller
1

real    0m0.108s
user    0m0.107s
sys     0m0.000s

By and large, the generated core looks fine; one small annoyance was

Main.$wa
  :: forall s.
     GHC.Prim.Int#
     -> GHC.Types.Int
     -> GHC.Prim.State# s
     -> (# GHC.Prim.State# s, Main.MVI1 s #)

and a slightly roundabout implementation. That is fixed by making newVI1 strict in its second argument,

newVI1 n !x = do

Since that isn't called often, the effect on performance is of course negligible.
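
For context, here is a hypothetical reconstruction of newVI1 (the real definition is in the linked code), under the assumption that it allocates the furthest-point vector of length n with every slot initialised to x:

-- Allocate a mutable Int vector of length n and fill it with x;
-- the bang on x is the strictness fix described above.
newVI1 :: Int -> Int -> ST s (MVI1 s)
newVI1 n !x = do
  v <- MU.new n
  mapM_ (\i -> MU.unsafeWrite v i x) [0 .. n-1]
  return v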

The meat is the core for lcsh, and that doesn't look too bad. The only boxed things in that are the Ints read from /written to the STRef, and that is inevitable. What's not so pleasant is that the core contains a lot of code duplication, but in my experience, that rarely is a real performance problem, and not all duplicated code survives the code generation.

and for it to perform well, lcsh should reduce whileM_ loop to a nice iterative inner loop of condition check and inlined findSnakes code.

You get an inner loop when you add an INLINE pragma to whileM_, but that loop is not nice, and in this case it is much slower than keeping whileM_ out of line (I am not sure whether that is solely due to code size, but it could be).