如何加速我的Haskell程序(达到Python的水平)

Rad*_*ado 15 haskell

我有以下玩具程序循环移动矢量并将其添加到自身(在mod下).它针对不同的移位和大量迭代(与向量的大小相比)来做到这一点.程序有效,但它的狗很慢.我还在学习Haskell,所以我的问题是:我做错了吗?

import Data.List (foldl')
import qualified Data.Sequence as Seq
import Data.Sequence (index, zipWith, Seq, (><), (<|), (|>))

seqSize = 100
numShifts = 10000 

cycleShift :: Integer -> Seq a -> Seq a
cycleShift s l = Seq.drop (fromInteger s) l >< Seq.take (fromInteger s) l

modAdd :: Seq Integer -> Seq Integer -> Seq Integer 
modAdd s t = Seq.zipWith (\ a b -> (a + b) `mod` 10^16) s t

step :: Seq Integer -> Integer -> Seq Integer
step l shift = modAdd l (cycleShift shift l)

allshifts = [i `mod` seqSize |i <- [1..numShifts]]
start = Seq.fromList (1 : [0 | i <- [1..(seqSize - 1)]])
end = foldl' step start allshifts

main :: IO ()
main = print (Seq.index end 0)
Run Code Online (Sandbox Code Playgroud)

Python中的相同程序

seq_size = 100 
num_shifts = 10000

S = [i % seq_size for i in xrange(1, num_shifts + 1)]
ssums = [1] + [0 for i in range(seq_size - 1)]

for s in S: 
    shift = ssums[s:] + ssums[:s]  
    ssums = [(ssums[i] + shift[i]) % 10**16 for i in range(seq_size)]  

print ssums[0]
Run Code Online (Sandbox Code Playgroud)

这是时间安排.Haskell:真正的0m5.596s Python:真正的0m0.551s

Python并不以它的速度而闻名,但却快了x10倍?!?

por*_*ges 20

你是怎么跑的?

我为Haskell版本获得1.6秒.(编译ghc.exe -O2 seq.hs.)

另外,你有没有理由使用Seq?如果我将其更改为使用列表,我将获得0.3秒的执行时间.

这是列表:

import Data.List (foldl')

seqSize = 100
numShifts = 10000 

cycleShift s l = drop (fromInteger s) l ++ take (fromInteger s) l

modAdd s t = zipWith (\ a b -> (a + b) `mod` 10^16) s t

step l shift = modAdd l (cycleShift shift l)

allshifts = [i `mod` seqSize |i <- [1..numShifts]]
start = (1 : [0 | i <- [1..(seqSize - 1)]])
end = foldl' step start allshifts

main :: IO ()
main = print (end !! 0)
Run Code Online (Sandbox Code Playgroud)


fuz*_*fuz 18

  1. 使用普通列表.它们经过了大量优化.使用Data.Vector速度更快.
  2. rem而不是mod
  3. 避免不必要的工作 (参见cycleShift.之前,你将列表拆分了两次)
  4. 如果您的计算不超过界限,请使用Int而不是Integer.前者是硬件int,后者是任意精度,但是通过软件模拟.

结果:3.6秒至0.5秒.更多可能是可能的.

码:

import Data.List (foldl')
import Data.Tuple

seqSize, numShifts :: Int
seqSize = 100

numShifts = 10000 

cycleShift :: Int -> [a] -> [a]
cycleShift s = uncurry (++) . swap . splitAt s

modAdd :: [Int] -> [Int] -> [Int]
modAdd = zipWith (\ a b -> (a + b) `rem` 10^16)

step :: [Int] -> Int -> [Int]
step l shift = modAdd l (cycleShift shift l)

allshifts = map (`rem` seqSize) [1..numShifts]
start = 1 : replicate (seqSize - 1) 0
end = foldl' step start allshifts

main :: IO ()
main = print (head end)
Run Code Online (Sandbox Code Playgroud)

编辑

使用它会变得更快Data.Vector.我使用此代码在我的机器上大约0.4秒:

import Data.List (foldl')
import Data.Tuple

import Data.Vector (Vector)
import qualified Data.Vector as V

seqSize, numShifts :: Int
seqSize = 100

numShifts = 10000 

cycleShift :: Int -> Vector a -> Vector a
cycleShift s = uncurry (V.++) . swap . V.splitAt s

modAdd :: Vector Int -> Vector Int -> Vector Int
modAdd = V.zipWith (\ a b -> (a + b) `rem` 10^16)

step :: Vector Int -> Int -> Vector Int
step l shift = modAdd l (cycleShift shift l)

allshifts = map (`rem` seqSize) [1..numShifts]
start = 1 `V.cons` V.replicate (seqSize - 1) 0
end = foldl' step start allshifts

main :: IO ()
main = print (V.head end)
Run Code Online (Sandbox Code Playgroud)

编辑2

使用Data.Vector.Unboxed(只需更改导入并修复签名),运行时下降到0.074秒.但结果是正确的,如果Int有64位.它也可能是快速使用Int64.