加速此功能有哪些可能性?

Mai*_*tor 6 algorithm tree performance haskell

我想加快以下功能:

{-# LANGUAGE BangPatterns #-}

import Data.Word
import Data.Bits
import Data.List (foldl1')
import System.Random
import qualified Data.List as L

data Tree a = AB (Tree a) (Tree a) | A (Tree a) | B (Tree a) | C !a
    deriving Show

merge :: Tree a -> Tree a -> Tree a
merge (C x) _               = C x
merge _ (C y)               = C y
merge (A ta) (A tb)         = A (merge ta tb)
merge (A ta) (B tb)         = AB ta tb
merge (A ta) (AB tb tc)     = AB (merge ta tb) tc
merge (B ta) (A tb)         = AB tb ta
merge (B ta) (B tb)         = B (merge ta tb)
merge (B ta) (AB tb tc)     = AB tb (merge ta tc)
merge (AB ta tb) (A tc)     = AB (merge ta tc) tb
merge (AB ta tb) (B tc)     = AB ta (merge tb tc)
merge (AB ta tb) (AB tc td) = AB (merge ta tc) (merge tb td)
Run Code Online (Sandbox Code Playgroud)

为了强调其性能,我使用merge以下方法实现了排序:

fold ab a b c list = go list where
    go (AB a' b') = ab (go a') (go b')
    go (A a')     = a (go a')
    go (B b')     = b (go b')
    go (C x)      = c x

mergeAll :: [Tree a] -> Tree a
mergeAll = foldl1' merge

foldrBits :: (Word32 -> t -> t) -> t -> Word32 -> t
foldrBits cons nil word = go 32 word nil where
    go 0 w !r = r
    go l w !r = go (l-1) (shiftR w 1) (cons (w.&.1) r)

word32ToTree :: Word32 -> Tree Word32
word32ToTree w = foldrBits cons (C w) w where
    cons 0 t = A t
    cons 1 t = B t

toList = fold (++) id id (\ a -> [a])

sort = toList . mergeAll . map word32ToTree

main = do
    is <- mapM (const randomIO :: a -> IO Word32) [0..500000]
    print $ sum $ sort is
Run Code Online (Sandbox Code Playgroud)

性能提出了从去相当不错,2.5倍左右慢Data.Listsort.我并进一步加快,高达,虽然没有什么:内联几个功能,砰在很多地方注释,UNPACKC !a.有没有办法加快这个功能?

Yur*_*ras 8

你肯定分配了太多的thunk.我将展示如何分析代码:

merge (A ta) (A tb)         = A (merge ta tb)
Run Code Online (Sandbox Code Playgroud)

在这里你A用一个参数分配构造函数,这是一个thunk.你能说出什么时候merge ta tb会被迫?可能只在最后,当使用结果树时.尝试为每个Tree构造函数的每个参数添加一个爆炸,以确保它是严格的脊柱:

data Tree a = AB !(Tree a) !(Tree a) | A !(Tree a) | B !(Tree a) | C !a
Run Code Online (Sandbox Code Playgroud)

下一个例子:

go l w !r = go (l-1) (shiftR w 1) (cons (w.&.1) r)
Run Code Online (Sandbox Code Playgroud)

在这里你要分配一个thunk l-1,shifrR w 1cons (w.&.1) r.第一个将在下一个迭代进行比较时,被强制lo,第二个将迫使在下次迭代(三维形实转换时被迫w在此使用的),和第三形实转换被迫在下一迭代上,因为爆炸的r.所以可能这个特殊的条款还可以.

  • @Viclib我不知道任何有关脊柱严格与脊柱懒惰结构的正式研究.但我认为这主要取决于使用模式.你可以修复`merge`来强制脊椎,而不是严格限制树脊. (2认同)