Using list generators for memory-efficient code in Haskell

Hel*_*lix 4 performance haskell list

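I want to get a handle on writing memory-efficient Haskell code. One thing I ran into is that there is no simple way (that I could find) to make Python-style list generators/iterators.

A small example:

Find the sum of the integers from 1 to 100000000 without using the closed-form formula.

Python can do this quickly and with minimal memory use via sum(xrange(100000000)). In Haskell, the analogue would be sum [1..100000000]. However, this uses a lot of memory. I thought that using foldl or foldr would be fine, but even that uses a lot of memory and is slower than Python. Any suggestions?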

eps*_*lbe 6

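TL;DR: I think the culprit in this case is GHC defaulting to Integer.

Admittedly I know little about Python, but my first guess is that Python only switches to "bigint" when necessary, so all of its calculations here are done with the equivalent of Int, which is a 64-bit integer on my machine.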
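To illustrate what defaulting means here, the following is a minimal sketch (the names defaulted and wrapped are made up for this example). It assumes a 64-bit Int, as on my machine; an unannotated numeric expression is defaulted to Integer, while the annotated one wraps around.

```haskell
-- No annotation on the result: the type is ambiguous (Num a, Show a),
-- so GHC's defaulting rules pick Integer, which cannot overflow.
defaulted :: String
defaulted = show (2 ^ (64 :: Int))

-- Explicit Int: assuming a 64-bit Int, 2^64 wraps around to 0.
wrapped :: Int
wrapped = 2 ^ (64 :: Int)

main :: IO ()
main = do
  putStrLn defaulted
  print wrapped
```

The same defaulting happens to the literals in sum [1..100000000] when nothing else fixes their type.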

A first check

$> ghci
GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
Prelude> maxBound :: Int
9223372036854775807

reveals that the result of the sum (5000000050000000) is smaller than that number, so we do not have to worry about Int overflow.
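The same range check can be written down as a small program. This is only a sketch: expected and fitsInInt are names invented here, and the closed-form formula is used just to verify the range, not as the answer to the question. It assumes a 64-bit Int.

```haskell
-- Closed-form value of 1 + 2 + ... + 100000000, computed as an Integer
-- so the check itself cannot overflow.
expected :: Integer
expected = 100000000 * (100000000 + 1) `div` 2

-- True when the result is representable as an Int on this platform.
fitsInInt :: Bool
fitsInInt = expected <= toInteger (maxBound :: Int)

main :: IO ()
main = do
  print expected
  print fitsInInt
```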

I guess your example programs look roughly like this

sum.py

print(sum(xrange(100000000)))

sum.hs

main :: IO ()
main = print $ sum [1..100000000]

To make things explicit, I added a type annotation (100000000 :: Integer) and compiled it with

$ > stack build --ghc-options="-O2 -with-rtsopts=-sstderr"

and ran your example,

$ > stack exec -- time sum
5000000050000000
   3,200,051,872 bytes allocated in the heap
         208,896 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6102 colls,     0 par    0.013s   0.012s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    1.725s  (  1.724s elapsed)
  GC      time    0.013s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.739s  (  1.736s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,855,603,449 bytes per MUT second

  Productivity  99.3% of total user, 99.4% of total elapsed

1.72user 0.00system 0:01.73elapsed 99%CPU (0avgtext+0avgdata 4112maxresident)k

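This indeed reproduces the roughly 3 GB figure: about 3.2 GB is allocated on the heap over the whole run, although the maximum residency stays at around 44 KB.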

Changing the annotation to (100000000 :: Int) altered the behaviour drastically

$ > stack build
$ > stack exec -- time sum
5000000050000000
          51,872 bytes allocated in the heap
           3,408 bytes copied during GC
          44,312 bytes maximum residency (1 sample(s))
          17,128 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.000s   0.000s     0.0001s    0.0001s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.034s  (  0.034s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.036s  (  0.035s elapsed)

  %GC     time       0.2%  (0.2% elapsed)

  Alloc rate    1,514,680 bytes per MUT second

  Productivity  99.4% of total user, 102.3% of total elapsed

0.03user 0.00system 0:00.03elapsed 91%CPU (0avgtext+0avgdata 3496maxresident)k
0inputs+0outputs (0major+176minor)pagefaults 0swaps
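For reference, the two variants compared above can be put side by side in one file. This is a sketch rather than the exact program I measured: the function names are invented here, the upper bound is reduced so it runs quickly even without -O2, and sumStrict is an extra variant using foldl' from Data.List, which forces the accumulator at every step.

```haskell
import Data.List (foldl')

-- Arbitrary-precision variant: every element and partial sum is an Integer.
sumInteger :: Integer -> Integer
sumInteger n = sum [1 .. n]

-- Machine-word variant: with -O2 GHC can usually compile this to a tight loop.
sumInt :: Int -> Int
sumInt n = sum [1 .. n]

-- Explicit strict left fold; the accumulator never builds up thunks.
sumStrict :: Int -> Int
sumStrict n = foldl' (+) 0 [1 .. n]

main :: IO ()
main = do
  let n = 1000000 -- smaller than in the measurements above
  print (sumInteger (toInteger n))
  print (sumInt n)
  print (sumStrict n)
```

All three print the same number; the difference only shows up in the allocation statistics when run with +RTS -s.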

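For the interested

The behaviour of the Haskell version does not change much if you use libraries such as conduit or vector (both boxed and unboxed).

Example programs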

sumC.hs

import Data.Conduit
-- qualified, so that names such as map do not clash with the Prelude
import qualified Data.Conduit.List as CL

main :: IO ()
main = do res <- CL.enumFromTo 1 100000000 $$ CL.fold (+) (0 :: Int)
          print res

sumV.hs

import qualified Data.Vector.Unboxed as V
-- boxed alternative:
-- import qualified Data.Vector as V

main :: IO ()
main = print $ V.sum $ V.enumFromTo (1::Int) 100000000

Interestingly, the version using

main = print $ V.sum $ V.enumFromN (1::Int) 100000000

performs worse than the one above, even though the documentation says otherwise.

enumFromN :: (Unbox a, Num a) => a -> Int -> Vector a

O(n) Yield a vector of the given length containing the values x, x+1 etc. This operation is usually more efficient than enumFromTo.

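Update

@Carsten's comment made me curious, so I had a look at the source of Integer, integer-simple to be precise, because there are other implementations of Integer, integer-gmp and integer-gmp2, which use libgmp.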

data Integer = Positive !Positive | Negative !Positive | Naught

-------------------------------------------------------------------
-- The hard work is done on positive numbers

-- Least significant bit is first

-- Positive's have the property that they contain at least one Bit,
-- and their last Bit is One.
type Positive = Digits
type Positives = List Positive

data Digits = Some !Digit !Digits
            | None
type Digit = Word#

data List a = Nil | Cons a (List a)

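So when using Integer there is quite some memory overhead compared to Int, or rather the unboxed Int#, which I guess is what this gets optimized to (though I have not confirmed that).

So an Integer is (if I calculate correctly)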

  • 1 x Word for the sum-type tag (here Positive)
  • n x (Word + Word) for the Some and Digit parts
  • 1 x Word for the final None

which gives a memory overhead of (2 + floor(log_10(n))) for each Integer in that calculation, plus a bit more for the accumulator.
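The word count from the list above can be turned into a small calculation. This is only a rough model, not a measurement: digitCount and wordsUsed are names invented here, each Digit is assumed to be one 64-bit machine word, and the single word charged for Naught is my own assumption, since the list above does not cover it.

```haskell
-- Number of Digit cells needed for a value, assuming 64-bit digits.
digitCount :: Integer -> Int
digitCount n = length (takeWhile (> 0) (iterate (`div` base) (abs n)))
  where
    base = 2 ^ (64 :: Int)

-- Words used under the model above:
-- 1 tag + n * (Some + Digit) + 1 closing None.
wordsUsed :: Integer -> Int
wordsUsed 0 = 1 -- Naught: only the tag (assumption)
wordsUsed n = 1 + 2 * digitCount n + 1

main :: IO ()
main = do
  print (wordsUsed 5000000050000000)
  print (wordsUsed (2 ^ (64 :: Int)))
```

Under these assumptions every value in the sum fits into a single digit, so each Integer costs a handful of words where an unboxed Int# costs one.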