为什么`data`导致无限循环而`newtype`不是

Question

为什么`data`导致无限循环而`newtype`不是

我正在学习带有箭头Arrow的教程编程。我已经根据论文输入了以下代码，除了的SF定义是由data，而不是论文中所定义的newtype（实际上，由于我是从内存中键入代码的，所以我偶然地做了此更改）：

import Control.Category
import Control.Arrow
import Prelude hiding (id, (.))

data SF a b = SF { runSF :: [a] -> [b] }  -- this is the change, using data instead of newtype as in the paper 

-- The folowing code is the same as in the paper
instance Category SF where
  id = SF $ \x -> x
  (SF f) . (SF g) = SF $ \x -> f (g x)

instance Arrow SF where
  arr f = SF $ map f
  first (SF f) = SF $ unzip >>> first f >>> uncurry zip

instance ArrowChoice SF where
  left (SF f) = SF $ \xs -> combine xs (f [y | Left y <- xs])
    where
      combine (Left _ : ys) (z:zs) = Left z : combine ys zs
      combine (Right y : ys) zs = Right y : combine ys zs
      combine [] _ = []

delay :: a -> SF a a
delay x = SF $ init . (x:)

mapA :: ArrowChoice a => a b c -> a [b] [c]
mapA f = arr listcase >>>
         arr (const []) ||| (f *** mapA f >>> arr (uncurry (:)))

listcase :: [a] -> Either () (a, [a])
listcase [] = Left ()
listcase (x:xs) = Right (x, xs)

Run Code Online (Sandbox Code Playgroud)

当我加载文件ghci并执行时runSF (mapA (delay 0)) [[1,2,3],[4,5,6]]，它会触发无限循环并最终耗尽内存。如果我改data回newtype，一切正常。ghc 8.0.2、8.2.2和8.6.3也会发生相同的问题。

即使我将代码编译成可执行文件，也存在相同的问题。

我已经想到的差之间data和newtype，只有一个字段定义数据结构时，是运行时成本。但是这个问题似乎暗示着它们之间的更多区别。也许有些关于Arrow类型类的事情我还没有注意到。

谁能有任何想法？非常感谢！

Answer 1

luq*_*qui 12

Let's look at this example.

data A = A [Int]
    deriving (Show)

cons :: Int -> A -> A
cons x (A xs) = A (x:xs)

ones :: A
ones = cons 1 ones

Run Code Online (Sandbox Code Playgroud)

We would expect that ones should be A [1,1,1,1...], because all we have done is wrap a list in a data constructor. But we would be wrong. Recall that pattern matches are strict for data constructors. That is, cons 1 undefined = undefined rather than A (1 : undefined). So when we try to evaluate ones, cons pattern matches on its second argument, which causes us to evaluate ones... we have a problem.

newtypes don't do this. At runtime newtype constructors are invisible, so it's as if we had written the equivalent program on plain lists

cons :: Int -> [Int] -> [Int]
cons x ys = x:ys

ones = cons 1 ones

Run Code Online (Sandbox Code Playgroud)

which is perfectly productive, since when we try to evaluate ones, there is a : constructor between us and the next evaluation of ones.

You can get back the newtype semantics by making your data constructor pattern matches lazy:

cons x ~(A xs) = A (x:xs)

Run Code Online (Sandbox Code Playgroud)

This is the problem with your code (I have run into this exact problem doing this exact thing). There are a few reasons data pattern matches are strict by default; the most compelling one I see is that pattern matching would otherwise be impossible if the type had more than one constructor. There is also a small runtime overhead to lazy pattern matching in order to fix some subtle GC leaks; details linked in the comments.

归档时间：	6 年，5 月前
查看次数：	345 次
最近记录：	6 年，5 月前