在 Scheme (R6RS) 中表示代数数据类型构造函数的惯用方法是什么?

Mai*_*tor 8 scheme haskell idioms compilation algebraic-data-types

我想知道将类似 Haskell 的数据类型转换为 Scheme 的最佳方法是什么。我目前的计划是用vectors表示构造函数,第一个元素是label表示变体的 a 。因此,例如,以下 Haskell 程序:

data Bits       = O Bits | I Bits | E deriving Show
data Nat        = S Nat  | Z          deriving Show
inc (O pred)    = I pred
inc (I pred)    = O (inc pred)
inc E           = E
dup (S pred)    = let (x,y) = dup pred in (S x, S y)
dup Z           = (Z, Z)
bus Z        bs = inc bs
bus (S pred) bs = let (x,y) = (pred,pred) in (bus pred (bus pred bs))
o32             = (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O E))))))))))))))))))))))))))))))))
n26             = (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S Z))))))))))))))))))))))))))
main            = print (bus n26 o32)
Run Code Online (Sandbox Code Playgroud)

会被翻译成:

(define (O pred)   (vector 'O pred))
(define (I pred)   (vector 'I pred))
(define E          (vector 'E))
(define (S pred)   (vector 'S pred))
(define Z          (vector 'Z))
(define (Inc bits) (case (vector-ref bits 0) ('O (I (vector-ref bits 1))) ('I (O (Inc (vector-ref bits 1)))) ('E E)))
(define (Dup val)  (case (vector-ref val 0) ('S (let ((xy (Dup (vector-ref val 1)))) (cons (S (car xy)) (S (cdr xy))))) ('Z (cons Z Z))))
(define (Bus n bs) (case (vector-ref n 0) ('Z (Inc bs)) ('S (let ((xy (Dup (vector-ref n 1)))) (Bus (car xy) (Bus (cdr xy) bs))))))
(define O32        (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O (O E)))))))))))))))))))))))))))))))))
(define N26        (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S (S Z)))))))))))))))))))))))))))
(display (Bus N26 O32))
Run Code Online (Sandbox Code Playgroud)

令我惊讶的是,这实际上表现得很好(这里的Scheme 比Haskell 快)。但我想知道这是否是最好的方法?这是合理的,还是有一些更“惯用”的翻译,预计效果会更好?

Jon*_*rdy 4

概括地说,我认为你有两种方法: \xe2\x80\x9cpositive\xe2\x80\x9d 编码,就像你\xe2\x80\x99ve在这里所做的那样,其中你有向量表示包含(动态)标签的产品来区分总和,以及a \xe2\x80\x9cnegative\xe2\x80\x9d 编码(B\xc3\xb6hm\xe2\x80\x93Berarducci / Visitor),其中您通过使用它们的递归方案来表示数据类型。

\n

Haskell 中 BB 编码版本的示例:

\n
{-# Language RankNTypes #-}\n\n-- The type of /folds/ that consume bits.\nnewtype Bits = Bits\n  { matchBits\n    :: forall r.  -- Caller-specified result type.\n       (r -> r)   -- O: 1 recursive field\n    -> (r -> r)   -- I: 1 recursive field\n    -> r          -- E: no fields; could be \xe2\x80\x98() -> r\xe2\x80\x99\n    -> r\n  }\n\n-- The type of one /recursive unrolling/ of bits.\nnewtype Bits\' = Bits\'\n  { matchBits\'\n    :: forall r.    -- Also any result type.\n       (Bits -> r)  -- But note! \xe2\x80\x98Bits\xe2\x80\x99 instead of \xe2\x80\x98r\xe2\x80\x99.\n    -> (Bits -> r)\n    -> r\n    -> r\n  }\n\n-- Basic constructors retain their types.\nmkI, mkO :: Bits -> Bits\nmkE :: Bits\n\nmkI\', mkO\' :: Bits\' -> Bits\'\nmkE\' :: Bits\'\n\n-- Constructor functions perform visitor dispatch.\n-- This is where the recursion happens in \xe2\x80\x98matchBits\xe2\x80\x99.\nmkO pred = Bits $ \\  o  i e -> o (matchBits pred o i e)\nmkI pred = Bits $ \\  o  i e -> i (matchBits pred o i e)\nmkE      = Bits $ \\ _o _i e -> e\n\n-- General recursive dispatch is similar.\nmkO\' pred = Bits\' $ \\  o  i e -> o (matchBits\' pred mkO mkI mkE)\nmkI\' pred = Bits\' $ \\  o  i e -> i (matchBits\' pred mkO mkI mkE)\nmkE\'      = Bits\' $ \\ _o _i e -> e\n\n-- Recursive deconstruction, used below.\nrecurBits :: Bits -> Bits\'\nrecurBits bits = matchBits bits mkO\' mkI\' mkE\'\n\n-- We only need a fold for nats here.\nnewtype Nat = Nat\n  { matchNat\n    :: forall r.  -- Result type.\n       (r -> r)   -- S: 1 recursive field\n    -> r          -- Z: no fields; also could be \xe2\x80\x98() -> r\xe2\x80\x99\n    -> r\n  }\n\nmkS :: Nat -> Nat\nmkZ :: Nat\n\nmkS pred = Nat $ \\  s z -> s (matchNat pred s z)\nmkZ      = Nat $ \\ _s z -> z\n\n-- Case branches with \xe2\x80\x98matchBits\xe2\x80\x99 receive the /result/\n-- of the recursive call on a recursive field. So this\n-- is /not/ what we want:\n--\n-- > inc bits = matchBits bits mkI (mkO . inc) mkE\n--\n-- Instead, we want the field itself, so we must use\n-- the recursive \xe2\x80\x98matchBits\'\xe2\x80\x99.\ninc :: Bits -> Bits\ninc bits = matchBits\' (recurBits bits) mkI (mkO . inc) mkE\n\n-- Or: \xe2\x80\x98dup nat = matchNat nat (mkS *** mkS) (mkZ, mkZ)\xe2\x80\x99\n-- Or: \xe2\x80\x98dup nat = (nat, nat)\xe2\x80\x99 = \xe2\x80\x98dup = join (,)\xe2\x80\x99\ndup :: Nat -> (Nat, Nat)\ndup nat = matchNat nat\n  (\\ (x, y) -> (mkS x, mkS y))  -- S\n  (mkZ, mkZ)                    -- Z\n\n-- NB: think of as \xe2\x80\x98Nat -> (Bits -> Bits)\xe2\x80\x99.\nbus :: Nat -> Bits -> Bits\nbus n = matchNat n\n  (\\ f -> f . f)  -- S\n  inc             -- Z\n
Run Code Online (Sandbox Code Playgroud)\n

你可以或多或少地将其直接转化为Scheme。这里\xe2\x80\x99是一个未经测试且可能不正确的翻译草图,只是为了说明一个起点:

\n
(define (O pred)  (lambda (o i e) (o (pred o i e))))\n(define (I pred)  (lambda (o i e) (i (pred o i e))))\n(define E         (lambda (o i e) (e)))\n\n(define (O_ pred) (lambda (o i e) (o (pred O I E))))\n(define (I_ pred) (lambda (o i e) (i (pred O I E))))\n(define E_        (lambda (o i e) (e)))\n\n(define (S pred)  (lambda (s z) (s (pred s z))))\n(define Z         (lambda (s z) (z)))\n\n(define (recurBits bits) (bits O_ I_ E_))\n\n(define (Inc bits)\n  ((recurBits bits)\n     I\n     (lambda (pred) (O (Inc pred)))\n     E))\n\n(define (Dup val)\n  (val (lambda (p)\n         (let ((x (car p))\n               (y (cdr p)))\n           (cons (S x) (S y))))\n       (cons Z Z)))\n\n(define (Bus n bs)\n  ((Bus_ n) bs))\n(define (Bus_ n)\n  (n (lambda (pred) (lambda (bs) (pred (pred bs))))\n     inc))\n
Run Code Online (Sandbox Code Playgroud)\n

您可能需要在几个地方添加一些显式参数或附加 lambda 来处理部分应用程序和惰性求值中的差异,例如Bus_.

\n

但总的来说,我\xe2\x80\x99d 希望这种方法在许多应用程序中具有相当或更好的性能特征。它依赖于闭包而不是向量,这可能会编译得更好,因为语言实现更了解它们的结构。它不是动态地分派值,而是在函数中进行选择以进行(尾)调用,从而避免了某些值的构造。

\n

我还发现Oleg Kiselyov\xe2\x80\x99s 关于 BB 编码的笔记在学习 Haskell 中的这项技术时是一个有用的资源。

\n