是否可以在Rust中表示高阶抽象语法?

Mai*_*tor 33 haskell functional-programming rust

在Haskell中,使用函数编写代数数据类型(ADT)非常容易.这允许我们编写依赖于本机函数进行替换的解释器,即高阶抽象语法(HOAS),其已知非常有效.例如,这是一个使用该技术的简单λ演算解释器:

data Term
  = Hol Term
  | Var Int
  | Lam (Term -> Term)
  | App Term Term

pretty :: Term -> String
pretty = go 0 where
  go lvl term = case term of
    Hol hol     -> go lvl hol 
    Var idx     -> "x" ++ show idx
    Lam bod     -> "?x" ++ show lvl ++ ". " ++ go (lvl+1) (bod (Hol (Var lvl)))
    App fun arg -> "(" ++ go lvl fun ++ " " ++ go lvl arg ++ ")"

reduce :: Term -> Term
reduce (Hol hol)     = hol
reduce (Var idx)     = Var idx
reduce (Lam bod)     = Lam (\v -> reduce (bod v))
reduce (App fun arg) = case reduce fun of
  Hol fhol      -> App (Hol fhol) (reduce arg)
  Var fidx      -> App (Var fidx) (reduce arg)
  Lam fbod      -> fbod (reduce arg)
  App ffun farg -> App (App ffun farg) (reduce arg)

main :: IO ()
main
  = putStrLn . pretty . reduce
  $ App
    (Lam$ \x -> App x x)
    (Lam$ \s -> Lam$ \z -> App s (App s (App s z)))
Run Code Online (Sandbox Code Playgroud)

注意如何使用本机函数而不是de Bruijn索引.这使得解释器比我们手动替换应用程序时要快得多.

我知道Rust有闭包和许多Fn()类型,但我不确定它们在这种情况下是否与Haskell闭包完全一样,更不用说如何表达该程序,因为Rust的低级特性.是否有可能在Rust中代表HOAS?如何Term表示数据类型?

lje*_*drz 26

作为lambda演算的粉丝,我决定尝试这个并且它确实可能,虽然比Haskell(游乐场链接)稍微不那么明显:

use std::rc::Rc;
use Term::*;

#[derive(Clone)]
enum Term {
    Hol(Box<Term>),
    Var(usize),
    Lam(Rc<dyn Fn(Term) -> Term>),
    App(Box<Term>, Box<Term>),
}

impl Term {
    fn app(t1: Term, t2: Term) -> Self {
        App(Box::new(t1), Box::new(t2))
    }

    fn lam<F: Fn(Term) -> Term + 'static>(f: F) -> Self {
        Lam(Rc::new(f))
    }

    fn hol(t: Term) -> Self {
        Hol(Box::new(t))
    }
}

fn pretty(term: Term) -> String {
    fn go(lvl: usize, term: Term) -> String {
        match term {
            Hol(hol) => go(lvl, *hol),
            Var(idx) => format!("x{}", idx),
            Lam(bod) => format!("?x{}. {}", lvl, go(lvl + 1, bod(Term::hol(Var(lvl))))),
            App(fun, arg) => format!("({} {})", go(lvl, *fun), go(lvl, *arg)),
        }
    }

    go(0, term)
}

fn reduce(term: Term) -> Term {
    match term {
        Hol(hol) => *hol,
        Var(idx) => Var(idx),
        Lam(bod) => Term::lam(move |v| reduce(bod(v))),
        App(fun, arg) => match reduce(*fun) {
            Hol(fhol) => Term::app(Hol(fhol), reduce(*arg)),
            Var(fidx) => Term::app(Var(fidx), reduce(*arg)),
            Lam(fbod) => fbod(reduce(*arg)),
            App(ffun, farg) => Term::app(Term::app(*ffun, *farg), reduce(*arg)),
        },
    }
}

fn main() {
    // (?x. x x) (?s. ?z. s (s (s z)))
    let term1 = Term::app(
        Term::lam(|x| Term::app(x.clone(), x.clone())), 
        Term::lam(|s| Term::lam(move |z| 
            Term::app(
                s.clone(),
                Term::app(
                    s.clone(),
                    Term::app(
                        s.clone(),
                        z.clone()
    ))))));

    // ?b. ?t. ?f. b t f
    let term2 = Term::lam(|b| Term::lam(move |t| 
        Term::lam({
            let b = b.clone(); // necessary to satisfy the borrow checker
            move |f| Term::app(Term::app(b.clone(), t.clone()), f)
        })
    ));

    println!("{}", pretty(reduce(term1))); // ?x0. ?x1. (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 x1)))))))))))))))))))))))))))
    println!("{}", pretty(reduce(term2))); // ?x0. ?x1. ?x2. ((x0 x1) x2)
}
Run Code Online (Sandbox Code Playgroud)

由于BurntSushi5的建议,使用Rc的是我总是忘了存在和Shepmaster的建议,删除不需要BoxRcLam如何满足较长借检查Lam链.

  • 您可以使用引用计数来满足`Clone`:http://play.rust-lang.org/?gist = 7419c2dcf3d599dd4d39af6a6e83c996&version=stable&mode=debug&edition=2015 ---我怀疑这段代码可以通过帮助清理一下一些方便的构造函数,而不是直接在任何地方使用值构造函数. (4认同)

Ear*_*ine 5

接受的解决方案使用Rc创建一个可克隆堆分配关闭.

从技术上讲,这不是必需的,因为不需要运行时引用计数.我们所需要的只是一个闭包作为特征对象,它也是可克隆的.

但是,Rust 1.29.2不允许我们这样的事情dyn Clone + FnOnce(Term) -> Term,这种限制将来可能会放松.该限制有两个因素:Clone不反对安全(这是不太可能放松),如果我们结合两种特质在一起,其中一人必须是一个自动特征(这可以放宽恕我直言)

在等待语言改进的同时,我们可以引入一个新的特性来解决这个问题:

// Combination of FnOnce(Term) -> Term and Clone
trait TermLam {
    // The FnOnce part, declared like an Fn, because we need object safety
    fn app(&self, t: Term) -> Term;
    // The Clone part, but we have to return sized objects 
    // (not Self either because of object safety), so it is in a box
    fn clone_box(&self) -> Box<dyn TermLam>;
}

// Blanket implementation for appropriate types
impl<F> TermLam for F
where
    F: 'static/*' highlighting fix */ + Clone + FnOnce(Term) -> Term
{
    // Note: when you have a Clone + FnOnce, you effectively have an Fn
    fn app(&self, t: Term) -> Term {
        (self.clone())(t)
    }

    fn clone_box(&self) -> Box<dyn TermLam> {
        Box::new(self.clone())
    }
}

// We can now clone the box
impl Clone for Box<dyn TermLam> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}
Run Code Online (Sandbox Code Playgroud)

然后我们可以删除使用的需要Rc.

#[derive(Clone)]
enum Term {
    Hol(Box<Term>),
    Var(usize),
    Lam(Box<dyn TermLam>),
    App(Box<Term>, Box<Term>),
}

impl Term {
    fn app(t1: Term, t2: Term) -> Self {
        App(Box::new(t1), Box::new(t2))
    }

    fn lam<F>(f: F) -> Self
    where
       F: 'static/*' highlighting fix */ + Clone + FnOnce(Term) -> Term        
    {
        Lam(Box::new(f))
    }

    fn hol(t: Term) -> Self {
        Hol(Box::new(t))
    }
}

fn pretty(term: Term) -> String {
    fn go(lvl: usize, term: Term) -> String {
        match term {
            Hol(hol) => go(lvl, *hol),
            Var(idx) => format!("x{}", idx),
            Lam(bod) => format!("?x{}. {}", lvl, go(lvl + 1, bod.app(Term::hol(Var(lvl))))),
            App(fun, arg) => format!("({} {})", go(lvl, *fun), go(lvl, *arg)),
        }
    }

    go(0, term)
}

fn reduce(term: Term) -> Term {
    match term {
        Hol(hol) => *hol,
        Var(idx) => Var(idx),
        Lam(bod) => Term::lam(move |v| reduce(bod.app(v))),
        App(fun, arg) => match reduce(*fun) {
            Hol(fhol) => Term::app(Hol(fhol), reduce(*arg)),
            Var(fidx) => Term::app(Var(fidx), reduce(*arg)),
            Lam(fbod) => fbod.app(reduce(*arg)),
            App(ffun, farg) => Term::app(Term::app(*ffun, *farg), reduce(*arg)),
        },
    }
}

fn main() {
    // (?x. x x) (?s. ?z. s (s (s z)))
    let term1 = Term::app(
        Term::lam(|x| Term::app(x.clone(), x.clone())),
        Term::lam(|s| {
            Term::lam(move |z| {
                Term::app(
                    s.clone(),
                    Term::app(s.clone(), Term::app(s.clone(), z.clone())),
                )
            })
        }),
    );

    // ?b. ?t. ?f. b t f
    let term2 = Term::lam(|b| {
        Term::lam(move |t| {
            Term::lam({
                //let b = b.clone(); No longer necessary for Rust 1.29.2
                move |f| Term::app(Term::app(b.clone(), t.clone()), f)
            })
        })
    });

    println!("{}", pretty(reduce(term1))); // ?x0. ?x1. (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 x1)))))))))))))))))))))))))))
    println!("{}", pretty(reduce(term2))); // ?x0. ?x1. ?x2. ((x0 x1) x2)
}
Run Code Online (Sandbox Code Playgroud)

这是另一个答案尝试的原始方式,作者无法解决.

优化!

已知Rust在不牺牲安全性的情况下实现性能.然而,上面的实现总是Term按值传递并且有许多不必要的clone调用,因此可以进行一些优化.

另外,对一段Rust数据进行字符串化的标准方法是使用该Display特征.所以让我们做对了!

use std::fmt::{Display, Error, Formatter};
use Term::*;

// Combination of FnOnce(Term) -> Term and Clone
trait TermLam {
    // The FnOnce part, declared like an Fn, because we need object safety
    fn app(&self, t: Term) -> Term;
    // The Clone part, but we have to return sized objects
    // (not Self either because of object safety), so it is in a box
    fn clone_box(&self) -> Box<dyn TermLam>;
}

// Blanket implementation for appropriate types
impl<F> TermLam for F
where
    F: 'static/*' highlighting fix */ + Clone + FnOnce(Term) -> Term,
{
    // Note: when you have a Clone + FnOnce, you effectively have an Fn
    fn app(&self, t: Term) -> Term {
        (self.clone())(t)
    }

    fn clone_box(&self) -> Box<dyn TermLam> {
        Box::new(self.clone())
    }
}

// We can now clone the box
impl Clone for Box<dyn TermLam> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}

#[derive(Clone)]
enum Term {
    Hol(Box<Term>),
    Var(usize),
    Lam(Box<dyn TermLam>),
    App(Box<Term>, Box<Term>),
}

impl Term {
    fn app(t1: Term, t2: Term) -> Self {
        App(Box::new(t1), Box::new(t2))
    }

    fn lam<F>(f: F) -> Self
    where
        F: 'static/*' highlighting fix */ + Clone + FnOnce(Term) -> Term,
    {
        Lam(Box::new(f))
    }

    fn hol(t: Term) -> Self {
        Hol(Box::new(t))
    }

    // `reduce` is now a by-reference method
    fn reduce(&self) -> Term {
        match self {
            Hol(_) => self.clone(),
            Var(_) => self.clone(),
            Lam(bod) => {
                let bod = bod.clone();
                Term::lam(move |v| bod.app(v).reduce())
            },
            // We reuse the reduced object when possible,
            // to avoid unnecessary clone.
            App(fun, arg) => match fun.reduce() {
                other @ Hol(_) => Term::app(other, arg.reduce()),
                other @ Var(_) => Term::app(other, arg.reduce()),
                Lam(fbod) => fbod.app(arg.reduce()),
                other @ App(_, _) => Term::app(other, arg.reduce()),
            },
        }
    }
}
//The standard way of `pretty` is `Display`
impl Display for Term {
    fn fmt(&self, fmt: &mut Formatter) -> Result<(), Error> {
        // As the API is different from `pretty`, the way we do recursion is
        // a bit different as well
        struct LvlTerm<'a>(usize, &'a Term);
        impl<'a> Display for LvlTerm<'a> {
            fn fmt(&self, fmt: &mut Formatter) -> Result<(), Error> {
                match self {
                    LvlTerm(lvl, Hol(hol)) => write!(fmt, "{}", LvlTerm(*lvl, hol)),
                    LvlTerm(_, Var(idx)) => write!(fmt, "x{}", idx),
                    LvlTerm(lvl, Lam(bod)) => write!(
                        fmt,
                        "?x{}. {}",
                        *lvl,
                        LvlTerm(*lvl + 1, &bod.app(Term::hol(Var(*lvl))))
                    ),
                    LvlTerm(lvl, App(fun, arg)) => {
                        write!(fmt, "({} {})", LvlTerm(*lvl, fun), LvlTerm(*lvl, arg))
                    }
                }
            }
        }
        write!(fmt, "{}", LvlTerm(0, self))
    }
}

fn main() {
    // In general, if you need to use a value n+1 times, you need to
    // call clone it n times. You don't have to clone it in the last use.
    // (?x. x x) (?s. ?z. s (s (s z)))
    let term1 = Term::app(
        Term::lam(|x| Term::app(x.clone(), x)),
        Term::lam(|s| {
            Term::lam(move |z| Term::app(s.clone(), Term::app(s.clone(), Term::app(s, z))))
        }),
    );

    // No clone is required if all values are used exactly once.
    // ?b. ?t. ?f. b t f
    let term2 =
        Term::lam(|b| Term::lam(move |t| Term::lam(move |f| Term::app(Term::app(b, t), f))));

    println!("{}", term1.reduce()); // ?x0. ?x1. (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 (x0 x1)))))))))))))))))))))))))))
    println!("{}", term2.reduce()); // ?x0. ?x1. ?x2. ((x0 x1) x2)
}
Run Code Online (Sandbox Code Playgroud)

操场

我们可以看到,上面的代码可以进一步简化:由于匹配臂reduce有代码重复,我们可以将它们一起折叠.但是,由于这个答案的目的是演示问题中Haskell代码的SAME方式,所以这就是原样.

此外,通过仅要求关闭FnOnce,不是Fn,我们在使用站点放宽了将所有变量克隆到仅多次使用时的要求.权衡的是,每当调用闭包时,它捕获的所有变量都将被克隆.在对其进行剖析之前,很难说哪一个更好; 所以我只选择使代码看起来更好的那个.

同样,Rust会明确地做出这些决定,并在您的心中保留选择的差异,这很好!