如何在Haskell中表示两棵树之间的映射？

Question

如何在Haskell中表示两棵树之间的映射？

Gar*_*ell 13 mapping algorithm tree haskell abstract-syntax-tree

我正在尝试在Haskell中实现树处理算法，并且（由于这是我的第一个Haskell程序！）正在为数据结构的设计而苦苦挣扎。那里的任何FP专家都可以伸出援手吗？

首先，我将描述算法的重要特征，勾勒出如何使用命令式语言来实现这一点，最后完成到目前为止在Haskell中遇到的绊脚石。

问题

我不会详细描述完整的算法，但是要点如下：

该算法在两个玫瑰树X和Y上运行。
该算法的第一阶段基于其节点和后代的标签和属性，为每个节点计算一些派生属性。
这些派生的属性用于计算两棵树之间的部分映射，以使X中的节点可以与Y中的节点关联，反之亦然。由于映射是局部的，因此X或Y中的任何节点都可以被映射（即，另一棵树有一个伙伴），或者可以不映射。
该算法的最后阶段通过检查被映射节点的父/子/兄弟姐妹的一系列操作来优化这些映射。

因此，数据结构必须具有以下特征：

给定对节点的引用，请提供对该节点的父级，该节点的同级以及该节点的子级的访问。
给定输入树中的节点，允许使用附加信息（派生的属性以及对另一棵树中节点的可选引用）对该节点进行注释。

命令性解决方案的草图

如果我要使用命令式语言实现此算法，则解决方案将类似于以下内容。

假设起点是输入树的以下定义：

struct node {
    // Identifier for this node, unique within the containing tree
    size_t id;

    // Label of this node
    enum label label;

    // Attributes of this node
    // An attribute can be assumed to be a key-value pair
    // Details of the attributes themselves aren't material to this
    // discussion, so the "attribute" type is left opaque
    struct attribute **attributes;
    size_t n_attributes;

    // Pointer to parent of this node
    // NULL iff this node is root
    struct node *parent;

    // Pointer to first child of this node
    // NULL iff this node is leaf
    struct node *child;

    // Double-linked list of siblings of this node
    struct node *prev;
    struct node *next;
};

Run Code Online (Sandbox Code Playgroud)

嵌入在每个节点中的指针显然支持算法所需的上/下/左/右遍历。

可以通过定义以下结构来实现注释：

struct algo_node {
    // Pointer to input node which has been wrapped
    struct node *node;

    // Derived properties computed by first phase of the algorithm
    // Details of the properties themselves aren't material to this
    // discussion, so the "derived" type is left opaque
    struct derived props;

    // Pointer to corresponding node in the other tree
    // NULL iff this node is unmatched
    struct node *match;
};

Run Code Online (Sandbox Code Playgroud)

该算法的第一阶段algo_node为每个node输入树构造一个for 。

从algo_node到的映射node很简单：遵循嵌入式*node指针。通过将algo_nodes 存储在由id输入节点的索引的数组中，可以支持向其他方向的映射。

当然，这只是一种可能的实现。可能有许多变化，包括

在listor queue界面后面提取子链表，而不是存储三个原始指针
与其通过索引将输入树与算法树相关联，不如直接将父/子/兄弟关系编码为 struct algo_node

搬到Haskell

让我们从输入树的以下定义开始：

data Tree = Leaf Label Attributes
          | Node Label Attributes [Tree]

Run Code Online (Sandbox Code Playgroud)

每个具有id的节点的增强可以通过以下方式实现：

data AnnotatedTree = Tree Int

addIndex :: Int -> AnnotatedTree -> (AnnotatedTree, Int)

indexedTree = addIndex 0 tree

Run Code Online (Sandbox Code Playgroud)

同样，我们可以编写一个函数来计算派生的属性：

data AnnotatedTree = Tree DerivedProperties

computeDerived :: DerivedProperties -> AnnotatedTree -> (AnnotatedTree, DerivedProperties)

derivedTree = computeDerived DefaultDerived tree

Run Code Online (Sandbox Code Playgroud)

上面的代码片段几乎无需调整即可调整，AnnotatedTree既包含索引又包含派生的属性。

但是，我不知道从哪里开始代表两棵树之间的映射。根据一些阅读，我有一些不成熟的想法...

定义AnnotatedTree为包含从另一棵树的根到映射节点的路径-编码为每个后续子列表的索引列表，[Integer]
- 使用一个拉链（我目前对此还不太了解）通过路径访问映射的节点（或其父/子/兄弟姐妹）
- 或者也许使用镜头（...我对它的理解还不够清楚！）来做同样的事情
定义AnnotatedTree为直接包含对映射节点的引用，作为Maybe Tree
- 但是然后我看不到步行到映射节点的父级/同级的方法

...但是我确实可以就其中哪些（如果有）值得追求的问题提供一些指导。

任何帮助将非常感激！

Answer 1

pat*_*pat 1

您可以用 id 标记树节点Int，并用拉链绕着它们走动（使用Data.Tree和Data.Tree.Zipper是一个好主意，因为不需要重新发明轮子）。Data.IntMap然后，您可以使用将节点 ID 映射到您想要的任何内容，将辅助属性附加到节点。特别是，您可以创建IntMap从节点 id 到TreePos Full Int该节点的的映射，以便您可以探索该节点的父节点、同级节点和子节点。

归档时间：	6 年，10 月前
查看次数：	185 次
最近记录：	6 年，10 月前