如何在给定的两个二叉搜索树中找到最大的公共子树？

Question

如何在给定的两个二叉搜索树中找到最大的公共子树？

给出了两个BSTs(二叉搜索树).如何在给定的两个中找到最大的公共子树binary trees？

编辑1: 这是我的想法:

设r1 =第1树的当前节点r2 =第2树的当前节点

There are some of the cases I think we need to consider:

Case 1 : r1.data < r2.data
     2 subproblems to solve:
     first, check r1 and r2.left 
     second, check r1.right and r2

Case 2 : r1.data > r2.data
     2 subproblems to solve:
       - first, check r1.left and r2
       - second, check r1 and r2.right

Case 3 : r1.data == r2.data
     Again, 2 cases to consider here:
     (a) current node is part of largest common BST
         compute common subtree size rooted at r1 and r2
     (b)current node is NOT part of largest common BST
         2 subproblems to solve:
         first, solve r1.left and r2.left 
         second, solve r1.right and r2.right

Run Code Online (Sandbox Code Playgroud)

我可以想一想我们需要检查的案例,但截至目前我无法对其进行编码.这不是一个家庭作业问题.看起来像吗？

Answer 1

jon*_*rry 5

只需散列每个节点的子节点和键,然后查找重复项.这将给出线性预期时间算法.例如,请参阅以下伪代码,它假定没有哈希冲突(处理冲突会很简单):

ret = -1
// T is a tree node, H is a hash set, and first is a boolean flag
hashTree(T, H, first):
  if (T is null):
    return 0 // leaf case
  h = hash(hashTree(T.left, H, first), hashTree(T.right, H, first), T.key)
  if (first):
    // store hashes of T1's nodes in the set H
    H.insert(h)
  else:
    // check for hashes of T2's nodes in the set H containing T1's nodes
    if H.contains(h):
      ret = max(ret, size(T)) // size is recursive and memoized to get O(n) total time
  return h

H = {}
hashTree(T1, H, true)
hashTree(T2, H, false)
return ret

Run Code Online (Sandbox Code Playgroud)

请注意,这是假定BST子树的标准定义,即子树由节点及其所有后代组成.

Answer 2

aro*_*oth 3

假设树中没有重复值：

LargestSubtree(Tree tree1, Tree tree2)
    Int bestMatch := 0
    Int bestMatchCount := 0
    For each Node n in tree1  //should iterate breadth-first
        //possible optimization:  we can skip every node that is part of each subtree we find
        Node n2 := BinarySearch(tree2(n.value))
        Int matchCount := CountMatches(n, n2)
        If (matchCount > bestMatchCount)
            bestMatch := n.value
            bestMatchCount := matchCount
        End
    End

    Return ExtractSubtree(BinarySearch(tree1(bestMatch)), BinarySearch(tree2(bestMatch)))
End

CountMatches(Node n1, Node n2)
    If (!n1 || !n2 || n1.value != n2.value)
        Return 0
    End
    Return 1 + CountMatches(n1.left, n2.left) + CountMatches(n1.right, n2.right)
End

ExtractSubtree(Node n1, Node n2)
    If (!n1 || !n2 || n1.value != n2.value)
        Return nil
    End

    Node result := New Node(n1.value)
    result.left := ExtractSubtree(n1.left, n2.left)
    result.right := ExtractSubtree(n1.right, n2.right)
    Return result
End

Run Code Online (Sandbox Code Playgroud)

简单解释一下，这是问题的暴力解决方案。它对第一棵树进行广度优先遍历。对于每个节点，它执行BinarySearch第二棵树的操作来定位该树中的相应节点。然后使用这些节点来评估以该节点为根的公共子树的总大小。如果子树比之前找到的任何子树都大，它会记住它以供以后使用，以便在算法完成时可以构造并返回最大子树的副本。

该算法不处理重复值。BinarySearch可以通过使用返回具有给定值的所有节点的列表（而不是仅单个节点）的实现来扩展它。然后算法可以迭代这个列表并评估每个节点的子树，然后正常进行。

该算法的运行时间为O(n log m)（遍历n第一棵树中的节点，并对log m每个节点执行二分搜索操作），与最常见的排序算法相当。空间复杂度是O(1)在运行时（除了一些临时变量之外没有分配任何内容），以及O(n)当它返回结果时（因为它创建了子树的显式副本，这可能不需要，具体取决于算法应该如何表达其结果））。因此，即使这种暴力方法也应该表现得相当好，尽管正如其他答案所指出的，解决O(n)方案是可能的。

还可以对该算法应用一些可能的优化，例如跳过先前评估的子树中包含的任何节点。因为树遍历是广度优先的，所以我们知道作为某个先前子树一部分的任何节点都不能成为更大子树的根。在某些情况下，这可以显着提高算法的性能，但最坏情况下的运行时间（两棵树没有公共子树）仍然是O(n log m)。

归档时间：	14 年，10 月前
查看次数：	8021 次
最近记录：	10 年，5 月前