KD tree, slow tree construction

Ian*_*Ian 3 c++ kdtree nearest-neighbor

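I am trying to build a KD tree (static case). We assume the points are sorted on both the x and y coordinates.

For an even recursion depth, the set is split into two subsets by a vertical line through the median x coordinate.

For an odd recursion depth, the set is split into two subsets by a horizontal line through the median y coordinate.

The median can be determined from the set sorted by the x or y coordinate. I do this step before each split, and I think it is what makes the tree construction slow.

  1. Could you help me check and optimize the code?
  2. I cannot find the k-th nearest neighbor; could somebody help me with the code?

Thank you very much for your help and patience...

Please see the sample code: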

class KDNode
{
private:
 Point2D *data;
 KDNode *left;
 KDNode *right;
    ....
};

void KDTree::createKDTree(Points2DList *pl)
{
 //Create list
 KDList kd_list;

 //Create KD list (all input points)
 for (unsigned int i = 0; i < pl->size(); i++)
 {
  kd_list.push_back((*pl)[i]);
 }

 //Sort points by y: the root is built at depth 1 (odd), which splits on y
 std::sort(kd_list.begin(), kd_list.end(), sortPoints2DByY());

 //Build KD Tree
 root = buildKDTree(&kd_list, 1);
}


KDNode * KDTree::buildKDTree(KDList *kd_list, const unsigned int depth)
{
 //Build KD tree
 const unsigned int n = kd_list->size();

 //No leaf will be built
 if (n == 0)
 {
  return NULL;
 }

 //Only one point: create leaf of KD Tree
 else if (n == 1)
 {
  //Create one leaf
  return new KDNode(new Point2D ((*kd_list)[0]));
 }

 //At least 2 points: create the median node, split tree into left and right subtree
 else
 {
  //New KD node
  KDNode *node = NULL;

  //Get median index
  const unsigned int median_index = n/2;

  //Create new KD Lists
  KDList kd_list1, kd_list2;

  //The depth is even, process by x coordinate
  if (depth%2 == 0)
  {
   //Create new median node
   node = new KDNode(new Point2D( (*kd_list)[median_index]));

   //Split list
   for (unsigned int i = 0; i < n; i++)
   {
    //Get the current point
    Point2D *p = &(*kd_list)[i];

    //Add point to the first list: x < median.x
    if (p->getX() < (*kd_list)[median_index].getX())
    {
     kd_list1.push_back(*p);
    }

    //Add point to the second list: x > median.x
    else if (p->getX() > (*kd_list)[median_index].getX())
    {
     kd_list2.push_back(*p);
    }
   }

   //Sort points by y for the next recursion step: slow construction of the tree???
   std::sort(kd_list1.begin(), kd_list1.end(), sortPoints2DByY());
   std::sort(kd_list2.begin(), kd_list2.end(), sortPoints2DByY());

  }

  //The depth is odd, process by y coordinates
  else
  {

   //Create new median node
   node = new KDNode(new Point2D((*kd_list)[median_index]));

   //Split list
   for (unsigned int i = 0; i < n; i++)
   {
    //Get the current point
    Point2D *p = &(*kd_list)[i];

    //Add point to the first list: y < median.y
    if (p->getY() < (*kd_list)[median_index].getY())
    {
     kd_list1.push_back(*p);
    }

    //Add point to the second list: y > median.y
    else if (p->getY() >(*kd_list)[median_index].getY())
    {
     kd_list2.push_back(*p);
    }
   }

   //Sort points by x for the next recursion step: slow construction of the tree???
   std::sort(kd_list1.begin(), kd_list1.end(), sortPoints2DByX());
   std::sort(kd_list2.begin(), kd_list2.end(), sortPoints2DByX());

  }

  //Build left subtree
  node->setLeft( buildKDTree(&kd_list1, depth +1 ) );

  //Build right subtree
  node->setRight( buildKDTree(&kd_list2, depth + 1 ) );

  //Return new node 
  return node; 
 }
}

Oll*_*aho 5

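Sorting to find the median is probably the worst culprit here, since that is O(n log n) while the problem is solvable in O(n) time. You should use nth_element instead: http://www.cplusplus.com/reference/algorithm/nth_element/ . It finds the median in linear time on average, after which you can split the vector in linear time.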
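To illustrate the nth_element idea, here is a minimal sketch of a build step that never sorts. It is not the asker's code: `Pt`, `Node`, `build` and `destroy` are hypothetical stand-ins for `Point2D`, `KDNode` and `buildKDTree`, and it assumes the points may be reordered in place. The depth convention (odd splits on y, even on x) follows the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Point2D.
struct Pt { double x, y; };

// Hypothetical stand-in for the question's KDNode.
struct Node {
    Pt point;
    Node* left;
    Node* right;
    explicit Node(const Pt& p) : point(p), left(nullptr), right(nullptr) {}
};

// Builds a subtree from pts[first, last), reordering that range in place.
// No presorting and no temporary lists are needed.
Node* build(std::vector<Pt>& pts, std::size_t first, std::size_t last,
            unsigned depth) {
    assert(first <= last && last <= pts.size());
    if (first == last) return nullptr;

    const std::size_t mid = first + (last - first) / 2;
    const bool byX = (depth % 2 == 0);  // even depth: x, odd depth: y

    // Average O(n): afterwards pts[mid] is the median on the chosen axis,
    // no element before it is greater and no element after it is smaller.
    std::nth_element(pts.begin() + first, pts.begin() + mid,
                     pts.begin() + last,
                     [byX](const Pt& a, const Pt& b) {
                         return byX ? a.x < b.x : a.y < b.y;
                     });

    Node* node = new Node(pts[mid]);
    node->left = build(pts, first, mid, depth + 1);
    node->right = build(pts, mid + 1, last, depth + 1);
    return node;
}

// Frees a tree created by build().
void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```

Called as `build(pts, 0, pts.size(), 1)`, this does O(n) work per level on average, so roughly O(n log n) for the whole tree. Points that tie with the median on the split axis stay in the tree (they may land on either side) instead of being dropped.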

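Memory management in the vectors can also take a lot of time, especially with large vectors, since every time a vector's size is doubled all of its elements have to be moved. You can use the vector's reserve method to reserve enough space for the vectors in the newly created nodes, so they do not need to grow dynamically as new elements are added with push_back.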
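As a rough sketch of the reserve suggestion, the split loop could look like the following. `Pt` and `splitByX` are hypothetical names, not part of the question's code. Unlike the loop in the question, this version skips only the median itself and keeps points that tie with it on x, which is an assumption about the intended behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Point2D.
struct Pt { double x, y; };

// Splits pts around pts[medianIndex] on the x coordinate.
// lower receives points with x < pivot, upper receives the rest,
// and the median point itself goes to neither list.
void splitByX(const std::vector<Pt>& pts, std::size_t medianIndex,
              std::vector<Pt>& lower, std::vector<Pt>& upper) {
    assert(medianIndex < pts.size());
    const double pivot = pts[medianIndex].x;

    lower.clear();
    upper.clear();
    // Each side can hold at most n - 1 points, so reserving that much up
    // front guarantees push_back never reallocates inside the loop.
    lower.reserve(pts.size() - 1);
    upper.reserve(pts.size() - 1);

    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (i == medianIndex) continue;  // the median becomes the node
        if (pts[i].x < pivot) lower.push_back(pts[i]);
        else upper.push_back(pts[i]);
    }
}
```

Reserving n - 1 on both sides trades memory for safety. If the input is known to be free of ties, reserving about half of n per side would be enough.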

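And if you absolutely need the best performance, you should use lower-level code, doing away with vectors and reserving plain arrays instead. Nth-element or "selection" algorithms are readily available and not too hard to write yourself: http://en.wikipedia.org/wiki/Selection_algorithm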
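For reference, a hand-written selection routine over a plain array might look roughly like this. It is a simple quickselect with a middle pivot, shown on `double` values for brevity; the name `quickselect` is hypothetical, and a KD tree would compare points by the current split coordinate instead.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Rearranges a[0..n) so that a[k] holds the value it would have in sorted
// order, with no larger value before it and no smaller value after it.
// Average O(n); the worst case is quadratic for this simple pivot choice.
double quickselect(double* a, std::size_t n, std::size_t k) {
    assert(a != nullptr && k < n);
    std::size_t lo = 0;
    std::size_t hi = n - 1;  // inclusive bounds; lo <= k <= hi always holds

    while (lo < hi) {
        // Use the middle element as pivot and park it at the end.
        std::swap(a[lo + (hi - lo) / 2], a[hi]);
        const double pivot = a[hi];

        // Lomuto partition: move everything smaller than the pivot forward.
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i) {
            if (a[i] < pivot) {
                std::swap(a[i], a[store]);
                ++store;
            }
        }
        std::swap(a[store], a[hi]);  // pivot lands in its final position

        if (k == store) break;
        if (k < store) hi = store - 1;  // store > k >= 0, so no underflow
        else lo = store + 1;
    }
    return a[k];
}
```

Calling `quickselect(values, n, n / 2)` yields the same median index as the question's `n/2`, without sorting the rest of the array.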