Zim*_*her 29 quadtree rectangles collision data-structures
我一直在努力将Quadtree添加到我正在编写的程序中,我不禁注意到,我正在寻找的实现很少有很好的解释/执行教程.
具体来说,我正在寻找的一个方法列表和如何实现它们(或只是它们的过程的描述)的方法和伪代码的列表(检索,插入,删除等)是我正在寻找的,以及也许一些提高性能的技巧.这是用于碰撞检测,因此最好用2d矩形来解释,因为它们是将要存储的对象.
Dra*_*rgy 78
All right, I'll take a shot at this. First a teaser to show the results of what I'll propose involving 20,000 agents (just something I whipped up real quick for this specific question):
The GIF has extremely reduced frame rate and significantly lower res to fit the 2 MB maximum for this site. Here's a video if you want to see the thing at close to full speed: https://streamable.com/3pgmn.
And a GIF with 100k though I had to fiddle with it so much and had to turn off the quadtree lines (didn't seem to want to compress as much with them on) as well as change the way the agents looked to get it to fit in 2 megabytes (I wish making a GIF was as easy as coding a quadtree):
The simulation with 20k agents takes ~3 megabytes of RAM. I can also easily handle 100k smaller agents without sacrificing frame rates, though it leads to a bit of a mess on the screen to the point where you can barely see what's going on as in the GIF above. This is all running in just one thread on my i7 and I'm spending almost half the time according to VTune just drawing this stuff on the screen (just using some basic scalar instructions to plot things one pixel at a time in CPU).
And here's a video with 100,000 agents though it's hard to see what's going on. It's kind of a big video and I couldn't find any decent way to compress it without the whole video turning into a mush (might need to download or cache it first to see it stream at reasonable FPS). With 100k agents the simulation takes around 4.5 megabytes of RAM and the memory use is very stable after running the simulation for about 5 seconds (stops going up or down since it ceases to heap allocate). Same thing in slow motion.
Efficient Quadtree for Collision Detection
All right, so actually quadtrees are not my favorite data structure for this purpose. I tend to prefer a grid hierarchy, like a coarse grid for the world, a finer grid for a region, and an even finer grid for a sub-region (3 fixed levels of dense grids, and no trees involved), with row-based optimizations so that a row that has no entities in it will be deallocated and turned into a null pointer, and likewise completely empty regions or sub-regions turned into nulls. While this simple implementation of the quadtree running in one thread can handle 100k agents on my i7 at 60+ FPS, I've implemented grids that can handle a couple million agents bouncing off each other every frame on older hardware (an i3). Also I always liked how grids made it very easy to predict how much memory they'll require, since they don't subdivide cells. But I'll try to cover how to implement a reasonably efficient quadtree.
Note that I won't go into the full theory of the data structure. I'm assuming that you already know that and are interested in improving performance. I'm also just going into my personal way of tackling this problem which seems to outperform most of the solutions I find online for my cases, but there are many decent ways and these solutions are tailor-fitted to my use cases (very large inputs with everything moving every frame for visual FX in films and television). Other people probably optimize for different use cases. When it comes to spatial indexing structures in particular, I really think the efficiency of the solution tells you more about the implementer than the data structure. Also the same strategies I'll propose to speeding things up also apply in 3 dimensions with octrees.
Node Representation
So first of all, let's cover the node representation:
// Represents a node in the quadtree.
struct QuadNode
{
// Points to the first child if this node is a branch or the first
// element if this node is a leaf.
int32_t first_child;
// Stores the number of elements in the leaf or -1 if it this node is
// not a leaf.
int32_t count;
};
Run Code Online (Sandbox Code Playgroud)
It's a total of 8 bytes, and this is very important as it's a key part of the speed. I actually use a smaller one (6 bytes per node) but I'll leave that as an exercise to the reader.
You can probably do without the count. I include that for pathological cases to avoid linearly traversing the elements and counting them each time a leaf node might split. In most common cases a node shouldn't store that many elements. However, I work in visual FX and the pathological cases aren't necessarily rare. You can encounter artists creating content with a boatload of coincident points, massive polygons that span the entire scene, etc, so I end up storing a count.
Where are the AABBs?
So one of the first things people might be wondering is where the bounding boxes (rectangles) are for the nodes. I don't store them. I compute them on the fly. I'm kinda surprised most people don't do that in the code I've seen. For me, they're only stored with the tree structure (basically just one AABB for the root).
That might seem like it'd be more expensive to be computing these on the fly, but reducing the memory use of a node can proportionally reduce cache misses when you traverse the tree, and those cache miss reductions tend to be more significant than having to do a couple of bitshifts and some additions/subtractions during traversal. Traversal looks like this:
static QuadNodeList find_leaves(const Quadtree& tree, const QuadNodeData& root, const int rect[4])
{
QuadNodeList leaves, to_process;
to_process.push_back(root);
while (to_process.size() > 0)
{
const QuadNodeData nd = to_process.pop_back();
// If this node is a leaf, insert it to the list.
if (tree.nodes[nd.index].count != -1)
leaves.push_back(nd);
else
{
// Otherwise push the children that intersect the rectangle.
const int mx = nd.crect[0], my = nd.crect[1];
const int hx = nd.crect[2] >> 1, hy = nd.crect[3] >> 1;
const int fc = tree.nodes[nd.index].first_child;
const int l = mx-hx, t = my-hx, r = mx+hx, b = my+hy;
if (rect[1] <= my)
{
if (rect[0] <= mx)
to_process.push_back(child_data(l,t, hx, hy, fc+0, nd.depth+1));
if (rect[2] > mx)
to_process.push_back(child_data(r,t, hx, hy, fc+1, nd.depth+1));
}
if (rect[3] > my)
{
if (rect[0] <= mx)
to_process.push_back(child_data(l,b, hx, hy, fc+2, nd.depth+1));
if (rect[2] > mx)
to_process.push_back(child_data(r,b, hx, hy, fc+3, nd.depth+1));
}
}
}
return leaves;
}
Run Code Online (Sandbox Code Playgroud)
Omitting the AABBs is one of the most unusual things I do (I keep looking for other people doing it just to find a peer and fail), but I've measured the before and after and it did reduce times considerably, at least on very large inputs, to compact the quadtree node substantially and just compute AABBs on the fly during traversal. Space and time aren't always diametrically opposed. Sometimes reducing space also means reducing time given how much performance is dominated by the memory hierarchy these days. I've even sped up some real world operations applied on massive inputs by compressing the data to quarter the memory use and decompressing on the fly.
I don't know why many people choose to cache the AABBs: whether it's programming convenience or if it's genuinely faster in their cases. Yet for data structures which split evenly down the center like regular quadtrees and octrees, I'd suggest measuring the impact of omitting the AABBs and computing them on the fly. You might be quite surprised. Of course it makes sense to store AABBs for structures that don't split evenly like Kd-trees and BVHs as well as loose quadtrees.
Floating-Point
I don't use floating-point for spatial indexes and maybe that's why I see improved performance just computing AABBs on the fly with right shifts for division by 2 and so forth. That said, at least SPFP seems really fast nowadays. I don't know since I haven't measured the difference. I just use integers by preference even though I'm generally working with floating-point inputs (mesh vertices, particles, etc). I just end up converting them to integer coordinates for the purpose of partitioning and performing spatial queries. I'm not sure if there's any major speed benefit of doing this anymore. It's just a habit and preference since I find it easier to reason about integers without having to think about denormalized FP and all that.
Centered AABBs
While I only store a bounding box for the root, it helps to use a representation that stores a center and half size for nodes while using a left/top/right/bottom representation for queries to minimize the amount of arithmetic involved.
Contiguous Children
This is likewise key, and if we refer back to the node rep:
struct QuadNode
{
int32_t first_child;
...
};
Run Code Online (Sandbox Code Playgroud)
We don't need to store an array of children because all 4 children are contiguous:
first_child+0 = index to 1st child (TL)
first_child+1 = index to 2nd child (TR)
first_child+2 = index to 3nd child (BL)
first_child+3 = index to 4th child (BR)
Run Code Online (Sandbox Code Playgroud)
This not only significantly reduces cache misses on traversal but also allows us to significantly shrink our nodes which further reduces cache misses, storing only one 32-bit index (4 bytes) instead of an array of 4 (16 bytes).
This does mean that if you need to transfer elements to just a couple of quadrants of a parent when it splits, it must still allocate all 4 child leaves to store elements in just two quadrants while having two empty leaves as children. However, the trade-off is more than worth it performance-wise at least in my use cases, and remember that a node only takes 8 bytes given how much we've compacted it.
When deallocating children, we deallocate all four at a time. I do this in constant-time using an indexed free list, like so:
Except we're pooling out memory chunks containing 4 contiguous elements instead of one at a time. This makes it so we usually don't need to involve any heap allocations or deallocations during the simulation. A group of 4 nodes gets marked as freed indivisibly only to then be reclaimed indivisibly in a subsequent split of another leaf node.
Deferred Cleanup
I don't update the quadtree's structure right away upon removing elements. When I remove an element, I just descend down the tree to the child node(s) it occupies and then remove the element, but I don't bother to do anything more just yet even if the leaves become empty.
Instead I do a deferred cleanup like this:
void Quadtree::cleanup()
{
// Only process the root if it's not a leaf.
SmallList<int> to_process;
if (nodes[0].count == -1)
to_process.push_back(0);
while (to_process.size() > 0)
{
const int node_index = to_process.pop_back();
QuadNode& node = nodes[node_index];
// Loop through the children.
int num_empty_leaves = 0;
for (int j=0; j < 4; ++j)
{
const int child_index = node.first_child + j;
const QuadNode& child = nodes[child_index];
// Increment empty leaf count if the child is an empty
// leaf. Otherwise if the child is a branch, add it to
// the stack to be processed in the next iteration.
if (child.count == 0)
++num_empty_leaves;
else if (child.count == -1)
to_process.push_back(child_index);
}
// If all the children were empty leaves, remove them and
// make this node the new empty leaf.
if (num_empty_leaves == 4)
{
// Push all 4 children to the free list.
nodes[node.first_child].first_child = free_node;
free_node = node.first_child;
// Make this node the new empty leaf.
node.first_child = -1;
node.count = 0;
}
}
}
Run Code Online (Sandbox Code Playgroud)
This is called at the end of every single frame after moving all the agents. The reason I do this kind of deferred removal of empty leaf nodes in multiple iterations and not all at once in the process of removing a single element is that element A might move to node N2, making N1 empty. However, element B might, in the same frame, move to N1, making it occupied again.
With the deferred cleanup, we can handle such cases without unnecessarily removing children only to add them right back when another element moves into that quadrant.
Moving elements in my case is a straightforward: 1) remove element, 2) move it, 3) reinsert it to the quadtree. After we move all the elements and at the end of a frame (not time step, there could be multiple time steps per frame), the cleanup function above is called to remove the children from a parent which has 4 empty leaves as children, which effectively turns that parent into the new empty leaf which might then be cleaned up in the next frame with a subsequent cleanup call (or not if things get inserted to it or if the empty leaf's siblings are non-empty).
Let's look at the deferred cleanup visually:
Starting with this, let's say we remove some elements from the tree leaving us with 4 empty leaves:
At this point, if we call cleanup, it will remove 4 leaves if it finds 4 empty child leaves and turn the parent into an empty leaf, like so:
Let's say we remove some more elements:
... and then call cleanup again:
If we call it yet again, we end up with this:
... at which point the root itself turns into an empty leaf. However, the cleanup method never removes the root (it only removes children). Again the main point of doing it deferred this way and in multiple steps is to reduce the amount of potential redundant work that could occur per time step (which can be a lot) if we did this all immediately every single time an element is removed from the tree. It also helps to distribute that works across frames to avoid stutters.
TBH, I originally applied this "deferred cleanup" technique in a DOS game I wrote in C out of sheer laziness! I didn't want to bother with descending down the tree, removing elements, and then removing nodes in a bottom-up fashion back then because I originally wrote the tree to favor top-down traversal (not top-down and back up again) and really thought this lazy solution was a productivity compromise (sacrificing optimal performance to get implemented faster). However, many years later, I actually got around to implementing quadtree removal in ways that immediately started removing nodes and, to my surprise, I actually significantly made it slower with more unpredictable and stuttery frame rates. The deferred cleanup, in spite of originally being inspired by my programmer laziness, was actually (and accidentally) a very effective optimization for dynamic scenes.
Singly-Linked Index Lists for Elements
For elements, I use this representation:
// Represents an element in the quadtree.
struct QuadElt
{
// Stores the ID for the element (can be used to
// refer to external data).
int id;
// Stores the rectangle for the element.
int x1, y1, x2, y2;
};
// Represents an element node in the quadtree.
struct QuadEltNode
{
// Points to the next element in the leaf node. A value of -1
// indicates the end of the list.
int next;
// Stores the element index.
int element;
};
Run Code Online (Sandbox Code Playgroud)
I use an "element node" which is separate from "element". An element is only inserted once to the quadtree no matter how many cells it occupies. However, for each cell it occupies, an "element node" is inserted which indexes that element.
The element node is a singly-linked index list node into an array, and also using the free list method above. This incurs some more cache misses over storing all the elements contiguously for a leaf. However, given that this quadtree is for very dynamic data which is moving and colliding every single time step, it generally takes more processing time than it saves to seek out a perfectly contiguous representation for the leaf elements (you would effectively have to implement a variable-sized memory allocator which is really fast, and that's far from an easy thing to do). So I use the singly-linked index list which allows a free list constant-time approach to allocation/deallocation. When you use that representation, you can transfer elements from split parents to new leaves by just changing a few integers.
SmallList<T>
Oh, I should mention this. Naturally it helps if you don't heap allocate just to store a temporary stack of nodes for non-recursive traversal. SmallList<T> is similar to vector<T> except it won't involve a heap allocation until you insert more than 128 elements to it. It's similar to SBO string optimizations in the C++ standard lib. It's something I implemented and have been using for ages and it does help a lot to make sure you use the stack whenever possible.
Tree Representation
Here's the representation of the quadtree itself:
struct Quadtree
{
// Stores all the elements in the quadtree.
FreeList<QuadElt> elts;
// Stores all the element nodes in the quadtree.
FreeList<QuadEltNode> elt_nodes;
// Stores all the nodes in the quadtree. The first node in this
// sequence is always the root.
std::vector<QuadNode> nodes;
// Stores the quadtree extents.
QuadCRect root_rect;
// Stores the first free node in the quadtree to be reclaimed as 4
// contiguous nodes at once. A value of -1 indicates that the free
// list is empty, at which point we simply insert 4 nodes to the
// back of the nodes array.
int free_node;
// Stores the maximum depth allowed for the quadtree.
int max_depth;
};
Run Code Online (Sandbox Code Playgroud)
As pointed out above, we store a single rectangle for the root (root_rect). All sub-rects are computed on the fly. All nodes are stored in contiguously in an array (std::vector<QuadNode>) along with the elements and element nodes (in FreeList<T>).
FreeList<T>
I use a FreeList data structure which is basically an array (and random-access sequence) that lets you remove elements from anywhere in constant-time (leaving holes behind which get reclaimed upon subsequent insertions in constant-time). Here's a simplified version which doesn't bother with handling non-trivial data types (doesn't use placement new or manual destruction calls):
/// Provides an indexed free list with constant-time removals from anywhere
/// in the list without invalidating indices. T must be trivially constructible
/// and destructible.
template <class T>
class FreeList
{
public:
/// Creates a new free list.
FreeList();
/// Inserts an element to the free list and returns an index to it.
int insert(const T& element);
// Removes the nth element from the free list.
void erase(int n);
// Removes all elements from the free list.
void clear();
// Returns the range of valid indices.
int range() const;
// Returns the nth element.
T& operator[](int n);
// Returns the nth element.
const T& operator[](int n) const;
private:
union FreeElement
{
T element;
int next;
};
std::vector<FreeElement> data;
int first_free;
};
template <class T>
FreeList<T>::FreeList(): first_free(-1)
{
}
template <class T>
int FreeList<T>::insert(const T& element)
{
if (first_free != -1)
{
const int index = first_free;
first_free = data[first_free].next;
data[index].element = element;
return index;
}
else
{
FreeElement fe;
fe.element = element;
data.push_back(fe);
return static_cast<int>(data.size() - 1);
}
}
template <class T>
void FreeList<T>::erase(int n)
{
data[n].next = first_free;
first_free = n;
}
template <class T>
void FreeList<T>::clear()
{
data.clear();
first_free = -1;
}
template <class T>
int FreeList<T>::range() const
{
return static_cast<int>(data.size());
}
template <class T>
T& FreeList<T>::operator[](int n)
{
return data[n].element;
}
template <class T>
const T& FreeList<T>::operator[](int n) const
{
return data[n].element;
}
Run Code Online (Sandbox Code Playgroud)
I have one which does work with non-trivial types and provides iterators and so forth but is much more involved. These days I tend to work more with trivially constructible/destructible C-style structs anyway (only using non-trivial user-defined types for high-level stuff).
Maximum Tree Depth
I prevent the tree from subdividing too much by specifying a max depth allowed. For the quick simulation I whipped up, I used 8. For me this is crucial since again, in VFX I encounter pathological cases a lot including content created by artists with lots of coincident or overlapping elements which, without a maximum tree depth limit, could want it to subdivide indefinitely.
There is a bit of fine-tuning if you want optimal performance with respect to max depth allowed and how many elements you allow to be stored in a leaf before it splits into 4 children. I tend to find the optimal results are gained with something around 8 elements max per node before it splits, and a max depth set so that the smallest cell size is a little over the size of your average agent (otherwise more single agents could be inserted into multiple leaves).
Collision and Queries
There are a couple of ways to do the collision detection and queries. I often see people do it like this:
for each element in scene:
use quad tree to check for collision against other elements
Run Code Online (Sandbox Code Playgroud)
This is very straightforward but the problem with this approach is that the first element in the scene might be in a totally different location in the world from the second. As a result, the paths we take down the quadtree could be totally sporadic. We might traverse one path to a leaf and then want to go down that same path again for the first element as, say, the 50,000th element. By this time the nodes involved in that path may have already been evicted from the CPU cache. So I recommend doing it this way:
traversed = {}
gather quadtree leaves
for each leaf in leaves:
{
for each element in leaf:
{
if not traversed[element]:
{
use quad tree to check for collision against other elements
traversed[element] = true
}
}
}
Run Code Online (Sandbox Code Playgroud)
While that's quite a bit more code and requires we keep a traversed bitset or parallel array of some sort to avoid processing elements twice (since they may be inserted in more than one leaf), it helps make sure that we descend the same paths down the quadtree throughout the loop. That helps keep things much more cache-friendly. Also if after attempting to move the element in the time step, it's still encompassed entirely in that leaf node, we don't even need to work our way back up again from the root (we can just check that one leaf only).
Common Inefficiencies: Things to Avoid
While there are many ways to skin the cat and achieve an efficient solution, there is a common way to achieve a very inefficient solution. And I've encountered my share of very inefficient quadtrees, kd trees, and octrees in my career working in VFX. We're talking over a gigabyte of memory use just to partition a mesh with 100k triangles while taking 30 secs to build, when a decent implementation should be able to do the same hundreds of times a second and just take a few megs. There are many people whipping these up to solve problems who are theoretical wizards but didn't pay much attention to memory efficiency.
So the absolute most common no-no I see is to store one or more full-blown containers with each tree node. By full-blown container, I mean something that owns and allocates and frees its own memory, like this:
struct Node
{
...
// Stores the elements in the node.
List<Element> elements;
};
Run Code Online (Sandbox Code Playgroud)
And List<Element> might be a list in Python, an ArrayList in Java or C#, std::vector in C++, etc: some data structure that manages its own memory/resources.
The problem here is that while such containers are very efficiently implemented for storing a large number of elements, all of them in all languages are extremely inefficient if you instantiate a boatload of them only to store a few elements in each one. One of the reasons is that the container metadata tends to be quite explosive in memory usage at such a granular level of a single tree node. A container might need to store a size, capacity, a pointer/reference to data it allocates, etc, and all for a generalized purpose so it might use 64-bit integers for size and capacity. As a result the metadata just for an empty container might be 24 bytes which is already 3 times larger than the entirety of the node representation I proposed, and that's just for an empty container designed to store elements in leaves.
Furthermore each container often wants to either heap/GC-allocate on insertion or require even more preallocated memory in advance (at which point it might take 64 bytes just for the container itself). So that either becomes slow because of all the allocations (note that GC allocations are really fast initially in some implementations like JVM, but that's only for the initial burst Eden cycle) or because we're incurring a boatload of cache misses because the nodes are so huge that barely any fit into the lower levels of the CPU cache on traversal, or both.
Yet this is a very natural inclination and makes intuitive sense since we talk about these structures theoretically using language like, "Elements are stored in the leaves" which suggests storing a container of elements in leaf nodes. Unfortunately it has an explosive cost in terms of memory use and processing. So avoid this if the desire is to create something reasonably efficient. Make the Node share and point to (refer to) or index memory allocated and stored for the entire tree, not for every single individual node. In actuality the elements shouldn't be stored in the leaves.
Elements should be stored in the tree and leaf nodes should index or point to those elements.
Conclusion
Phew, so those are the main things I do to achieve what is hopefully considered a decent-performing solution. I hope that helps. Note that I am aiming this at a somewhat advanced level for people who have already implemented quadtrees at least once or twice. If you have any questions, feel free to shoot.
Since this question is a bit broad, I might come and edit it and keep tweaking and expanding it over time if it doesn't get closed (I love these types of questions since they give us an excuse to write about our experiences working in the field, but the site doesn't always like them). I'm also hoping some experts might jump in with alternative solutions I can learn from and perhaps use to improve mine further.
Again quadtrees aren't actually my favorite data structure for extremely dynamic collision scenarios like this. So I probably have a thing or two to learn from people who do favor quadtrees for this purpose and have been tweaking and tuning them for years. Mostly I use quadtrees for static data that doesn't move around every frame, and for those I use a very different representation from the one proposed above.
Dra*_*rgy 21
2.基础知识
对于这个答案(对不起,我再次跑出了角色限制),我将更多地关注针对这些结构新手的基础知识.
好吧,让我们说我们在太空中有很多这样的元素:
我们想要找出鼠标光标下面的元素,或者哪些元素相互交叉/碰撞,或者最近元素与另一个元素是什么,或者是这种类型的东西.
在这种情况下,如果我们拥有的唯一数据是空间中的一堆元素位置和大小/半径,我们必须遍历所有内容以找出给定搜索区域中的元素.对于碰撞检测,我们必须循环遍历每个元素,然后,对于每个元素,循环遍历所有其他元素,使其成为爆炸性二次复杂度算法.这不会阻碍非平凡的输入大小.
细分
那么我们可以对这个问题做些什么呢?一种直接的方法是将搜索空间(例如屏幕)细分为固定数量的单元格,如下所示:
现在假设您要在鼠标光标位置找到元素(cx, cy).在这种情况下,我们所要做的就是检查鼠标光标下单元格中的元素:
grid_x = floor(cx / cell_size);
grid_y = floor(cy / cell_size);
for each element in cell(grid_x, grid_y):
{
if element is under cx,cy:
do something with element (hover highlight it, e.g)
}
Run Code Online (Sandbox Code Playgroud)
碰撞检测也类似.如果我们想看看哪些元素与给定元素相交(碰撞):
grid_x1 = floor(element.x1 / cell_size);
grid_y1 = floor(element.y1 / cell_size);
grid_x2 = floor(element.x2 / cell_size);
grid_y2 = floor(element.y2 / cell_size);
for grid_y = grid_y1, grid_y2:
{
for grid_x = grid_x1, grid_x2:
{
for each other_element in cell(grid_x, grid_y):
{
if element != other_element and collide(element, other_element):
{
// The two elements intersect. Do something in response
// to the collision.
}
}
}
}
Run Code Online (Sandbox Code Playgroud)
我建议人们通过将空间/屏幕分成固定数量的网格单元(如10x10,或100x100,甚至1000x1000)来启动.有些人可能会认为1000x1000在内存使用方面会有爆炸性,但你可以让每个单元只需要4位字节和32位整数,如下所示:
...在这一点上,甚至一百万个小区占用不到4兆字节.
固定分辨率网格的缺点
如果你问我(我个人喜欢的碰撞检测),固定分辨率网格是一个很棒的数据结构,但它确实有一些缺点.
想象一下,你有一个指环王视频游戏.假设你的许多单位都是地图上的小单位,如人类,兽人和精灵.但是,你也有一些巨大的单位,如龙和蚂蚁.
这里固定分辨率网格的一个问题是,虽然你的单元格大小可能是最佳的存储像人类和精灵和兽人这样的小单位,大多数时间只占用1个单元格,但龙和蚂蚁等巨大的家伙可能想要占据许多细胞,比如400个细胞(20x20).因此,我们必须将这些大家伙插入许多单元格并存储大量冗余数据.
另外,假设您要搜索地图的大矩形区域以查找感兴趣的单位.在这种情况下,您可能需要检查更多的单元格而不是理论上的最佳单元格.
这是固定分辨率网格*的主要缺点.我们最终可能不得不插入大量的东西并将它们存储到比理想情况下要存储的更多的单元格中.对于大型搜索区域,我们可能必须检查比我们理想的搜索要多得多的单元格.
- 也就是说,抛开理论,通常你可以采用与图像处理类似的缓存方式来处理网格.因此,虽然它具有这些理论上的缺点,但实际上,实现缓存友好的遍历模式的简单性和易用性可以使网格比听起来好很多.
四叉树
所以四叉树是解决这个问题的一种方法.可以说,它们不是使用固定分辨率网格,而是根据某些标准调整分辨率,同时细分/分割成4个子单元格以提高分辨率.例如,我们可能会说如果给定单元格中有超过2个子节点,则单元格应该拆分.在这种情况下,这:
结束这个:
现在我们有一个非常好的表示,没有单元格存储超过2个元素.同时让我们考虑如果我们插入一条巨龙会发生什么:
这里,与固定分辨率网格不同,龙只能插入一个单元格,因为它占据的单元格中只有一个元素.同样,如果我们搜索地图的一个大区域,除非占据单元格的元素很多,否则我们不必检查这么多单元格.
履行
那么我们如何实现其中的一个呢?嗯,这是一棵树,在一天结束时,还有一棵四棵树.因此,我们从具有4个子节点的根节点的概念开始,但它们当前为null/nil,而root目前也是一个叶子:
插入
让我们开始插入一些元素,并且为了简单起见,假设节点在具有2个以上的元素时应该分割.所以我们将插入一个元素,当我们插入一个元素时,我们应该将它插入它所属的叶子(单元格).在这种情况下,我们只有一个,根节点/单元格,所以我们将其插入:
...让我们插入另一个:
......还有另一个:
现在我们在叶节点中有两个以上的元素.它现在应该拆分.此时,我们为叶节点(在本例中为根)创建了4个子节点,然后根据每个元素在空间中占据的区域/单元格将要分割的叶子(根)中的元素转移到适当的象限中:
让我们插入另一个元素,并再次插入它所属的适当叶子:
......和另一个:
现在我们又在叶子中有两个以上的元素,所以我们应该把它分成4个象限(子):
这是基本的想法.您可能注意到的一件事是,当我们插入的元素不是无限小的点时,它们很容易重叠多个单元/节点.
因此,如果我们有许多元素重叠细胞之间的许多边界,他们最终可能想要细分一大堆,可能是无限的.为了缓解这个问题,有些人选择拆分元素.如果你与一个元素相关联的所有东西都是一个矩形,那么将矩形切块是相当简单的.其他人可能只对树可以分割的数量设置深度/递归限制.我倾向于选择后两种解决方案用于这两者之间的碰撞检测方案,因为我发现它至少更容易实现更有效.但是,还有另一种选择:松散的表示形式,并且将在不同的部分中介绍.
此外,如果你的元素彼此重叠,那么即使你存储的是无限小的点,你的树也可能想要无限分裂.例如,如果你在空间中有25个点在彼此的顶部(我经常在VFX中遇到的场景),那么无论如何,你的树都会想要无限期地分裂而没有递归/深度限制.因此,要处理病态情况,即使您对元素进行切块,也可能需要深度限制.
删除元素
删除元素包含在第一个答案中,同时删除节点以清理树并删除空叶.但基本上我们所做的就是使用我提出的方法删除元素只是从树下降到存储元素的叶子/叶子(例如,可以使用其矩形确定),并从那些叶子中移除它.
然后,为了开始删除空叶节点,我们使用原始答案中涵盖的延迟清理方法.
结论
我的时间不够,但会尝试回到这个并继续改进答案.如果人们想要练习,我建议实现一个普通的旧固定分辨率网格,看看你是否可以将它降低到每个单元格只是一个32位整数的位置.在考虑四叉树之前首先要理解网格及其固有问题,你可能对网格很好.它甚至可以为您提供最佳解决方案,具体取决于您实现网格与四叉树的效率.
Dra*_*rgy 18
上面是一个小GIF,显示500,000个代理每次使用一个新的"松散/紧密网格"数据结构相互反弹我在写完关于松散四叉树的答案后受到启发.我会试着了解它是如何工作的.
到目前为止,我所展示的所有已实现的数据结构(尽管它可能只是我)是最好的数据结构,处理50万个代理比最初的四叉树处理100k更好,并且比松散的更好四叉树处理250k.它还需要最少的内存,并且在这三者中使用最稳定的内存.这仍然只在一个线程中工作,没有SIMD代码,没有花哨的微优化,因为我通常申请生产代码 - 只需几个小时的工作就可以直接实现.
我还改进了绘图瓶颈而没有改进我的光栅化代码.这是因为网格让我可以轻松地以一种对图像处理缓存友好的方式遍历它(逐个地绘制网格单元格中的元素恰好会在光栅化时产生非常缓存友好的图像处理模式).
有趣的是,它也花了我最短的时间来实现(只有2小时,而我在松散的四叉树上花了5或6个小时),它也需要最少的代码(并且可以说是最简单的代码).虽然这可能只是因为我积累了很多实施网格的经验.
所以我介绍了网格在基础部分中的工作原理(参见第2部分),但这是一个"松散的网格".每个网格单元存储其自己的边界框,当元素被移除时允许收缩,并随着元素的添加而增长.因此,每个元素只需要根据其中心位置落入哪个单元格插入网格中,如下所示:
// Ideally use multiplication here with inv_cell_w or inv_cell_h.
int cell_x = clamp(floor(elt_x / cell_w), 0, num_cols-1);
int cell_y = clamp(floor(ely_y / cell_h), 0, num_rows-1);
int cell_idx = cell_y*num_rows + cell_x;
// Insert element to cell at 'cell_idx' and expand the loose cell's AABB.
Run Code Online (Sandbox Code Playgroud)
单元格存储元素和AABB如下:
struct LGridLooseCell
{
// Stores the index to the first element using an indexed SLL.
int head;
// Stores the extents of the grid cell relative to the upper-left corner
// of the grid which expands and shrinks with the elements inserted and
// removed.
float l, t, r, b;
};
Run Code Online (Sandbox Code Playgroud)
然而,松散的细胞构成了概念问题.如果我们插入一个巨大的元素,他们有这些可以变大的边界框,当我们想要找出哪个松散的单元格和元素与搜索矩形相交时,我们如何避免检查网格的每个单个单元格,例如?
实际上这是一个"双松散网格".松散的网格单元本身被分隔成紧密的网格.当我们进行上面的模拟搜索时,我们首先看一下紧密网格,如下所示:
tx1 = clamp(floor(search_x1 / cell_w), 0, num_cols-1);
tx2 = clamp(floor(search_x2 / cell_w), 0, num_cols-1);
ty1 = clamp(floor(search_y1 / cell_h), 0, num_rows-1);
ty2 = clamp(floor(search_y2 / cell_h), 0, num_rows-1);
for ty = ty1, ty2:
{
trow = ty * num_cols
for tx = tx1, tx2:
{
tight_cell = tight_cells[trow + tx];
for each loose_cell in tight_cell:
{
if loose_cell intersects search area:
{
for each element in loose_cell:
{
if element intersects search area:
add element to query results
}
}
}
}
}
Run Code Online (Sandbox Code Playgroud)
紧密细胞存储松散细胞的单链接索引列表,如下所示:
struct LGridLooseCellNode
{
// Points to the next loose cell node in the tight cell.
int next;
// Stores an index to the loose cell.
int cell_idx;
};
struct LGridTightCell
{
// Stores the index to the first loose cell node in the tight cell using
// an indexed SLL.
int head;
};
Run Code Online (Sandbox Code Playgroud)
瞧,这是"松散的双网格"的基本思想.当我们插入一个元素时,我们就像扩展四叉树一样扩展松散单元格的AABB.但是,我们还根据矩形将松散的单元格插入到紧密网格中(并且可以将其插入多个单元格中).
这两者的组合(紧密网格以快速找到松散的细胞,松散的细胞以快速找到元素)提供了一个非常可爱的数据结构,其中每个元素被插入到具有恒定时间搜索,插入和移除的单个单元格中.
我看到的唯一一大缺点是我们必须存储所有这些单元格,并且可能仍然需要搜索比我们需要的更多单元格,但它们仍然相当便宜(在我的情况下每个单元格20个字节)并且很容易遍历以非常缓存友好的访问模式搜索单元格.
我建议尝试一下这种"松散网格"的想法.它实际上比四叉树和松散四叉树更容易实现,更重要的是,优化,因为它立即适用于缓存友好的内存布局.作为一个超级酷的奖励,如果你可以提前预测你的世界中的代理人数量,那么它在内存使用方面是100%完全稳定的,因为一个元素总是占据一个单元格,并且单元格总数是固定(因为他们没有细分/拆分).结果,内存使用非常稳定.对于您希望事先预先分配所有内存并且知道内存使用永远不会超过该点的某些硬件和软件,这可能是一个巨大的奖励.
使用SIMD同时使用矢量化代码(除了多线程)进行多个连贯查询也非常容易,因为如果我们甚至可以称之为遍历,那么遍历(它只是一个恒定时间查找到细胞指数涉及一些算术).因此,应用类似于射线数据包的优化策略相当容易,英特尔应用于其光线跟踪内核/ BVH(Embree)以同时测试多个相干光线(在我们的情况下,它们将是"代理数据包"用于冲突),除非没有这种奇特/复杂的代码,因为网格"遍历"是如此简单.
论记忆的使用与效率
我在高效四叉树的第1部分中略微介绍了这一点,但减少内存使用通常是加速这些天的关键,因为一旦你将数据输入L1或寄存器,我们的处理器速度如此之快,但DRAM访问相对如此, 太慢了.即使我们有一个疯狂的慢速记忆,我们仍然有如此珍贵的快速记忆.
我觉得我有点幸运,因为我们必须非常节俭地使用内存(虽然没有我之前的人那么多),甚至一兆兆字节的DRAM都被认为是惊人的.当时我学到的一些东西和我接受的习惯(即使我远离专家)恰巧与表现保持一致.
因此,我提供的关于效率的一般建议,不仅仅是用于碰撞检测的空间索引,而是关注内存使用.如果它具有爆炸性,那么解决方案可能效率不高.当然,有一个灰色区域,为数据结构使用更多的内存可以大大减少处理到只考虑速度有利的程度,但很多时候减少了数据结构所需的内存量,特别是"热内存" "反复访问,可以与速度改进成比例地转换.我职业生涯中遇到的所有效率最低的空间索引都是内存使用中最具爆炸性的.
查看您需要存储和计算的数据量是有帮助的,至少大致是,理想情况下需要多少内存.然后将其与实际需求量进行比较.如果这两者是分开的,那么你可能会得到一个相当大的提升来减少内存使用,因为这通常会减少从内存层次结构中较慢形式的内存中加载内存块的时间.
Dra*_*rgy 15
好吧,我想花一些时间来实现和解释松散的四叉树,因为我觉得它们非常有趣,甚至可能是最广泛的涉及非常动态场景的用例.
所以我昨晚最终实现了一个,并花了一些时间调整和调整和分析它.这是一个预告片,有25万个动态代理,每个时间步都相互移动和反弹:
当我缩小以查看所有25万个代理以及松散四叉树的所有边界矩形时,帧速率开始受到影响,但这主要是由于我的绘图功能中的瓶颈.如果我缩小一次在屏幕上绘制所有内容,它们就会开始成为热点,我根本不打算优化它们.以下是几乎没有代理在基本级别上工作的方式:
松散四叉树
好吧,那么松散的四叉树是什么?它们基本上是四叉树,其节点不能完全从中心分割成四个偶数象限.相反,他们的AABB(边界矩形)可以重叠,并且可以更大,甚至更小,如果你将一个节点从中心完全分割成4个象限,你可以得到它们.
所以在这种情况下,我们绝对必须存储每个节点的边界框,因此我将其表示为:
struct LooseQuadNode
{
// Stores the AABB of the node.
float rect[4];
// Stores the negative index to the first child for branches or the
// positive index to the element list for leaves.
int children;
};
Run Code Online (Sandbox Code Playgroud)
这次我尝试使用单精度浮点来查看它的表现如何,并且它做得非常不错.
重点是什么?
好的,那有什么意义呢?使用松散四叉树可以利用的主要内容是,为了插入和移除,您可以将插入四叉树的每个元素视为单个点.因此,元素永远不会插入到整个树中的多个叶节点中,因为它被视为无限小的点.
但是,当我们将这些"元素点"插入到树中时,我们会扩展插入的每个节点的边界框,以包含元素的边界(例如,元素的矩形).这使我们可以在进行搜索查询时可靠地找到这些元素(例如:搜索与矩形或圆形区域相交的所有元素).
优点:
缺点:
昂贵的查询
对于静态元素,这第一个con会非常可怕,因为我们所做的就是构建树并在这些情况下搜索它.我发现这个松散的四叉树,虽然花了几个小时调整和调整它,但是查询它有一个巨大的热点:
也就是说,这实际上是我迄今为止动态场景中四面体的"个人最佳"实现(尽管我记住我喜欢用于此目的的分层网格,并且没有那么多使用四边形用于动态场景的经验),尽管这个明显的骗局.这是因为至少对于动态场景,我们必须每隔一个步骤不断移动元素,因此与树有很多关系,而不仅仅是查询它.它必须一直在更新,这实际上做得相当不错.
我喜欢松散的四叉树,即使你拥有大量元素以及最少量的元素,你也能感觉到安全.巨大的元素不会比小的元素占用更多的内存.因此,如果我正在编写一个具有大规模世界的视频游戏,并希望将所有内容都放入一个中心空间索引来加速所有事情,而不像我通常那样烦扰数据结构的层次结构,那么松散的四叉树和松散的八叉树可能是完美的平衡为"一个中心的通用数据结构,如果我们将只使用一个整个动态世界".
记忆用法
在内存使用方面,虽然元素占用的内存较少(特别是大量内存),但与节点甚至不需要存储AABB的实现相比,节点需要更多.我总体上发现了各种测试用例,包括那些具有许多巨大元素的测试用例,松散的四叉树往往会因其强壮的节点而占用更多的内存(粗略估计约为33%).也就是说,它在我的原始答案中表现优于四叉树实现.
从好的方面来说,内存使用更稳定(这往往会转化为更稳定和平滑的帧速率).我的原始答案的四叉树在内存使用变得非常稳定之前大约花了5秒多.这个在启动之后往往会变得稳定一两秒,并且很可能是因为元素永远不必被插入多次(即使小元素可以在原始四叉树中插入两次,如果它们重叠两个或更多节点在边界).因此,数据结构可以快速发现针对所有情况分配的所需内存量,可以这么说.
理论
让我们来介绍一下基本理论.我建议首先实现一个常规的四叉树并在转向松散版本之前理解它,因为它们实现起来有点困难.当我们从一棵空树开始时,你可以想象它也有一个空的矩形.
让我们插入一个元素:
由于我们目前只有一个根节点也是一个叶子,我们只需将其插入即可.执行此操作时,根节点的先前空矩形现在包含我们插入的元素(以红色虚线显示).让我们插入另一个:
我们在插入的元素的AABB插入(这次只是根)时,扩展我们遍历的节点的AABB.让我们插入另一个,让我们说当节点包含2个以上的元素时,它们应该分开:
在这种情况下,我们在叶节点(我们的根)中有超过2个元素,因此我们应该将它分成4个象限.这与分割常规的基于点的四叉树非常相似,除了我们再次扩展边界框时我们转移子节点.我们首先考虑被拆分节点的中心位置:
现在我们有4个子节点到我们的根节点,每个子节点也存储它的紧身边界框(以绿色显示).让我们插入另一个元素:
在这里你可以看到插入这个元素不仅扩展了左下角子项的矩形,还扩展了根(我们沿着我们插入的路径展开所有AABB).让我们插入另一个:
在这种情况下,我们在叶节点中再次有3个元素,因此我们应该拆分:
就像这样.那左下角的圆圈怎么样?它似乎与两个象限交叉.但是,我们只考虑元素的一个点(例如:它的中心)来确定它所属的象限.因此,圆圈实际上只会插入到左下象限.
然而,左下象限的边界框被扩展到包含其范围(以青色显示,并希望你们不介意,但我改变了BG颜色,因为它很难看到颜色),所以AABBs的级别2的节点(以青色显示)实际上溢出到彼此的象限中.
每个象限存储其自己的矩形,总是保证包含其元素的事实允许我们将元素插入到一个叶节点,即使其区域与多个节点相交.相反,我们扩展叶节点的边界框,而不是将元素插入到多个节点.
更新AABB
因此,这可能会导致问题,AABB何时更新?如果我们只在插入元素时扩展AABB,它们只会越来越大.当元素被删除时,我们如何缩小它们?有很多方法可以解决这个问题,但我通过在原始答案中描述的"清理"方法更新整个层次结构的边界框来实现.这似乎足够快(甚至不显示为热点).
与网格比较
我似乎仍然无法像我的分层网格实现一样有效地实现碰撞检测,但同样可能仅仅比数据结构更重要.我发现树结构的主要困难是可以轻松控制内存中的所有内容以及访问方式.例如,通过网格,您可以确保一行中的所有列都是连续的并按顺序排列,并确保以连续的方式访问它们以及连续存储在该行中的元素.对于一棵树,内存访问本质上往往有点零星,并且随着节点被分割成多个子节点,树木想要更频繁地传递元素,因此往往会迅速降级.也就是说,如果我想使用一个树的空间索引,到目前为止,我真的正在挖掘这些松散的变体,并且想到了实现"松散网格"的想法.
结论
简而言之,这是松散的四叉树,它基本上具有常规四叉树的插入/移除逻辑,只存储点,除了它在路上扩展/更新AABB.为了搜索,我们最终遍历其矩形与搜索区域相交的所有子节点.
我希望人们不介意我发布这么多冗长的答案.我真的很喜欢写它们,这对我来说是一个有用的练习,可以重新审视四叉树,试图写出所有这些答案.我也在考虑一本关于这些主题的书(尽管它会用日语),并在这里写一些答案,而仓促和英语,帮助我把所有东西放在我的脑中.现在我只需要有人要求解释如何编写有效的八叉树或网格以进行碰撞检测,以便为我在这些主题上做同样的事情提供借口.
Dra*_*rgy 12
我希望人们不介意另一个答案,但我没有达到30k的限制.我今天在想我的第一个答案是如何与语言无关.我在谈论内存分配策略,类模板等,并非所有语言都允许这样的事情.
所以我花了一些时间考虑一种几乎普遍适用的有效实现(例外是函数式语言).所以我最终将我的四叉树移植到C,以便它所需要的只是int存储所有内容的数组.
The result isn't pretty but should work very efficiently on any language that allows you to store contiguous arrays of int. For Python there are libs like ndarray in numpy. For JS there are typed arrays. For Java and C#, we can use int arrays (not Integer, those are not guaranteed to be stored contiguously and they use a lot more mem than plain old int).
C IntList
So I use one auxiliary structure built on int arrays for the entire quadtree to make it as easy as possible to port to other languages. It combines a stack/free list. This is all we need to implement everything talked about in the other answer in an efficient way.
#ifndef INT_LIST_H
#define INT_LIST_H
#ifdef __cplusplus
#define IL_FUNC extern "C"
#else
#define IL_FUNC
#endif
typedef struct IntList IntList;
enum {il_fixed_cap = 128};
struct IntList
{
// Stores a fixed-size buffer in advance to avoid requiring
// a heap allocation until we run out of space.
int fixed[il_fixed_cap];
// Points to the buffer used by the list. Initially this will
// point to 'fixed'.
int* data;
// Stores how many integer fields each element has.
int num_fields;
// Stores the number of elements in the list.
int num;
// Stores the capacity of the array.
int cap;
// Stores an index to the free element or -1 if the free list
// is empty.
int free_element;
};
// ---------------------------------------------------------------------------------
// List Interface
// ---------------------------------------------------------------------------------
// Creates a new list of elements which each consist of integer fields.
// 'num_fields' specifies the number of integer fields each element has.
IL_FUNC void il_create(IntList* il, int num_fields);
// Destroys the specified list.
IL_FUNC void il_destroy(IntList* il);
// Returns the number of elements in the list.
IL_FUNC int il_size(const IntList* il);
// Returns the value of the specified field for the nth element.
IL_FUNC int il_get(const IntList* il, int n, int field);
// Sets the value of the specified field for the nth element.
IL_FUNC void il_set(IntList* il, int n, int field, int val);
// Clears the specified list, making it empty.
IL_FUNC void il_clear(IntList* il);
// ---------------------------------------------------------------------------------
// Stack Interface (do not mix with free list usage; use one or the other)
// ---------------------------------------------------------------------------------
// Inserts an element to the back of the list and returns an index to it.
IL_FUNC int il_push_back(IntList* il);
// Removes the element at the back of the list.
IL_FUNC void il_pop_back(IntList* il);
// ---------------------------------------------------------------------------------
// Free List Interface (do not mix with stack usage; use one or the other)
// ---------------------------------------------------------------------------------
// Inserts an element to a vacant position in the list and returns an index to it.
IL_FUNC int il_insert(IntList* il);
// Removes the nth element in the list.
IL_FUNC void il_erase(IntList* il, int n);
#endif
#include "IntList.h"
#include <stdlib.h>
#include <string.h>
#include <assert.h>
void il_create(IntList* il, int num_fields)
{
il->data = il->fixed;
il->num = 0;
il->cap = il_fixed_cap;
il->num_fields = num_fields;
il->free_element = -1;
}
void il_destroy(IntList* il)
{
// Free the buffer only if it was heap allocated.
if (il->data != il->fixed)
free(il->data);
}
void il_clear(IntList* il)
{
il->num = 0;
il->free_element = -1;
}
int il_size(const IntList* il)
{
return il->num;
}
int il_get(const IntList* il, int n, int field)
{
assert(n >= 0 && n < il->num);
return il->data[n*il->num_fields + field];
}
void il_set(IntList* il, int n, int field, int val)
{
assert(n >= 0 && n < il->num);
il->data[n*il->num_fields + field] = val;
}
int il_push_back(IntList* il)
{
const int new_pos = (il->num+1) * il->num_fields;
// If the list is full, we need to reallocate the buffer to make room
// for the new element.
if (new_pos > il->cap)
{
// Use double the size for the new capacity.
const int new_cap = new_pos * 2;
// If we're pointing to the fixed buffer, allocate a new array on the
// heap and copy the fixed buffer contents to it.
if (il->cap == il_fixed_cap)
{
il->data = malloc(new_cap * sizeof(*il->data));
memcpy(il->data, il->fixed, sizeof(il->fixed));
}
else
{
// Otherwise reallocate the heap buffer to the new size.
il->data = realloc(il->data, new_cap * sizeof(*il->data));
}
// Set the old capacity to the new capacity.
il->cap = new_cap;
}
return il->num++;
}
void il_pop_back(IntList* il)
{
// Just decrement the list size.
assert(il->num > 0);
--il->num;
}
int il_insert(IntList* il)
{
// If there's a free index in the free list, pop that and use it.
if (il->free_element != -1)
{
const int index = il->free_element;
const int pos = index * il->num_fields;
// Set the free index to the next free index.
il->free_element = il->data[pos];
// Return the free index.
return index;
}
// Otherwise insert to the back of the array.
return il_push_back(il);
}
void il_erase(IntList* il, int n)
{
// Push the element to the free list.
const int pos = n * il->num_fields;
il->data[pos] = il->free_element;
il->free_element = n;
}
Run Code Online (Sandbox Code Playgroud)
Using IntList
Using this data structure to implement everything doesn't yield the prettiest code. Instead of accessing elements and fields like this:
elements[n].field = elements[n].field + 1;
Run Code Online (Sandbox Code Playgroud)
... we end up doing it like this:
il_set(&elements, n, idx_field, il_get(&elements, n, idx_field) + 1);
Run Code Online (Sandbox Code Playgroud)
... which is disgusting, I know, but the point of this code is to be as efficient and portable as possible, not to be as easy to maintain as possible. The hope is that people can just use this quadtree for their projects without changing it or maintaining it.
Oh and feel free to use this code I post however you want, even for commercial projects. I would really love it if people let me know if they find it useful, but do as you wish.
C Quadtree
All right, so using the above data structure, here is the quadtree in C:
#ifndef QUADTREE_H
#define QUADTREE_H
#include "IntList.h"
#ifdef __cplusplus
#define QTREE_FUNC extern "C"
#else
#define QTREE_FUNC
#endif
typedef struct Quadtree Quadtree;
struct Quadtree
{
// Stores all the nodes in the quadtree. The first node in this
// sequence is always the root.
IntList nodes;
// Stores all the elements in the quadtree.
IntList elts;
// Stores all the element nodes in the quadtree.
IntList enodes;
// Stores the quadtree extents.
int root_mx, root_my, root_sx, root_sy;
// Maximum allowed elements in a leaf before the leaf is subdivided/split unless
// the leaf is at the maximum allowed tree depth.
int max_elements;
// Stores the maximum depth allowed for the quadtree.
int max_depth;
// Temporary buffer used for queries.
char* temp;
// Stores the size of the temporary buffer.
int temp_size;
};
// Function signature used for traversing a tree node.
typedef void QtNodeFunc(Quadtree* qt, void* user_data, int node, int depth, int mx, int my, int sx, int sy);
// Creates a quadtree with the requested extents, maximum elements per leaf, and maximum tree depth.
QTREE_FUNC void qt_create(Quadtree* qt, int width, int height, int max_elements, int max_depth);
// Destroys the quadtree.
QTREE_FUNC void qt_destroy(Quadtree* qt);
// Inserts a new element to the tree.
// Returns an index to the new element.
QTREE_FUNC int qt_insert(Quadtree* qt, int id, float x1, float y1, float x2, float y2);
// Removes the specified element from the tree.
QTREE_FUNC void qt_remove(Quadtree* qt, int element);
// Cleans up the tree, removing empty leaves.
QTREE_FUNC void qt_cleanup(Quadtree* qt);
// Outputs a list of elements found in the specified rectangle.
QTREE_FUNC void qt_query(Quadtree* qt, IntList* out, float x1, float y1, float x2, float y2, int omit_element);
// Traverses all the nodes in the tree, calling 'branch' for branch nodes and 'leaf'
// for leaf nodes.
QTREE_FUNC void qt_traverse(Quadtree* qt, void* user_data, QtNodeFunc* branch, QtNodeFunc* leaf);
#endif
#include "Quadtree.h"
#include <stdlib.h>
enum
{
// ----------------------------------------------------------------------------------------
// Element node fields:
// ----------------------------------------------------------------------------------------
enode_num = 2,
// Points to the next element in the leaf node. A value of -1
// indicates the end of the list.
enode_idx_next = 0,
// Stores the element index.
enode_idx_elt = 1,
// ----------------------------------------------------------------------------------------
// Element fields:
// ----------------------------------------------------------------------------------------
elt_num = 5,
// Stores the rectangle encompassing the element.
elt_idx_lft = 0, elt_idx_top = 1, elt_idx_rgt = 2, elt_idx_btm = 3,
// Stores the ID of the element.
elt_idx_id = 4,
// ----------------------------------------------------------------------------------------
// Node fields:
// ----------------------------------------------------------------------------------------
node_num = 2,
// Points to the first child if this node is a branch or the first element
// if this node is a leaf.
node_idx_fc = 0,
// Stores the number of elements in the node or -1 if it is not a leaf.
node_idx_num = 1,
// ----------------------------------------------------------------------------------------
// Node data fields:
// ----------------------------------------------------------------------------------------
nd_num = 6,
// Stores the extents of the node using a centered rectangle and half-size.
nd_idx_mx = 0, nd_idx_my = 1, nd_idx_sx = 2, nd_idx_sy = 3,
// Stores the index of the node.
nd_idx_index = 4,
// Stores the depth of the node.
nd_idx_depth = 5,
};
static void node_insert(Quadtree* qt, int index, int depth, int mx, int my, int sx, int sy, int element);
static int floor_int(float val)
{
return (int)val;
}
static int intersect(int l1, int t1, int r1, int b1,
int l2, int t2, int r2, int b2)
{
return l2 <= r1 && r2 >= l1 && t2 <= b1 && b2 >= t1;
}
void leaf_insert(Quadtree* qt, int node, int depth, int mx, int my, int sx, int sy, int element)
{
// Insert the element node to the leaf.
const int nd_fc = il_get(&qt->nodes, node, node_idx_fc);
il_set(&qt->nodes, node, node_idx_fc, il_insert(&qt->enodes));
il_set(&qt->enodes, il_get(&qt->nodes, node, node_idx_fc), enode_idx_next, nd_fc);
il_set(&qt->enodes, il_get(&qt->nodes, node, node_idx_fc), enode_idx_elt, element);
// If the leaf is full, split it.
if (il_get(&qt->nodes, node, node_idx_num) == qt->max_elements && depth < qt->max_depth)
{
int fc = 0, j = 0;
IntList elts = {0};
il_create(&elts, 1);
// Transfer elements from the leaf node to a list of elements.
while (il_get(&qt->nodes, node, node_idx_fc) != -1)
{
const int index = il_get(&qt->nodes, node, node_idx_fc);
const int next_index = il_get(&qt->enodes, index, enode_idx_next);
const int elt = il_get(&qt->enodes, index, enode_idx_elt);
// Pop off the element node from the leaf and remove it from the qt.
il_set(&qt->nodes, node, node_idx_fc, next_index);
il_erase(&qt->enodes, index);
// Insert element to the list.
il_set(&elts, il_push_back(&elts), 0, elt);
}
// Start by allocating 4 child nodes.
fc = il_insert(&qt->nodes);
il_insert(&qt->nodes);
il_insert(&qt->nodes);
il_insert(&qt->nodes);
il_set(&qt->nodes, node, node_idx_fc, fc);
// Initialize the new child nodes.
for (j=0; j < 4; ++j)
{
il_set(&qt->nodes, fc+j, node_idx_fc, -1);
il_set(&qt->nodes, fc+j, node_idx_num, 0);
}
// Transfer the elements in the former leaf node to its new children.
il_set(&qt->nodes, node, node_idx_num, -1);
for (j=0; j < il_size(&elts); ++j)
node_insert(qt, node, depth, mx, my, sx, sy, il_get(&elts, j, 0));
il_destroy(&elts);
}
else
{
// Increment the leaf element count.
il_set(&qt->nodes, node, node_idx_num, il_get(&qt->nodes, node, node_idx_num) + 1);
}
}
static void push_node(IntList* nodes, int nd_index, int nd_depth, int nd_mx, int nd_my, int nd_sx, int nd_sy)
{
const int back_idx = il_push_back(nodes);
il_set(nodes, back_idx, nd_idx_mx, nd_mx);
il_set(nodes, back_idx, nd_idx_my, nd_my);
il_set(nodes, back_idx, nd_idx_sx, nd_sx);
il_set(nodes, back_idx, nd_idx_sy, nd_sy);
il_set(nodes, back_idx, nd_idx_index, nd_index);
il_set(nodes, back_idx, nd_idx_depth, nd_depth);
}
static void find_leaves(IntList* out, const Quadtree* qt, int node, int depth,
int mx, int my, int sx, int sy,
int lft, int top, int rgt, int btm)
{
IntList to_process = {0};
il_create(&to_process, nd_num);
push_node(&to_process, node, depth, mx, my, sx, sy);
while (il_size(&to_process) > 0)
{
const int back_idx = il_size(&to_process) - 1;
const int nd_mx = il_get(&to_process, back_idx, nd_idx_mx);
const int nd_my = il_get(&to_process, back_idx, nd_idx_my);
const int nd_sx = il_get(&to_process, back_idx, nd_idx_sx);
const int nd_sy = il_get(&to_process, back_idx, nd_idx_sy);
const int nd_index = il_get(&to_process, back_idx, nd_idx_index);
const int nd_depth = il_get(&to_process, back_idx, nd_idx_depth);
il_pop_back(&to_process);
// If this node is a leaf, insert it to the list.
if (il_get(&qt->nodes, nd_index, node_idx_num) != -1)
push_node(out, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
else
{
// Otherwise push the children that intersect the rectangle.
const int fc = il_get(&qt->nodes, nd_index, node_idx_fc);
const int hx = nd_sx >> 1, hy = nd_sy >> 1;
const int l = nd_mx-hx, t = nd_my-hx, r = nd_mx+hx, b = nd_my+hy;
if (top <= nd_my)
{
if (lft <= nd_mx)
push_node(&to_process, fc+0, nd_depth+1, l,t,hx,hy);
if (rgt > nd_mx)
push_node(&to_process, fc+1, nd_depth+1, r,t,hx,hy);
}
if (btm > nd_my)
{
if (lft <= nd_mx)
push_node(&to_process, fc+2, nd_depth+1, l,b,hx,hy);
if (rgt > nd_mx)
push_node(&to_process, fc+3, nd_depth+1, r,b,hx,hy);
}
}
}
il_destroy(&to_process);
}
static void node_insert(Quadtree* qt, int index, int depth, int mx, int my, int sx, int sy, int element)
{
// Find the leaves and insert the element to all the leaves found.
int j = 0;
IntList leaves = {0};
const int lft = il_get(&qt->elts, element, elt_idx_lft);
const int top = il_get(&qt->elts, element, elt_idx_top);
const int rgt = il_get(&qt->elts, element, elt_idx_rgt);
const int btm = il_get(&qt->elts, element, elt_idx_btm);
il_create(&leaves, nd_num);
find_leaves(&leaves, qt, index, depth, mx, my, sx, sy, lft, top, rgt, btm);
for (j=0; j < il_size(&leaves); ++j)
{
const int nd_mx = il_get(&leaves, j, nd_idx_mx);
const int nd_my = il_get(&leaves, j, nd_idx_my);
const int nd_sx = il_get(&leaves, j, nd_idx_sx);
const int nd_sy = il_get(&leaves, j, nd_idx_sy);
const int nd_index = il_get(&leaves, j, nd_idx_index);
const int nd_depth = il_get(&leaves, j, nd_idx_depth);
leaf_insert(qt, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy, element);
}
il_destroy(&leaves);
}
void qt_create(Quadtree* qt, int width, int height, int max_elements, int max_depth)
{
qt->max_elements = max_elements;
qt->max_depth = max_depth;
qt->temp = 0;
qt->temp_size = 0;
il_create(&qt->nodes, node_num);
il_create(&qt->elts, elt_num);
il_create(&qt->enodes, enode_num);
// Insert the root node to the qt.
il_insert(&qt->nodes);
il_set(&qt->nodes, 0, node_idx_fc, -1);
il_set(&qt->nodes, 0, node_idx_num, 0);
// Set the extents of the root node.
qt->root_mx = width >> 1;
qt->root_my = height >> 1;
qt->root_sx = qt->root_mx;
qt->root_sy = qt->root_my;
}
void qt_destroy(Quadtree* qt)
{
il_destroy(&qt->nodes);
il_destroy(&qt->elts);
il_destroy(&qt->enodes);
free(qt->temp);
}
int qt_insert(Quadtree* qt, int id, float x1, float y1, float x2, float y2)
{
// Insert a new element.
const int new_element = il_insert(&qt->elts);
// Set the fields of the new element.
il_set(&qt->elts, new_element, elt_idx_lft, floor_int(x1));
il_set(&qt->elts, new_element, elt_idx_top, floor_int(y1));
il_set(&qt->elts, new_element, elt_idx_rgt, floor_int(x2));
il_set(&qt->elts, new_element, elt_idx_btm, floor_int(y2));
il_set(&qt->elts, new_element, elt_idx_id, id);
// Insert the element to the appropriate leaf node(s).
node_insert(qt, 0, 0, qt->root_mx, qt->root_my, qt->root_sx, qt->root_sy, new_element);
return new_element;
}
void qt_remove(Quadtree* qt, int element)
{
// Find the leaves.
int j = 0;
IntList leaves = {0};
const int lft = il_get(&qt->elts, element, elt_idx_lft);
const int top = il_get(&qt->elts, element, elt_idx_top);
const int rgt = il_get(&qt->elts, element, elt_idx_rgt);
const int btm = il_get(&qt->elts, element, elt_idx_btm);
il_create(&leaves, nd_num);
find_leaves(&leaves, qt, 0, 0, qt->root_mx, qt->root_my, qt->root_sx, qt->root_sy, lft, top, rgt, btm);
// For each leaf node, remove the element node.
for (j=0; j < il_size(&leaves); ++j)
{
const int nd_index = il_get(&leaves, j, nd_idx_index);
// Walk the list until we find the element node.
int node_index = il_get(&qt->nodes, nd_index, node_idx_fc);
int prev_index = -1;
while (node_index != -1 && il_get(&qt->enodes, node_index, enode_idx_elt) != element)
{
prev_index = node_index;
node_index = il_get(&qt->enodes, node_index, enode_idx_next);
}
if (node_index != -1)
{
// Remove the element node.
const int next_index = il_get(&qt->enodes, node_index, enode_idx_next);
if (prev_index == -1)
il_set(&qt->nodes, nd_index, node_idx_fc, next_index);
else
il_set(&qt->enodes, prev_index, enode_idx_next, next_index);
il_erase(&qt->enodes, node_index);
// Decrement the leaf element count.
il_set(&qt->nodes, nd_index, node_idx_num, il_get(&qt->nodes, nd_index, node_idx_num)-1);
}
}
il_destroy(&leaves);
// Remove the element.
il_erase(&qt->elts, element);
}
void qt_query(Quadtree* qt, IntList* out, float x1, float y1, float x2, float y2, int omit_element)
{
// Find the leaves that intersect the specified query rectangle.
int j = 0;
IntList leaves = {0};
const int elt_cap = il_size(&qt->elts);
const int qlft = floor_int(x1);
const int qtop = floor_int(y1);
const int qrgt = floor_int(x2);
const int qbtm = floor_int(y2);
if (qt->temp_size < elt_cap)
{
qt->temp_size = elt_cap;
qt->temp = realloc(qt->temp, qt->temp_size * sizeof(*qt->temp));
memset(qt->temp, 0, qt->temp_size * sizeof(*qt->temp));
}
// For each leaf node, look for elements that intersect.
il_create(&leaves, nd_num);
find_leaves(&leaves, qt, 0, 0, qt->root_mx, qt->root_my, qt->root_sx, qt->root_sy, qlft, qtop, qrgt, qbtm);
il_clear(out);
for (j=0; j < il_size(&leaves); ++j)
{
const int nd_index = il_get(&leaves, j, nd_idx_index);
// Walk the list and add elements that intersect.
int elt_node_index = il_get(&qt->nodes, nd_index, node_idx_fc);
while (elt_node_index != -1)
{
const int element = il_get(&qt->enodes, elt_node_index, enode_idx_elt);
const int lft = il_get(&qt->elts, element, elt_idx_lft);
const int top = il_get(&qt->elts, element, elt_idx_top);
const int rgt = il_get(&qt->elts, element, elt_idx_rgt);
const int btm = il_get(&qt->elts, element, elt_idx_btm);
if (!qt->temp[element] && element != omit_element && intersect(qlft,qtop,qrgt,qbtm, lft,top,rgt,btm))
{
il_set(out, il_push_back(out), 0, element);
qt->temp[element] = 1;
}
elt_node_index = il_get(&qt->enodes, elt_node_index, enode_idx_next);
}
}
il_destroy(&leaves);
// Unmark the elements that were inserted.
for (j=0; j < il_size(out); ++j)
qt->temp[il_get(out, j, 0)] = 0;
}
void qt_cleanup(Quadtree* qt)
{
IntList to_process = {0};
il_create(&to_process, 1);
// Only process the root if it's not a leaf.
if (il_get(&qt->nodes, 0, node_idx_num) == -1)
{
// Push the root index to the stack.
il_set(&to_process, il_push_back(&to_process), 0, 0);
}
while (il_size(&to_process) > 0)
{
// Pop a node from the stack.
const int node = il_get(&to_process, il_size(&to_process)-1, 0);
const int fc = il_get(&qt->nodes, node, node_idx_fc);
int num_empty_leaves = 0;
int j = 0;
il_pop_back(&to_process);
// Loop through the children.
for (j=0; j < 4; ++j)
{
const int child = fc + j;
// Increment empty leaf count if the child is an empty
// leaf. Otherwise if the child is a branch, add it to
// the stack to be processed in the next iteration.
if (il_get(&qt->nodes, child, node_idx_num) == 0)
++num_empty_leaves;
else if (il_get(&qt->nodes, child, node_idx_num) == -1)
{
// Push the child index to the stack.
il_set(&to_process, il_push_back(&to_process), 0, child);
}
}
// If all the children were empty leaves, remove them and
// make this node the new empty leaf.
if (num_empty_leaves == 4)
{
// Remove all 4 children in reverse order so that they
// can be reclaimed on subsequent insertions in proper
// order.
il_erase(&qt->nodes, fc + 3);
il_erase(&qt->nodes, fc + 2);
il_erase(&qt->nodes, fc + 1);
il_erase(&qt->nodes, fc + 0);
// Make this node the new empty leaf.
il_set(&qt->nodes, node, node_idx_fc, -1);
il_set(&qt->nodes, node, node_idx_num, 0);
}
}
il_destroy(&to_process);
}
void qt_traverse(Quadtree* qt, void* user_data, QtNodeFunc* branch, QtNodeFunc* leaf)
{
IntList to_process = {0};
il_create(&to_process, nd_num);
push_node(&to_process, 0, 0, qt->root_mx, qt->root_my, qt->root_sx, qt->root_sy);
while (il_size(&to_process) > 0)
{
const int back_idx = il_size(&to_process) - 1;
const int nd_mx = il_get(&to_process, back_idx, nd_idx_mx);
const int nd_my = il_get(&to_process, back_idx, nd_idx_my);
const int nd_sx = il_get(&to_process, back_idx, nd_idx_sx);
const int nd_sy = il_get(&to_process, back_idx, nd_idx_sy);
const int nd_index = il_get(&to_process, back_idx, nd_idx_index);
const int nd_depth = il_get(&to_process, back_idx, nd_idx_depth);
const int fc = il_get(&qt->nodes, nd_index, node_idx_fc);
il_pop_back(&to_process);
if (il_get(&qt->nodes, nd_index, node_idx_num) == -1)
{
// Push the children of the branch to the stack.
const int hx = nd_sx >> 1, hy = nd_sy >> 1;
const int l = nd_mx-hx, t = nd_my-hx, r = nd_mx+hx, b = nd_my+hy;
push_node(&to_process, fc+0, nd_depth+1, l,t, hx,hy);
push_node(&to_process, fc+1, nd_depth+1, r,t, hx,hy);
push_node(&to_process, fc+2, nd_depth+1, l,b, hx,hy);
push_node(&to_process, fc+3, nd_depth+1, r,b, hx,hy);
if (branch)
branch(qt, user_data, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
}
else if (leaf)
leaf(qt, user_data, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
}
il_destroy(&to_process);
}
Run Code Online (Sandbox Code Playgroud)
Temporary Conclusion
This isn't such a great answer but I'll try to come back and keep editing it. However, the above code should be very efficient on pretty much any language that allows contiguous arrays of plain old integers. As long as we have that contiguity guarantee, we can come up with a very cache-friendly quadtree which uses a very small mem footprint.
Please refer to the original answer for details on the overall approach.
Dra*_*rgy 12
对于这个答案,我将介绍一个不道德的技巧,如果数据合适(例如,它通常会出现在许多视频游戏中),可以让你的模拟运行速度提高一个数量级.它可以让您从数万到数十万个代理,或数十万代理到数百万代理.到目前为止,我还没有在我的答案中显示的任何演示中应用它,因为它有点作弊,但我已经在生产中使用它,它可以创造一个与众不同的世界.有趣的是,我没有看到它经常讨论.其实我从来没有见过它讨论过哪个很奇怪.
让我们回到指环王的例子.我们有许多"人性化"的单位,如人类,精灵,矮人,兽人和霍比特人,我们也有一些巨大的单位,如龙和蚂蚁.
"人性化"单位的规模差别不大.霍比特人可能有四英尺高,有点矮胖,一个兽人可能是6尺4.虽然存在一些差异,但这并不是一个史诗般的差异.这不是一个数量级的差异.
那么如果我们在一个霍比特人周围放置一个边界球体/盒子会发生什么呢?这个体育场就像一个兽人的边界球体/盒子的大小只是为了粗略的交叉点查询(在我们深入研究在粒度/精细水平上检查更真实的碰撞之前) )?有一点浪费的负空间,但真正有趣的事情发生了.
如果我们可以预见普通案例单位的上限,我们可以将它们存储在一个数据结构中,该数据结构假设所有事物都具有统一的上限大小.在这种情况下会发生几件非常有趣的事情:
存储一点
第二部分很棘手,但想象我们有这样一个案例:
好吧,如果我们查看绿色圆圈并搜索其半径,如果它只存储在我们的空间索引中的单个点,我们最终会错过蓝色圆圈的中心点.但是,如果我们搜索一个两倍于我们圆圈半径的区域呢?
在这种情况下,即使蓝色圆圈仅存储在我们的空间索引中的单个点(橙色的中心点),我们也会找到交叉点.只是为了直观地表明这是有效的:
在这种情况下,圆圈不相交,我们可以看到中心点甚至在扩展的双倍搜索半径之外.因此,只要我们在空间索引中搜索两倍半径,假设所有元素都具有统一的上限大小,如果我们搜索上限半径两倍的区域,我们就可以保证在粗略查询中找到它们(或者AABB的矩形半尺寸的两倍).
现在看起来似乎很浪费,就像检查我们的搜索查询中所需的更多单元/节点一样,但这只是因为我为了说明目的而绘制了图表.如果使用此策略,则将其用于大小通常是单个叶节点/单元大小的一小部分的元素.
巨大的优化
因此,您可以应用的巨大优化是将您的内容分为3种不同的类型:
因此,如果您可以应用它,那么将具有统一上限范围(边界球体或框)的公共区域元素分离的想法可能是非常有用的优化策略.这也是我没有看到的讨论.我经常看到开发人员在谈论分离动态内容和静态内容,但是通过进一步对常见情况类似大小的动态元素进行分组并将它们视为具有统一的上限大小,您可以获得同样多的改进(如果不是更多)粗碰撞测试,它具有允许它们存储为无限小点的效果,该点只插入紧密数据结构中的一个叶节点.
论"作弊"的好处
所以这个解决方案并不是特别聪明或有趣,但它背后的心态是我觉得值得一提的东西,至少对于像我这样的人来说.我浪费了我职业生涯中很大一部分寻找"超级"解决方案:一个适合所有数据结构和算法,可以精美地处理任何用例,希望能够花费额外的时间来获得它是的,然后在未来和不同的用例中疯狂地重用它,更不用说与寻求相同的同事一起工作了.
在性能不能过分牺牲生产力的情况下,热心寻求这样的解决方案既不会导致性能也不会导致生产力.因此,有时最好停下来查看软件特定数据要求的性质,看看我们是否可以"欺骗"并创建一些针对这些特殊要求的"定制",更加狭隘的解决方案,如本例所示.有时,这是一种最有效的方法,可以在不利于另一方的情况下过度妥协的情况下,将性能和生产力完美结合.
我希望人们不介意我发布第三个答案,但我再次跑出了角色限制.我结束了在Java的第二个答案中移植C代码.对于移植到面向对象语言的人来说,Java端口可能更容易引用.
class IntList
{
private int data[] = new int[128];
private int num_fields = 0;
private int num = 0;
private int cap = 128;
private int free_element = -1;
// Creates a new list of elements which each consist of integer fields.
// 'start_num_fields' specifies the number of integer fields each element has.
public IntList(int start_num_fields)
{
num_fields = start_num_fields;
}
// Returns the number of elements in the list.
int size()
{
return num;
}
// Returns the value of the specified field for the nth element.
int get(int n, int field)
{
assert n >= 0 && n < num && field >= 0 && field < num_fields;
return data[n*num_fields + field];
}
// Sets the value of the specified field for the nth element.
void set(int n, int field, int val)
{
assert n >= 0 && n < num && field >= 0 && field < num_fields;
data[n*num_fields + field] = val;
}
// Clears the list, making it empty.
void clear()
{
num = 0;
free_element = -1;
}
// Inserts an element to the back of the list and returns an index to it.
int pushBack()
{
final int new_pos = (num+1) * num_fields;
// If the list is full, we need to reallocate the buffer to make room
// for the new element.
if (new_pos > cap)
{
// Use double the size for the new capacity.
final int new_cap = new_pos * 2;
// Allocate new array and copy former contents.
int new_array[] = new int[new_cap];
System.arraycopy(data, 0, new_array, 0, cap);
data = new_array;
// Set the old capacity to the new capacity.
cap = new_cap;
}
return num++;
}
// Removes the element at the back of the list.
void popBack()
{
// Just decrement the list size.
assert num > 0;
--num;
}
// Inserts an element to a vacant position in the list and returns an index to it.
int insert()
{
// If there's a free index in the free list, pop that and use it.
if (free_element != -1)
{
final int index = free_element;
final int pos = index * num_fields;
// Set the free index to the next free index.
free_element = data[pos];
// Return the free index.
return index;
}
// Otherwise insert to the back of the array.
return pushBack();
}
// Removes the nth element in the list.
void erase(int n)
{
// Push the element to the free list.
final int pos = n * num_fields;
data[pos] = free_element;
free_element = n;
}
}
Run Code Online (Sandbox Code Playgroud)
Java四叉树
这里是Java中的四叉树(对不起,如果它不是非常惯用的;我在大约十年左右的时间里没有写过Java并忘记了很多东西):
interface IQtVisitor
{
// Called when traversing a branch node.
// (mx, my) indicate the center of the node's AABB.
// (sx, sy) indicate the half-size of the node's AABB.
void branch(Quadtree qt, int node, int depth, int mx, int my, int sx, int sy);
// Called when traversing a leaf node.
// (mx, my) indicate the center of the node's AABB.
// (sx, sy) indicate the half-size of the node's AABB.
void leaf(Quadtree qt, int node, int depth, int mx, int my, int sx, int sy);
}
class Quadtree
{
// Creates a quadtree with the requested extents, maximum elements per leaf, and maximum tree depth.
Quadtree(int width, int height, int start_max_elements, int start_max_depth)
{
max_elements = start_max_elements;
max_depth = start_max_depth;
// Insert the root node to the qt.
nodes.insert();
nodes.set(0, node_idx_fc, -1);
nodes.set(0, node_idx_num, 0);
// Set the extents of the root node.
root_mx = width / 2;
root_my = height / 2;
root_sx = root_mx;
root_sy = root_my;
}
// Outputs a list of elements found in the specified rectangle.
public int insert(int id, float x1, float y1, float x2, float y2)
{
// Insert a new element.
final int new_element = elts.insert();
// Set the fields of the new element.
elts.set(new_element, elt_idx_lft, floor_int(x1));
elts.set(new_element, elt_idx_top, floor_int(y1));
elts.set(new_element, elt_idx_rgt, floor_int(x2));
elts.set(new_element, elt_idx_btm, floor_int(y2));
elts.set(new_element, elt_idx_id, id);
// Insert the element to the appropriate leaf node(s).
node_insert(0, 0, root_mx, root_my, root_sx, root_sy, new_element);
return new_element;
}
// Removes the specified element from the tree.
public void remove(int element)
{
// Find the leaves.
final int lft = elts.get(element, elt_idx_lft);
final int top = elts.get(element, elt_idx_top);
final int rgt = elts.get(element, elt_idx_rgt);
final int btm = elts.get(element, elt_idx_btm);
IntList leaves = find_leaves(0, 0, root_mx, root_my, root_sx, root_sy, lft, top, rgt, btm);
// For each leaf node, remove the element node.
for (int j=0; j < leaves.size(); ++j)
{
final int nd_index = leaves.get(j, nd_idx_index);
// Walk the list until we find the element node.
int node_index = nodes.get(nd_index, node_idx_fc);
int prev_index = -1;
while (node_index != -1 && enodes.get(node_index, enode_idx_elt) != element)
{
prev_index = node_index;
node_index = enodes.get(node_index, enode_idx_next);
}
if (node_index != -1)
{
// Remove the element node.
final int next_index = enodes.get(node_index, enode_idx_next);
if (prev_index == -1)
nodes.set(nd_index, node_idx_fc, next_index);
else
enodes.set(prev_index, enode_idx_next, next_index);
enodes.erase(node_index);
// Decrement the leaf element count.
nodes.set(nd_index, node_idx_num, nodes.get(nd_index, node_idx_num)-1);
}
}
// Remove the element.
elts.erase(element);
}
// Cleans up the tree, removing empty leaves.
public void cleanup()
{
IntList to_process = new IntList(1);
// Only process the root if it's not a leaf.
if (nodes.get(0, node_idx_num) == -1)
{
// Push the root index to the stack.
to_process.set(to_process.pushBack(), 0, 0);
}
while (to_process.size() > 0)
{
// Pop a node from the stack.
final int node = to_process.get(to_process.size()-1, 0);
final int fc = nodes.get(node, node_idx_fc);
int num_empty_leaves = 0;
to_process.popBack();
// Loop through the children.
for (int j=0; j < 4; ++j)
{
final int child = fc + j;
// Increment empty leaf count if the child is an empty
// leaf. Otherwise if the child is a branch, add it to
// the stack to be processed in the next iteration.
if (nodes.get(child, node_idx_num) == 0)
++num_empty_leaves;
else if (nodes.get(child, node_idx_num) == -1)
{
// Push the child index to the stack.
to_process.set(to_process.pushBack(), 0, child);
}
}
// If all the children were empty leaves, remove them and
// make this node the new empty leaf.
if (num_empty_leaves == 4)
{
// Remove all 4 children in reverse order so that they
// can be reclaimed on subsequent insertions in proper
// order.
nodes.erase(fc + 3);
nodes.erase(fc + 2);
nodes.erase(fc + 1);
nodes.erase(fc + 0);
// Make this node the new empty leaf.
nodes.set(node, node_idx_fc, -1);
nodes.set(node, node_idx_num, 0);
}
}
}
// Returns a list of elements found in the specified rectangle.
public IntList query(float x1, float y1, float x2, float y2)
{
return query(x1, y1, x2, y2, -1);
}
// Returns a list of elements found in the specified rectangle excluding the
// specified element to omit.
public IntList query(float x1, float y1, float x2, float y2, int omit_element)
{
IntList out = new IntList(1);
// Find the leaves that intersect the specified query rectangle.
final int qlft = floor_int(x1);
final int qtop = floor_int(y1);
final int qrgt = floor_int(x2);
final int qbtm = floor_int(y2);
IntList leaves = find_leaves(0, 0, root_mx, root_my, root_sx, root_sy, qlft, qtop, qrgt, qbtm);
if (temp_size < elts.size())
{
temp_size = elts.size();
temp = new boolean[temp_size];;
}
// For each leaf node, look for elements that intersect.
for (int j=0; j < leaves.size(); ++j)
{
final int nd_index = leaves.get(j, nd_idx_index);
// Walk the list and add elements that intersect.
int elt_node_index = nodes.get(nd_index, node_idx_fc);
while (elt_node_index != -1)
{
final int element = enodes.get(elt_node_index, enode_idx_elt);
final int lft = elts.get(element, elt_idx_lft);
final int top = elts.get(element, elt_idx_top);
final int rgt = elts.get(element, elt_idx_rgt);
final int btm = elts.get(element, elt_idx_btm);
if (!temp[element] && element != omit_element && intersect(qlft,qtop,qrgt,qbtm, lft,top,rgt,btm))
{
out.set(out.pushBack(), 0, element);
temp[element] = true;
}
elt_node_index = enodes.get(elt_node_index, enode_idx_next);
}
}
// Unmark the elements that were inserted.
for (int j=0; j < out.size(); ++j)
temp[out.get(j, 0)] = false;
return out;
}
// Traverses all the nodes in the tree, calling 'branch' for branch nodes and 'leaf'
// for leaf nodes.
public void traverse(IQtVisitor visitor)
{
IntList to_process = new IntList(nd_num);
pushNode(to_process, 0, 0, root_mx, root_my, root_sx, root_sy);
while (to_process.size() > 0)
{
final int back_idx = to_process.size() - 1;
final int nd_mx = to_process.get(back_idx, nd_idx_mx);
final int nd_my = to_process.get(back_idx, nd_idx_my);
final int nd_sx = to_process.get(back_idx, nd_idx_sx);
final int nd_sy = to_process.get(back_idx, nd_idx_sy);
final int nd_index = to_process.get(back_idx, nd_idx_index);
final int nd_depth = to_process.get(back_idx, nd_idx_depth);
final int fc = nodes.get(nd_index, node_idx_fc);
to_process.popBack();
if (nodes.get(nd_index, node_idx_num) == -1)
{
// Push the children of the branch to the stack.
final int hx = nd_sx >> 1, hy = nd_sy >> 1;
final int l = nd_mx-hx, t = nd_my-hx, r = nd_mx+hx, b = nd_my+hy;
pushNode(to_process, fc+0, nd_depth+1, l,t, hx,hy);
pushNode(to_process, fc+1, nd_depth+1, r,t, hx,hy);
pushNode(to_process, fc+2, nd_depth+1, l,b, hx,hy);
pushNode(to_process, fc+3, nd_depth+1, r,b, hx,hy);
visitor.branch(this, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
}
else
visitor.leaf(this, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
}
}
private static int floor_int(float val)
{
return (int)val;
}
private static boolean intersect(int l1, int t1, int r1, int b1,
int l2, int t2, int r2, int b2)
{
return l2 <= r1 && r2 >= l1 && t2 <= b1 && b2 >= t1;
}
private static void pushNode(IntList nodes, int nd_index, int nd_depth, int nd_mx, int nd_my, int nd_sx, int nd_sy)
{
final int back_idx = nodes.pushBack();
nodes.set(back_idx, nd_idx_mx, nd_mx);
nodes.set(back_idx, nd_idx_my, nd_my);
nodes.set(back_idx, nd_idx_sx, nd_sx);
nodes.set(back_idx, nd_idx_sy, nd_sy);
nodes.set(back_idx, nd_idx_index, nd_index);
nodes.set(back_idx, nd_idx_depth, nd_depth);
}
private IntList find_leaves(int node, int depth,
int mx, int my, int sx, int sy,
int lft, int top, int rgt, int btm)
{
IntList leaves = new IntList(nd_num);
IntList to_process = new IntList(nd_num);
pushNode(to_process, node, depth, mx, my, sx, sy);
while (to_process.size() > 0)
{
final int back_idx = to_process.size() - 1;
final int nd_mx = to_process.get(back_idx, nd_idx_mx);
final int nd_my = to_process.get(back_idx, nd_idx_my);
final int nd_sx = to_process.get(back_idx, nd_idx_sx);
final int nd_sy = to_process.get(back_idx, nd_idx_sy);
final int nd_index = to_process.get(back_idx, nd_idx_index);
final int nd_depth = to_process.get(back_idx, nd_idx_depth);
to_process.popBack();
// If this node is a leaf, insert it to the list.
if (nodes.get(nd_index, node_idx_num) != -1)
pushNode(leaves, nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy);
else
{
// Otherwise push the children that intersect the rectangle.
final int fc = nodes.get(nd_index, node_idx_fc);
final int hx = nd_sx / 2, hy = nd_sy / 2;
final int l = nd_mx-hx, t = nd_my-hx, r = nd_mx+hx, b = nd_my+hy;
if (top <= nd_my)
{
if (lft <= nd_mx)
pushNode(to_process, fc+0, nd_depth+1, l,t,hx,hy);
if (rgt > nd_mx)
pushNode(to_process, fc+1, nd_depth+1, r,t,hx,hy);
}
if (btm > nd_my)
{
if (lft <= nd_mx)
pushNode(to_process, fc+2, nd_depth+1, l,b,hx,hy);
if (rgt > nd_mx)
pushNode(to_process, fc+3, nd_depth+1, r,b,hx,hy);
}
}
}
return leaves;
}
private void node_insert(int index, int depth, int mx, int my, int sx, int sy, int element)
{
// Find the leaves and insert the element to all the leaves found.
final int lft = elts.get(element, elt_idx_lft);
final int top = elts.get(element, elt_idx_top);
final int rgt = elts.get(element, elt_idx_rgt);
final int btm = elts.get(element, elt_idx_btm);
IntList leaves = find_leaves(index, depth, mx, my, sx, sy, lft, top, rgt, btm);
for (int j=0; j < leaves.size(); ++j)
{
final int nd_mx = leaves.get(j, nd_idx_mx);
final int nd_my = leaves.get(j, nd_idx_my);
final int nd_sx = leaves.get(j, nd_idx_sx);
final int nd_sy = leaves.get(j, nd_idx_sy);
final int nd_index = leaves.get(j, nd_idx_index);
final int nd_depth = leaves.get(j, nd_idx_depth);
leaf_insert(nd_index, nd_depth, nd_mx, nd_my, nd_sx, nd_sy, element);
}
}
private void leaf_insert(int node, int depth, int mx, int my, int sx, int sy, int element)
{
// Insert the element node to the leaf.
final int nd_fc = nodes.get(node, node_idx_fc);
nodes.set(node, node_idx_fc, enodes.insert());
enodes.set(nodes.get(node, node_idx_fc), enode_idx_next, nd_fc);
enodes.set(nodes.get(node, node_idx_fc), enode_idx_elt, element);
// If the leaf is full, split it.
if (nodes.get(node, node_idx_num) == max_elements && depth < max_depth)
{
// Transfer elements from the leaf node to a list of elements.
IntList elts = new IntList(1);
while (nodes.get(node, node_idx_fc) != -1)
{
final int index = nodes.get(node, node_idx_fc);
final int next_index = enodes.get(index, enode_idx_next);
final int elt = enodes.get(index, enode_idx_elt);
// Pop off the element node from the leaf and remove it from the qt.
nodes.set(node, node_idx_fc, next_index);
enodes.erase(index);
// Insert element to the list.
elts.set(elts.pushBack(), 0, elt);
}
// Start by allocating 4 child nodes.
final int fc = nodes.insert();
nodes.insert();
nodes.insert();
nodes.insert();
nodes.set(node, node_idx_fc, fc);
// Initialize the new child nodes.
for (int j=0; j < 4; ++j)
{
nodes.set(fc+j, node_idx_fc, -1);
nodes.set(fc+j, node_idx_num, 0);
}
// Transfer the elements in the former leaf node to its new children.
nodes.set(node, node_idx_num, -1);
for (int j=0; j < elts.size(); ++j)
node_insert(node, depth, mx, my, sx, sy, elts.get(j, 0));
}
else
{
// Increment the leaf element count.
nodes.set(node, node_idx_num, nodes.get(node, node_idx_num) + 1);
}
}
// ----------------------------------------------------------------------------------------
// Element node fields:
// ----------------------------------------------------------------------------------------
// Points to the next element in the leaf node. A value of -1
// indicates the end of the list.
static final int enode_idx_next = 0;
// Stores the element index.
static final int enode_idx_elt = 1;
// Stores all the element nodes in the quadtree.
private IntList enodes = new IntList(2);
// ----------------------------------------------------------------------------------------
// Element fields:
// ----------------------------------------------------------------------------------------
// Stores the rectangle encompassing the element.
static final int elt_idx_lft = 0, elt_idx_top = 1, elt_idx_rgt = 2, elt_idx_btm = 3;
// Stores the ID of the element.
static final int elt_idx_id = 4;
// Stores all the elements in the quadtree.
private IntList elts = new IntList(5);
// ----------------------------------------------------------------------------------------
// Node fields:
// ----------------------------------------------------------------------------------------
// Points to the first child if this node is a branch or the first element
// if this node is a leaf.
static final int node_idx_fc = 0;
// Stores the number of elements in the node or -1 if it is not a leaf.
static final int node_idx_num = 1;
// Stores all the nodes in the quadtree. The first node in this
// sequence is always the root.
private IntList nodes = new IntList(2);
// ----------------------------------------------------------------------------------------
// Node data fields:
// ----------------------------------------------------------------------------------------
static final int nd_num = 6;
// Stores the extents of the node using a centered rectangle and half-size.
static final int nd_idx_mx = 0, nd_idx_my = 1, nd_idx_sx = 2, nd_idx_sy = 3;
// Stores the index of the node.
static final int nd_idx_index = 4;
// Stores the depth of the node.
static final int nd_idx_depth = 5;
// ----------------------------------------------------------------------------------------
// Data Members
// ----------------------------------------------------------------------------------------
// Temporary buffer used for queries.
private boolean temp[];
// Stores the size of the temporary buffer.
private int temp_size = 0;
// Stores the quadtree extents.
private int root_mx, root_my, root_sx, root_sy;
// Maximum allowed elements in a leaf before the leaf is subdivided/split unless
// the leaf is at the maximum allowed tree depth.
private int max_elements;
// Stores the maximum depth allowed for the quadtree.
private int max_depth;
}
Run Code Online (Sandbox Code Playgroud)
临时结论
再说一遍,这是一个代码转储答案.我会回来编辑它并尝试解释越来越多的东西.
有关整体方法的详细信息,请参阅原始答案.