Pat*_*ick 6 python sql tree hierarchical-data transitive-closure-table
我正在尝试向应用程序添加类似reddit的注释,我决定使用闭包表模式进行数据库组织.我的app数据库看起来有点像这样:
+----+-------+
| id | title |
+----+-------+
| 1 | Hello |
+----+-------+
Run Code Online (Sandbox Code Playgroud)
+----+-----------+---------+------+
| id | parent_id | post_id | text |
+----+-----------+---------+------+
| 1 | NULL | 1 | ... |
| 2 | 1 | 1 | ... |
| 3 | 2 | 1 | ... |
| 4 | 3 | 1 | ... |
| 5 | 3 | 1 | ... |
| 6 | 5 | 1 | ... |
| 7 | NULL | 1 | ... |
| 8 | 7 | 1 | ... |
| 9 | 4 | 1 | ... |
+----+-----------+---------+------+
Run Code Online (Sandbox Code Playgroud)
+-----------+----------+-------+
| parent_id | child_id | depth |
+-----------+----------+-------+
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 1 | 2 | 1 |
| 3 | 3 | 0 |
| 2 | 3 | 1 |
| 1 | 3 | 2 |
| 4 | 4 | 0 |
| 3 | 4 | 1 |
| 2 | 4 | 2 |
| 1 | 4 | 3 |
[...snip...]
Run Code Online (Sandbox Code Playgroud)
现在,我正在运行此查询:
SELECT c.*, p.*
FROM comments AS c
JOIN comment_paths AS p
ON c.id = p.child_id
WHERE p.parent_id IN
(SELECT c2.id FROM comments AS c2 WHERE c2.parent_id IS NULL AND c2.post_id = 1)
Run Code Online (Sandbox Code Playgroud)
获取基于他们的评论列表post_id.返回的数据是:
+------+-------------+-----------+--------+-------------+------------+---------+
| c.id | c.parent_id | c.post_id | c.text | p.parent_id | p.child_id | p.depth |
+------+-------------+-----------+--------+-------------+------------+---------+
| 1 | NULL | 1 | ... | 1 | 1 | 0 |
| 2 | 1 | 1 | ... | 1 | 2 | 1 |
| 3 | 2 | 1 | ... | 1 | 3 | 2 |
| 4 | 3 | 1 | ... | 1 | 4 | 3 |
| 5 | 3 | 1 | ... | 1 | 5 | 3 |
| 6 | 5 | 1 | ... | 1 | 6 | 4 |
| 9 | 4 | 1 | ... | 1 | 9 | 4 |
| 7 | NULL | 1 | ... | 7 | 7 | 0 |
| 8 | 7 | 1 | ... | 7 | 8 | 1 |
+------+-------------+-----------+--------+-------------+------------+---------+
Run Code Online (Sandbox Code Playgroud)
这代表树:
[1]
|[2]
| |[3]
| |[4]
| | |[9]
| [5]
| |[6]
[7]
|[8]
Run Code Online (Sandbox Code Playgroud)
但是,我正在努力将返回的数据转换为Python树结构.从本质上讲,我的目标是这个问题,这个问题就最终输出(HTML)而言,但我真的不想求助于递归的SQL语句,因为我已经有了这些信息.我认为某种递归是必要的,因为我想得到类似于此的结构:
[
{
'id': 1,
...
'children': [
{
'id': 2,
...
'children': [ ... ]
}
]
},
{
'id': 7,
...
'children': [
{
'id': 8,
...
}
]
}
]
Run Code Online (Sandbox Code Playgroud)
基本上是一个嵌套的字典列表,所以我可以使用Jinja的递归循环遍历它们.有没有人有想法?
谢谢!
搞乱,我有一个"工作"的解决方案,虽然它做了很多迭代,所以我不想把它标记为这个问题的答案.我使用的解决方案是:
comment_set = ... # previous query to grab data set
def create_tree(parent):
parent['children'] = []
for record in comment_set:
if record['parent_id'] == parent['id']:
parent['children'].append(create_tree(record))
return parent
comment_tree = []
for record in comment_set:
if record['parent_id'] is None: # if this is the start of a tree
comment_tree.append(create_tree(record))
Run Code Online (Sandbox Code Playgroud)
它并不理想,因为它会在comment_set每次create_tree()调用时进行迭代,这是集合中的每个记录.但是,这是我现在最好的.有人有什么想法?
您不需要递归,只需要能够通过引用处理节点对象.
这是一个在线性时间内创建嵌套数据结构的代码示例.将其视为伪代码,因为我没有测试过这个,但我还没有精通python.
我使用两个for循环的原因是否则我们必须假设树顶部的节点在树中深处的节点之前处理.如下所示有两个循环,不需要这样的假设.
for record in comment_set:
nodes[record['id']] = record
for record in comment_set:
if record['parent_id'] in nodes:
nodes[record['parent_id']]['children'].append(record)
else
top = record;
return top
Run Code Online (Sandbox Code Playgroud)
到那个循环结束时:
nodes 将是层次结构中所有节点的一维字典.top 应该是唯一没有父节点的节点(即根节点).这类似于我在过去的SO帖子中发布的示例,将数据库结果转换为数组: