在 SQL 中实现不相交集逼近（并集查找）

Question

在 SQL 中实现不相交集逼近（并集查找）

sup*_*pyo 6 sql postgresql graph-theory

使用 SQL 实现近似不相交集的最佳方法是什么？

细节

我有一个边表，存储为[vertex_a, vertex_b].

我需要一个不同集合的表，存储为[vertex, set_id]每个顶点一行，并用不相交的 set_id 标记每个顶点。

约束条件

必须是纯 SQL 实现。它可以利用 Postgres 特定的函数，但高度首选纯 ANSI SQL。
结果可以是近似的——当几个集合实际连接时将它们标记为不相交是可以接受的。如果可以调整近似边界，例如通过增加迭代次数，那就更好了。
库已经退出（没有 Boost、Numpy、Scipy）。必须是SQL。
大多数集合将包含 1 到 3 个顶点。很少有大集合，预计最多为 10 个顶点。

有关的

相关主题：在 C++ 中实现不相交集（并集查找）
这将是不相交集（并集查找）的近似实现- 维基百科

Answer 1

小智 1

我实际上正在解决同样的问题。不幸的是，我认为无法找到非常有效的解决方案 - 至少仅使用 SQL 是不容易的。只需删除“不同”和自消除查询即可观察工作集的大小。也就是说，以下解决方案将起作用。

drop table if exists foobar;
drop function if exists addset(v int[], a int);
/* our vertices table */
create table foobar (
   src int,
   dst int
);

/* Create a small function to treat an array like a set, 
   not strictly necessary but convenient */
create function addset(v int[], a int) returns int[]
as $$
begin
    return (select array_agg(e order by e) 
                   from (select unnest(v || a) as e) f);
end
$$ language plpgsql;

/* fill our table with vertices, note the ordering of each value */
insert into foobar (src, dst) 
     values (1,2), (1,3), (2,3),  (3,4), (4,5), (6,7), (6,8);
/* use a recursive query to extend the sets */
with recursive foo_union (v) as (
    select array[src, dst] as v from foobar /* starter sets */
    union all
    /* join self to original array; i can use a CTE as a 'starter',
       but that is not necessary here */
    select addset(v, dst) as v from foo_union u, foobar f
        where src = any(v) and not dst = any(v)
 ) select distinct v from foo_union a where not exists (
     /* eliminate the many overlapping results */
     select * from foo_union b where b.v @> a.v and b.v != a.v
 );

Run Code Online (Sandbox Code Playgroud)

但同样，这对于较大的数据集是完全不切实际的；任何其他解决方案都需要迭代。

归档时间：	10 年，5 月前
查看次数：	2547 次
最近记录：	2 年，8 月前