jcz*_*lew 6 postgresql postgresql-9.3
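This is very similar to this question about contiguous ranges, and to this one about grouping by sequential numbers, but differs in that the sequence is not numeric. Given the following relationships, stored as key pairs: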
a -- b -- c      e -- f -- g
     |  /
     | /
     d
Here is the table with sample data (also on SQLFiddle):
CREATE TABLE relationships (
name varchar(1),
related varchar(1)
);
INSERT INTO relationships (name, related) VALUES
('a', 'a'),
('a', 'b'),
('b', 'b'),
('b', 'a'),
('b', 'c'),
('b', 'd'),
('c', 'c'),
('c', 'b'),
('c', 'd'),
('d', 'd'),
('d', 'c'),
('d', 'b'),
('e', 'e'),
('e', 'f'),
('f', 'f'),
('f', 'e'),
('f', 'g'),
('g', 'g');
What is the most efficient way to produce output like this:
| group | members      |
------------------------
| 1     | {a, b, c, d} |
| 2     | {e, f, g}    |
Or this:
| name | group |
-----------------
| a | 1 |
| b | 1 |
| c | 1 |
| d | 1 |
| e | 2 |
| f | 2 |
| g | 2 |
I have considered doing this outside Postgres, but it seems there must be a way to achieve this result with window functions or PL/pgSQL.
Ugly, but it works!
First, define array_agg_mult, taken from this question:
CREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND = '{}'
);
Then run the query:
WITH summary AS (
    -- one row per name, with the array of names it touches directly
    SELECT name, array_agg(related) AS touches
    FROM relationships
    GROUP BY name
),
grouped AS (
    -- for each name, merge the touches arrays of every row that overlaps its own
    SELECT name, (
        SELECT array_agg(uniques) FROM (
            SELECT DISTINCT unnest(array_agg_mult(sub.touches)) AS uniques
            ORDER BY uniques
        ) x
    ) my_group
    FROM summary LEFT JOIN LATERAL (
        SELECT touches
        FROM summary r
        WHERE summary.touches && r.touches
        GROUP BY name, touches
    ) sub ON true
    GROUP BY summary.name
    ORDER BY summary.name
)
-- collapse identical member arrays and number them
SELECT DISTINCT my_group, row_number() OVER () AS group_id
FROM grouped
GROUP BY my_group;
This produces the following result:
| my_group | group_id |
| {a,b,c,d} | 2 |
| {e,f,g} | 1 |
SQLFiddle here - http://sqlfiddle.com/#!15/c8a5b/20. I'm new to this kind of query, so please let me know if there is a more efficient way to do this!
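One thing to watch for: the query above merges only rows whose touches arrays overlap directly, so a longer chain (say a -- b -- c -- d -- e with no shortcuts) may not collapse into a single group. A recursive CTE is one possible way to follow chains of any length. The following is only a minimal sketch against the relationships table from the question, not a tuned solution; the CTE names edges, reach and labelled are made up for illustration, and it assumes an edge may be stored in one direction only (as with f and g in the sample data), so edges are mirrored first.

```sql
WITH RECURSIVE edges AS (
    -- treat every pair as undirected by adding the reversed pair
    SELECT name, related FROM relationships
    UNION
    SELECT related, name FROM relationships
),
reach (name, member) AS (
    -- start from the direct neighbours of each name
    SELECT name, related FROM edges
    UNION   -- UNION (not UNION ALL) discards repeats, so cycles terminate
    SELECT r.name, e.related
    FROM reach r
    JOIN edges e ON e.name = r.member
),
labelled AS (
    -- the smallest reachable name serves as the label of the group
    SELECT name, min(member) AS root
    FROM reach
    GROUP BY name
)
SELECT dense_rank() OVER (ORDER BY root) AS group_id,
       array_agg(name ORDER BY name)     AS members
FROM labelled
GROUP BY root
ORDER BY group_id;
```

On the sample data this should give {a,b,c,d} as group 1 and {e,f,g} as group 2. Note that reach holds one row per (name, reachable member) pair, so it grows quadratically with the size of a group; for very large groups a PL/pgSQL loop or processing outside the database may be the better trade-off.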