Ali*_*ani 3 postgresql performance view partitioning subquery query-performance
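Using PostgreSQL 9.1, we are having a problem with queries against a PostgreSQL VIEW. Here is the situation:
We have a partitioned table "buz_scdr" on which we built a view "Swiss_client_wise_minutes_and_profit". The purpose of this VIEW is to join data from different tables (including the "buz_scdr" table) for efficient querying. This strategy worked well until the table "buz_scdr" became huge (the total number of records across all partitions; the table is partitioned by date).
Queries on this VIEW started taking a long time (about 5 to 10 minutes). To find out why, we used the EXPLAIN command to display the execution plan. The query we used is: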
EXPLAIN SELECT * from "Swiss_client_wise_minutes_and_profit" where start_time = '2012-7-22 08:00';
The result is on explain.depesz.com, or as follows:
Subquery Scan on "Swiss_client_wise_minutes_and_profit" (cost=2127919.71..94874537.55 rows=40474 width=677)
Filter: ("Swiss_client_wise_minutes_and_profit".start_time = '2012-07-22 08:00:00+00'::timestamp with time zone)
-> WindowAgg (cost=2127919.71..94773352.06 rows=8094839 width=148)
-> Sort (cost=2127919.71..2148156.81 rows=8094839 width=148)
Sort Key: cc.name, rdga.group_id
-> Hash Left Join (cost=1661.50..604234.77 rows=8094839 width=148)
Hash Cond: (((cc.company_id)::text = (rdga.company_id)::text) AND ((cs.c_prefix_id)::text = (rdga.dest_id)::text))
-> Hash Left Join (cost=7.88..460615.39 rows=8094839 width=123)
Hash Cond: ((cs.client_name_id)::text = (cc."Alias_name")::text)
-> Append (cost=0.00..349303.48 rows=8094839 width=111)
-> Seq Scan on "Swiss_buz_scdr" cs (cost=0.00..1.06 rows=1 width=610)
Filter: ((customer_name)::text = 'SSP Root'::text)
-> Seq Scan on scdr_buz__2012_07_11 cs (cost=0.00..349302.41 rows=8094838 width=111)
Filter: ((customer_name)::text = 'SSP Root'::text)
-> Hash (cost=5.17..5.17 rows=217 width=24)
-> Seq Scan on "Corporate_companyalias" cc (cost=0.00..5.17 rows=217 width=24)
-> Hash (cost=1334.42..1334.42 rows=21280 width=50)
-> Hash Join (cost=169.56..1334.42 rows=21280 width=50)
Hash Cond: ((rdga.company_id)::text = (c.name)::text)
-> Hash Join (cost=162.68..1034.93 rows=21280 width=50)
Hash Cond: ((rdga.group_id)::text = (rdg.name)::text)
-> Seq Scan on "RateManagement_destgroupassign" rdga (cost=0.00..497.35 rows=25935 width=40)
-> Hash (cost=123.64..123.64 rows=3123 width=32)
-> Hash Join (cost=13.08..123.64 rows=3123 width=32)
Hash Cond: (rdg.country_id = cc.id)
-> Seq Scan on "RateManagement_destinationgroup" rdg (cost=0.00..65.06 rows=3806 width=26)
-> Hash (cost=7.48..7.48 rows=448 width=14)
-> Seq Scan on "Corporate_country" cc (cost=0.00..7.48 rows=448 width=14)
-> Hash (cost=4.17..4.17 rows=217 width=16)
-> Seq Scan on "Corporate_company" c (cost=0.00..4.17 rows=217 width=16)
SubPlan 1
-> Seq Scan on "Corporate_companyalias" cc (cost=0.00..5.71 rows=1 width=12)
Filter: (("Alias_name")::text = (cs.client_name_id)::text)
SubPlan 2
-> Seq Scan on "Corporate_companyalias" cc (cost=0.00..5.71 rows=1 width=12)
Filter: (("Alias_name")::text = (cs.vendor_name_id)::text)
(36 rows)
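The EXPLAIN output above shows that our query sequentially scans the "buz_scdr" table, which contains 8094838 records in total. The query on the VIEW does not honor the partition constraints (date) of "buz_scdr", which causes it to scan the whole table.
As an experiment, we ran a query directly on the "buz_scdr" table with a WHERE clause on date and time; it honored the partition constraints and did not scan the whole table. This shows that queries run directly on the partitioned table work as expected, but something is wrong with the VIEW built on it.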
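For reference, the direct check described above might look like the following sketch. It assumes the partition CHECK constraints are defined on `start_time` (the post only says the table is partitioned by date) and uses the parent table name "Swiss_buz_scdr" as it appears in the plan and the view definition.

```sql
-- Confirm that constraint exclusion is enabled.
-- 'partition' is the default setting in PostgreSQL 9.1.
SHOW constraint_exclusion;

-- Query the partitioned parent directly with the same filters the view uses.
-- If exclusion works, the Append node should list only the matching child table(s).
-- Assumption: the child tables carry CHECK constraints on start_time.
EXPLAIN
SELECT *
FROM   "Swiss_buz_scdr"
WHERE  customer_name = 'SSP Root'
AND    start_time = '2012-07-22 08:00';
```

Comparing this plan with the plan for the VIEW shows where the `start_time` filter is applied in each case.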
Is this a general problem with PostgreSQL views, or am I missing something?
Edit: Here is the DDL of the view "Swiss_client_wise_minutes_and_profit":
CREATE VIEW "Swiss_client_wise_minutes_and_profit"
AS SELECT ROW_NUMBER() OVER (ORDER BY rp.country, rp.destination)
As id, (SELECT company_id FROM "Corporate_companyalias"
AS cc WHERE cc."Alias_name" = client_name_id)
AS client_name, (SELECT company_id FROM "Corporate_companyalias"
AS cc WHERE cc."Alias_name" = vendor_name_id) AS vendor_name, cs.c_prefix_id
AS c_prefix, cs.v_prefix_id AS v_prefix, rp.country, rp.destination, cs.c_total_calls, cs.v_total_calls, cs.successful_calls, cs.billed_duration, cs.v_billed_amount AS cost, cs.c_billed_amount
AS revenue, cs.c_pdd AS pdd, cs.profit, cs.start_time, cs.end_time, cs.switch_name FROM "Swiss_buz_scdr"
AS cs LEFT JOIN "Corporate_companyalias" AS cc ON cs.client_name_id = cc."Alias_name" LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup"
AS rp ON rp.client_name = cc.company_id AND rp.prefix = cs.c_prefix_id WHERE cs.customer_name = 'SSP Root';
Edit 2: Here is the link to the output of the EXPLAIN ANALYZE command:
It helps to format the query properly to see what is going on. I looked at your query and found suspicious SQL:
CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY rp.country, rp.destination) AS id
, (SELECT company_id
FROM "Corporate_companyalias" AS cc
WHERE cc."Alias_name" = client_name_id) AS client_name
, (SELECT company_id
FROM "Corporate_companyalias" AS cc
WHERE cc."Alias_name" = vendor_name_id) AS vendor_name
, cs.c_prefix_id AS c_prefix
, cs.v_prefix_id AS v_prefix
, rp.country
, rp.destination
, cs.c_total_calls
, cs.v_total_calls
, cs.successful_calls
, cs.billed_duration
, cs.v_billed_amount AS cost
, cs.c_billed_amount AS revenue
, cs.c_pdd AS pdd
, cs.profit
, cs.start_time
, cs.end_time
, cs.switch_name
FROM "Swiss_buz_scdr" AS cs
LEFT JOIN "Corporate_companyalias" AS cc ON cs.client_name_id = cc."Alias_name"
LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup" AS rp
ON rp.client_name = cc.company_id AND rp.prefix = cs.c_prefix_id
WHERE cs.customer_name = 'SSP Root';
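Don't use the same table alias cc in the outer and the inner SELECT. While this is not illegal, it helps to confuse you.
Without table qualification on the references to the outer query, I am not sure where the columns client_name_id and vendor_name_id bind. I would need to see the table definitions, but I suspect it results in CROSS JOINs, which is probably not what you want and is the root of the problem.
I suspect the correlated subqueries can be rewritten as plain expressions. Maybe that needs another JOIN. Here is my ...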
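To make the binding explicit, each correlated subquery could qualify every column and use an alias that does not clash with the outer query. The sketch below is only an illustration: the alias `ca` is hypothetical, and it assumes client_name_id is a column of "Swiss_buz_scdr", which the SubPlan filter `(cs.client_name_id)` in the posted plan suggests.

```sql
-- Correlated subquery with a distinct alias (ca) and fully qualified columns,
-- so it is unambiguous that client_name_id comes from the outer table cs.
SELECT (SELECT ca.company_id
        FROM   "Corporate_companyalias" AS ca
        WHERE  ca."Alias_name" = cs.client_name_id) AS client_name
     , cs.start_time
FROM   "Swiss_buz_scdr" AS cs
WHERE  cs.customer_name = 'SSP Root';
```

With the qualification in place, a misspelled or misplaced column raises an error instead of silently binding to a different table.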
CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY r.country, r.destination) AS id
, c.company_id AS client_name
, v.company_id AS vendor_name
, s.c_prefix_id AS c_prefix
, s.v_prefix_id AS v_prefix
, r.country
, r.destination
, s.c_total_calls
, s.v_total_calls
, s.successful_calls
, s.billed_duration
, s.v_billed_amount AS cost
, s.c_billed_amount AS revenue
, s.c_pdd AS pdd
, s.profit
, s.start_time
, s.end_time
, s.switch_name
FROM "Swiss_buz_scdr" s
LEFT JOIN "Corporate_companyalias" c ON c."Alias_name" = s.client_name_id
LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup" r
ON r.client_name = c.company_id AND r.prefix = s.c_prefix_id
LEFT JOIN "Corporate_companyalias" v ON v."Alias_name" = s.vendor_name_id
WHERE s.customer_name = 'SSP Root';
Aside: I would aim for shorter names than "RateManagement_prefix_and_client_wise_destinationgroup". Preferably legal, lower-case names that do not need double-quoting.
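Whether the rewritten view changes partition pruning can be checked by comparing two plans. This is a diagnostic sketch, not a confirmed fix: in the plan posted in the question, the `start_time` filter sits on the Subquery Scan above the WindowAgg node, so it may be that the `ROW_NUMBER()` window function keeps the filter from reaching the partitions. The second statement repeats part of the view body with the filter moved inside, and selects only a few columns for brevity.

```sql
-- 1. Filter applied to the view: check whether the start_time condition
--    appears on the partition scans or only above the WindowAgg node.
EXPLAIN
SELECT *
FROM   "Swiss_client_wise_minutes_and_profit"
WHERE  start_time = '2012-07-22 08:00';

-- 2. Same joins, with the filter inside the query instead of on the view.
--    Rows are numbered after filtering here, so id values differ from the view's.
EXPLAIN
SELECT ROW_NUMBER() OVER (ORDER BY r.country, r.destination) AS id
     , s.start_time
     , s.profit
FROM   "Swiss_buz_scdr" s
LEFT   JOIN "Corporate_companyalias" c ON c."Alias_name" = s.client_name_id
LEFT   JOIN "RateManagement_prefix_and_client_wise_destinationgroup" r
       ON r.client_name = c.company_id AND r.prefix = s.c_prefix_id
WHERE  s.customer_name = 'SSP Root'
AND    s.start_time = '2012-07-22 08:00';
```

If only the second plan skips the non-matching partitions, the window function in the view, rather than the subqueries, would be the more likely cause of the full scan.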