PostgreSQL 9.1 - Querying a view takes a long time

Ali*_*ani 3 postgresql performance view partitioning subquery query-performance

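We are using PostgreSQL 9.1 and have a problem with queries against a VIEW. The situation is as follows:

We have a partitioned table "buz_scdr", and on top of it we built a view, "Swiss_client_wise_minutes_and_profit". The purpose of this VIEW is to join data from several tables (including "buz_scdr") so that it can be queried efficiently. This approach worked well until "buz_scdr" became huge (the total number of records across all partitions grew very large; the table is partitioned by date).

Queries against this VIEW started to take a very long time (roughly 5 to 10 minutes). To find out why, we used the EXPLAIN command to display the execution plan. The query we used is: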

EXPLAIN SELECT * from  "Swiss_client_wise_minutes_and_profit" where start_time = '2012-7-22 08:00';

The result is available on explain.depesz.com and is reproduced below:

   Subquery Scan on "Swiss_client_wise_minutes_and_profit"  (cost=2127919.71..94874537.55 rows=40474 width=677)
   Filter: ("Swiss_client_wise_minutes_and_profit".start_time = '2012-07-22 08:00:00+00'::timestamp with time zone)
   ->  WindowAgg  (cost=2127919.71..94773352.06 rows=8094839 width=148)
         ->  Sort  (cost=2127919.71..2148156.81 rows=8094839 width=148)
               Sort Key: cc.name, rdga.group_id
               ->  Hash Left Join  (cost=1661.50..604234.77 rows=8094839 width=148)
                     Hash Cond: (((cc.company_id)::text = (rdga.company_id)::text) AND ((cs.c_prefix_id)::text = (rdga.dest_id)::text))
                     ->  Hash Left Join  (cost=7.88..460615.39 rows=8094839 width=123)
                           Hash Cond: ((cs.client_name_id)::text = (cc."Alias_name")::text)
                           ->  Append  (cost=0.00..349303.48 rows=8094839 width=111)
                                 ->  Seq Scan on "Swiss_buz_scdr" cs  (cost=0.00..1.06 rows=1 width=610)
                                       Filter: ((customer_name)::text = 'SSP Root'::text)
                                 ->  Seq Scan on scdr_buz__2012_07_11 cs  (cost=0.00..349302.41 rows=8094838 width=111)
                                       Filter: ((customer_name)::text = 'SSP Root'::text)
                           ->  Hash  (cost=5.17..5.17 rows=217 width=24)
                                 ->  Seq Scan on "Corporate_companyalias" cc  (cost=0.00..5.17 rows=217 width=24)
                     ->  Hash  (cost=1334.42..1334.42 rows=21280 width=50)
                           ->  Hash Join  (cost=169.56..1334.42 rows=21280 width=50)
                                 Hash Cond: ((rdga.company_id)::text = (c.name)::text)
                                 ->  Hash Join  (cost=162.68..1034.93 rows=21280 width=50)
                                       Hash Cond: ((rdga.group_id)::text = (rdg.name)::text)
                                       ->  Seq Scan on "RateManagement_destgroupassign" rdga  (cost=0.00..497.35 rows=25935 width=40)
                                       ->  Hash  (cost=123.64..123.64 rows=3123 width=32)
                                             ->  Hash Join  (cost=13.08..123.64 rows=3123 width=32)
                                                   Hash Cond: (rdg.country_id = cc.id)
                                                   ->  Seq Scan on "RateManagement_destinationgroup" rdg  (cost=0.00..65.06 rows=3806 width=26)
                                                   ->  Hash  (cost=7.48..7.48 rows=448 width=14)
                                                         ->  Seq Scan on "Corporate_country" cc  (cost=0.00..7.48 rows=448 width=14)
                                 ->  Hash  (cost=4.17..4.17 rows=217 width=16)
                                       ->  Seq Scan on "Corporate_company" c  (cost=0.00..4.17 rows=217 width=16)
         SubPlan 1
           ->  Seq Scan on "Corporate_companyalias" cc  (cost=0.00..5.71 rows=1 width=12)
                 Filter: (("Alias_name")::text = (cs.client_name_id)::text)
         SubPlan 2
           ->  Seq Scan on "Corporate_companyalias" cc  (cost=0.00..5.71 rows=1 width=12)
                 Filter: (("Alias_name")::text = (cs.vendor_name_id)::text)
(36 rows)

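The EXPLAIN output above shows that our query scans the "buz_scdr" table sequentially, reading 8094838 records in total. The query on the VIEW does not honor the partitioning constraint (date) of "buz_scdr", so the whole table is scanned.

As an experiment, we ran a query directly on the "buz_scdr" table with a WHERE clause on date and time. It honored the partitioning constraint properly and did not scan the whole table. This suggests that queries run directly on the partitioned table work as expected, while the VIEW built on top of it does not.

Is this a general problem with PostgreSQL views, or am I missing something?

EDIT: Here is the DDL of the view "Swiss_client_wise_minutes_and_profit":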
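For reference, the direct-table experiment described above might look roughly like the sketch below. The exact query is not shown in the post, so this assumes the parent table is "Swiss_buz_scdr" (the name used in the view DDL and the plan) and that the partitions carry CHECK constraints on start_time.

```sql
-- Partition pruning in 9.1 relies on constraint exclusion;
-- the setting should be 'partition' (the default) or 'on'.
SHOW constraint_exclusion;

-- Assumption: child tables have CHECK constraints on start_time.
-- If pruning works, the plan lists only the parent and the matching partition.
EXPLAIN
SELECT *
FROM   "Swiss_buz_scdr"
WHERE  customer_name = 'SSP Root'
AND    start_time = '2012-07-22 08:00';
```

Comparing this plan with the plan for the same filter applied to the view shows whether the start_time condition reaches the partitioned table at all.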

CREATE VIEW "Swiss_client_wise_minutes_and_profit" 
    AS SELECT ROW_NUMBER() OVER (ORDER BY rp.country, rp.destination) 
    As id, (SELECT company_id FROM  "Corporate_companyalias" 
    AS cc WHERE cc."Alias_name" = client_name_id) 
    AS client_name, (SELECT company_id FROM  "Corporate_companyalias" 
    AS cc WHERE cc."Alias_name" = vendor_name_id) AS vendor_name, cs.c_prefix_id 
    AS c_prefix, cs.v_prefix_id AS v_prefix, rp.country, rp.destination, cs.c_total_calls, cs.v_total_calls, cs.successful_calls, cs.billed_duration, cs.v_billed_amount AS cost, cs.c_billed_amount 
    AS revenue, cs.c_pdd AS pdd, cs.profit, cs.start_time, cs.end_time, cs.switch_name FROM "Swiss_buz_scdr" 
    AS cs LEFT JOIN "Corporate_companyalias" AS cc ON cs.client_name_id = cc."Alias_name" LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup" 
    AS rp ON rp.client_name = cc.company_id AND rp.prefix = cs.c_prefix_id WHERE cs.customer_name = 'SSP Root';

EDIT 2: Here is a link to the output of the "EXPLAIN ANALYZE" command:

EXPLAIN ANALYZE output on pastebin

Erw*_*ter 5

It helps to format the query properly to see what is going on. I went through your query and found some questionable SQL:


CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY rp.country, rp.destination) AS id
     , (SELECT company_id
        FROM  "Corporate_companyalias" AS cc
        WHERE  cc."Alias_name" = client_name_id) AS client_name
     , (SELECT company_id
        FROM  "Corporate_companyalias" AS cc
        WHERE  cc."Alias_name" = vendor_name_id) AS vendor_name
     , cs.c_prefix_id AS c_prefix
     , cs.v_prefix_id AS v_prefix
     , rp.country
     , rp.destination
     , cs.c_total_calls
     , cs.v_total_calls
     , cs.successful_calls
     , cs.billed_duration
     , cs.v_billed_amount AS cost
     , cs.c_billed_amount AS revenue
     , cs.c_pdd AS pdd
     , cs.profit
     , cs.start_time
     , cs.end_time
     , cs.switch_name
FROM   "Swiss_buz_scdr" AS cs
LEFT   JOIN "Corporate_companyalias" AS cc ON cs.client_name_id = cc."Alias_name"
LEFT   JOIN "RateManagement_prefix_and_client_wise_destinationgroup" AS rp
         ON rp.client_name = cc.company_id AND rp.prefix = cs.c_prefix_id
WHERE  cs.customer_name = 'SSP Root';
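  • Don't use the same table alias cc in both the outer and the inner SELECT. It is not illegal, but it invites confusion.

  • Without table qualification on the references to the outer query, I am not sure where the columns client_name_id and vendor_name_id get bound. One would need to see the table definitions, but I suspect this results in CROSS JOINs, which is probably not what you want and may be the root of the problem.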
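To illustrate both points, here is a minimal sketch of the two subqueries with distinct aliases and fully qualified column references. The aliases a and b are made up for this example. The SubPlan filters in the posted plan reference cs.client_name_id and cs.vendor_name_id, which suggests the columns bind to the outer cs, but the table definitions would be needed to confirm that.

```sql
-- Distinct aliases (a, b) for the inner tables; outer columns qualified with cs.
SELECT (SELECT a.company_id
        FROM   "Corporate_companyalias" AS a
        WHERE  a."Alias_name" = cs.client_name_id) AS client_name
     , (SELECT b.company_id
        FROM   "Corporate_companyalias" AS b
        WHERE  b."Alias_name" = cs.vendor_name_id) AS vendor_name
     , cs.start_time
FROM   "Swiss_buz_scdr" AS cs
WHERE  cs.customer_name = 'SSP Root';
```

With every column qualified, a reader (and the parser) can no longer mistake which table each reference belongs to.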

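I suspect the correlated subqueries can be rewritten as plain expressions. Maybe that takes one more JOIN. Here is my ...

Educated guess at what you actually want: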

CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY r.country, r.destination) AS id
     , c.company_id AS client_name
     , v.company_id AS vendor_name
     , s.c_prefix_id AS c_prefix
     , s.v_prefix_id AS v_prefix
     , r.country
     , r.destination
     , s.c_total_calls
     , s.v_total_calls
     , s.successful_calls
     , s.billed_duration
     , s.v_billed_amount AS cost
     , s.c_billed_amount AS revenue
     , s.c_pdd AS pdd
     , s.profit
     , s.start_time
     , s.end_time
     , s.switch_name
FROM   "Swiss_buz_scdr" s
LEFT   JOIN "Corporate_companyalias" c ON c."Alias_name" = s.client_name_id
LEFT   JOIN "RateManagement_prefix_and_client_wise_destinationgroup" r
         ON r.client_name = c.company_id AND r.prefix = s.c_prefix_id
LEFT   JOIN "Corporate_companyalias" v ON v."Alias_name" = s.vendor_name_id
WHERE  s.customer_name = 'SSP Root';
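After recreating the view, it is worth re-running the original EXPLAIN against it. One further thing to check, which the answer does not claim: the view computes ROW_NUMBER() over the whole result with no PARTITION BY, and PostgreSQL generally cannot push an outer WHERE condition below such a window function, because filtering first would change the row numbers. The posted plan is consistent with this, since the start_time filter sits on the Subquery Scan above the WindowAgg. The sketch below tests that idea with a throwaway view whose name, scdr_no_rownum_test, is hypothetical and which keeps only a few columns for brevity.

```sql
-- 1. Plan for the rewritten view with the original filter.
EXPLAIN
SELECT *
FROM   "Swiss_client_wise_minutes_and_profit"
WHERE  start_time = '2012-07-22 08:00';

-- 2. Hypothetical diagnostic view: same base table and join, no ROW_NUMBER().
CREATE TEMP VIEW scdr_no_rownum_test AS
SELECT s.start_time
     , s.end_time
     , s.profit
     , c.company_id AS client_name
FROM   "Swiss_buz_scdr" s
LEFT   JOIN "Corporate_companyalias" c ON c."Alias_name" = s.client_name_id
WHERE  s.customer_name = 'SSP Root';

-- If the filter now appears on the partition scans, the window function
-- is what keeps it from being pushed down in the full view.
EXPLAIN
SELECT *
FROM   scdr_no_rownum_test
WHERE  start_time = '2012-07-22 08:00';
```

Note that the JOIN form is not a strict equivalent of the scalar subqueries: if "Alias_name" is not unique, the join returns additional rows where the subquery form would raise an error.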

Aside: I would aim for a shorter name than "RateManagement_prefix_and_client_wise_destinationgroup". Preferably a legal, lower-case name that doesn't need double quotes.
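A rename could be sketched as follows. The new name dest_group_by_client is purely illustrative, and the statement assumes the object is a view, as the expanded joins in the posted plan suggest. For a plain table, ALTER TABLE ... RENAME TO would be used instead.

```sql
-- Hypothetical new name: lower case, no double quotes needed afterwards.
ALTER VIEW "RateManagement_prefix_and_client_wise_destinationgroup"
    RENAME TO dest_group_by_client;

-- The object can then be referenced without quoting.
SELECT * FROM dest_group_by_client LIMIT 1;
```

Views defined on top of the renamed object keep working, because PostgreSQL tracks the dependency internally rather than by name, but any application SQL that spells out the old name has to be updated.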