Jan*_*ins 9 postgresql execution-plan postgresql-9.5
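The following join gets very different row estimates when it is run against a single partition than when it is run against the whole table: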
CREATE TABLE m_data.ga_session (
session_id BIGINT NOT NULL,
visitor_id BIGINT NOT NULL,
transaction_id TEXT,
timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
day_id INTEGER NOT NULL,
[...]
device_category TEXT NOT NULL,
[...]
operating_system TEXT
);
For every partition:
CREATE TABLE IF NOT EXISTS m_data.ga_session_20170127 ( CHECK (day_id = 20170127) ) INHERITS (m_data.ga_session);
-- the identifiers are theoretically invalid, but they get truncated to 63 chars and work nevertheless
CREATE INDEX IF NOT EXISTS "ga_session__m_tmp.normalize_device_category(ga_session.device_category)" on m_data.ga_session_20170127 USING btree (m_tmp.normalize_device_category(device_category)) ;
CREATE INDEX IF NOT EXISTS "ga_session__m_tmp.normalize_operating_system(operating_system)" on m_data.ga_session_20170127 USING btree (m_tmp.normalize_operating_system(operating_system)) ;
ANALYZE m_data.ga_session_20170127;
EXPLAIN analyse
SELECT *
FROM m_data.ga_session_20170127 ga_session
JOIN m_dim_next.device ON
device.device_category_name = m_tmp.normalize_device_category(ga_session.device_category)
AND device.operating_system_name = m_tmp.normalize_operating_system(ga_session.operating_system);
The statistics for these indexes on the partition are visible:
SELECT * FROM pg_stats WHERE tablename ilike 'ga_session_20170127%';
schemaname |tablename |attname |inherited |null_frac |avg_width |n_distinct
-----------|----------------------------------------------------------------|---------------------------|----------|------------|----------|-------------
m_data |ga_session_20170127__m_tmp.normalize_device_category(device_cat |normalize_device_category |false |0 |10 |3
m_data |ga_session_20170127__m_tmp.normalize_operating_system(operating |normalize_operating_system |false |0 |7 |14
These statistics on the partition's indexes lead to the following query plan, whose estimate is fine (80146 rows estimated, 77503 actual):
Hash Join (cost=1.95..6103.53 rows=80146 width=262) (actual time=0.121..117.204 rows=77503 loops=1)
Hash Cond: ((COALESCE(initcap(ga_session.device_category), 'Unknown'::text) = device.device_category_name) AND (COALESCE(replace(ga_session.operating_system, '(not set)'::text, 'Unknown'::text), 'Unknown'::text) = device.operating_system_name))
-> Seq Scan on ga_session_20170127 ga_session (cost=0.00..2975.03 rows=77503 width=224) (actual time=0.010..9.203 rows=77503 loops=1)
-> Hash (cost=1.38..1.38 rows=38 width=38) (actual time=0.064..0.064 rows=38 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Seq Scan on device (cost=0.00..1.38 rows=38 width=38) (actual time=0.006..0.019 rows=38 loops=1)
Planning time: 1.460 ms
Execution time: 120.098 ms
What does not work is the join against the whole table, which estimates a completely wrong number of rows (832 estimated, 876237 actual):
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.95..60056.78 rows=832 width=262) (actual time=0.037..1065.778 rows=876237 loops=1)
Hash Cond: ((COALESCE(initcap(ga_session.device_category), 'Unknown'::text) = device.device_category_name) AND (COALESCE(replace(ga_session.operating_system, '(not set)'::text, 'Unknown'::text), 'Unknown'::text) = device.operating_system_name))
-> Append (cost=0.00..33759.37 rows=876238 width=225) (actual time=0.005..132.070 rows=876237 loops=1)
-> Seq Scan on ga_session (cost=0.00..0.00 rows=1 width=319) (actual time=0.000..0.000 rows=0 loops=1)
-> Seq Scan on ga_session_20170125 ga_session_1 (cost=0.00..3648.38 rows=94438 width=226) (actual time=0.005..10.606 rows=94438 loops=1)
-> Seq Scan on ga_session_20170126 ga_session_2 (cost=0.00..3185.81 rows=82581 width=225) (actual time=0.014..8.982 rows=82581 loops=1)
-> Seq Scan on ga_session_20170127 ga_session_3 (cost=0.00..2975.03 rows=77503 width=224) (actual time=0.002..8.797 rows=77503 loops=1)
-> Seq Scan on ga_session_20170128 ga_session_4 (cost=0.00..2936.83 rows=76083 width=225) (actual time=0.003..7.873 rows=76083 loops=1)
-> Seq Scan on ga_session_20170129 ga_session_5 (cost=0.00..3716.18 rows=96618 width=224) (actual time=0.002..9.318 rows=96618 loops=1)
-> Seq Scan on ga_session_20170130 ga_session_6 (cost=0.00..3833.19 rows=99619 width=224) (actual time=0.002..9.453 rows=99619 loops=1)
-> Seq Scan on ga_session_20170131 ga_session_7 (cost=0.00..3488.79 rows=90579 width=225) (actual time=0.002..8.298 rows=90579 loops=1)
-> Seq Scan on ga_session_20170201 ga_session_8 (cost=0.00..3615.58 rows=93958 width=224) (actual time=0.002..9.199 rows=93958 loops=1)
-> Seq Scan on ga_session_20170202 ga_session_9 (cost=0.00..3286.56 rows=85256 width=224) (actual time=0.006..8.021 rows=85256 loops=1)
-> Seq Scan on ga_session_20170203 ga_session_10 (cost=0.00..3073.02 rows=79602 width=225) (actual time=0.002..7.727 rows=79602 loops=1)
-> Hash (cost=1.38..1.38 rows=38 width=38) (actual time=0.016..0.016 rows=38 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 11kB
-> Seq Scan on device (cost=0.00..1.38 rows=38 width=38) (actual time=0.002..0.004 rows=38 loops=1)
Planning time: 1.017 ms
Execution time: 1090.213 ms
When this join feeds into further joins (not shown here), the bad estimate in turn leads to bad join choices (nested loops).
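Before I ran ANALYSE on the partitions again, I got wrong row estimates on the partitions as well, so it seems the query planner does not take the index-based statistics into account when the whole table is used.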
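Is there any way to make the query planner collect these statistics at the parent table level, or to take the individual partitions' statistics into account when building the query plan?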
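Make sure that not only the partitions but also the master table is indexed in the same way and analyzed.
Otherwise the planner may include the index-based estimates for a single partition but ignore them at the master-table level.
If the expression index or its statistics are missing on the master table, the planner cannot infer the join cardinality from this condition, even if it has perfect statistics for the partitions.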
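A minimal sketch of what this suggestion could look like, reusing the table, schema, and function names from the question. The two index names are hypothetical, and whether the planner then uses these statistics for the whole inheritance tree is not confirmed here; the final EXPLAIN is there to check exactly that.

```sql
-- Hypothetical index names; the expressions mirror the per-partition indexes.
CREATE INDEX IF NOT EXISTS ga_session__normalize_device_category
    ON m_data.ga_session
    USING btree (m_tmp.normalize_device_category(device_category));
CREATE INDEX IF NOT EXISTS ga_session__normalize_operating_system
    ON m_data.ga_session
    USING btree (m_tmp.normalize_operating_system(operating_system));

-- Refresh the statistics on the master table.
ANALYZE m_data.ga_session;

-- Inspect which statistics exist; the "inherited" column shows whether
-- a row describes the table alone or the table including its children.
SELECT tablename, attname, inherited, n_distinct
FROM pg_stats
WHERE schemaname = 'm_data'
  AND tablename LIKE 'ga_session%'
ORDER BY tablename, attname, inherited;

-- Re-run the join against the master table and compare estimated vs. actual rows.
EXPLAIN ANALYZE
SELECT *
FROM m_data.ga_session ga_session
JOIN m_dim_next.device ON
    device.device_category_name = m_tmp.normalize_device_category(ga_session.device_category)
    AND device.operating_system_name = m_tmp.normalize_operating_system(ga_session.operating_system);
```

If the estimate on the Hash Join node is still far from the actual row count after this, the master-table index statistics are not being applied to the children, and the guess above does not hold for this setup.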
This is only a guess, since you did not provide the full schema. Please let me know whether it helps.