Why does BETWEEN use a btree index, but the "element is contained by" range operator (<@) does not?

Lon*_*Rob 4 postgresql index execution-plan

I have a table with a column utc of type timestamptz, and a btree index on that column:

CREATE TABLE foo(utc timestamptz);

CREATE INDEX ix_foo_utc ON foo (utc);

The table contains about 500 million rows.

When I filter on utc using BETWEEN, the query planner uses the index as expected:

> EXPLAIN ANALYZE
SELECT
   utc
FROM foo
WHERE
    utc BETWEEN '2020-12-01' AND '2031-02-15'
;

QUERY PLAN
Bitmap Heap Scan on foo  (cost=3048368.34..11836322.22 rows=143671392 width=8) (actual time=12447.905..165576.664 rows=150225530 loops=1)
Recheck Cond: ((utc >= '2020-12-01 00:00:00+00'::timestamp with time zone) AND (utc <= '2031-02-15 00:00:00+00'::timestamp with time zone))
Rows Removed by Index Recheck: 543231
Heap Blocks: exact=43537 lossy=1818365
->  Bitmap Index Scan on ix_foo_utc  (cost=0.00..3012450.49 rows=143671392 width=0) (actual time=12436.236..12436.236 rows=150225530 loops=1)
Index Cond: ((utc >= '2020-12-01 00:00:00+00'::timestamp with time zone) AND (utc <= '2031-02-15 00:00:00+00'::timestamp with time zone))
Planning time: 0.127 ms
Execution time: 172335.517 ms


However, if I run the same query with the range operator, the index is not used:

> EXPLAIN ANALYZE
SELECT
   utc
FROM foo
WHERE
    utc <@ tstzrange('2020-12-01', '2031-02-15')
;

QUERY PLAN
Gather  (cost=1000.00..9552135.30 rows=2556133 width=8) (actual time=0.179..145303.094 rows=150225530 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Parallel Seq Scan on foo  (cost=0.00..9295522.00 rows=1065055 width=8) (actual time=5.321..117837.452 rows=50075177 loops=3)
Filter: (utc <@ '["2020-12-01 00:00:00+00","2031-02-15 00:00:00+00")'::tstzrange)
Rows Removed by Filter: 120333718
Planning time: 0.069 ms
Execution time: 153384.494 ms

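I would have expected the query planner to realize that the two queries do the same thing (although <@ with this range excludes the upper bound, while BETWEEN includes it).

So why are these query plans so different? (Not to mention the question of why the sequential scan query finishes faster?!)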


My Postgres version:

"PostgreSQL 10.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit"

Lau*_*lbe 5

An index can only support operators that belong to its operator class. For a btree index on timestamptz, these are:

SELECT ao.amoplefttype::regtype,
       op.oprname,
       ao.amoprighttype::regtype
FROM pg_opfamily AS of
   JOIN pg_am AS am ON of.opfmethod = am.oid
   JOIN pg_amop AS ao ON of.oid = ao.amopfamily
   JOIN pg_operator AS op ON ao.amopopr = op.oid
WHERE am.amname = 'btree'
  AND ao.amoplefttype = 'timestamptz'::regtype;

       amoplefttype       | oprname |        amoprighttype        
--------------------------+---------+-----------------------------
 timestamp with time zone | <       | date
 timestamp with time zone | <=      | date
 timestamp with time zone | =       | date
 timestamp with time zone | >=      | date
 timestamp with time zone | >       | date
 timestamp with time zone | <       | timestamp without time zone
 timestamp with time zone | <=      | timestamp without time zone
 timestamp with time zone | =       | timestamp without time zone
 timestamp with time zone | >=      | timestamp without time zone
 timestamp with time zone | >       | timestamp without time zone
 timestamp with time zone | <       | timestamp with time zone
 timestamp with time zone | <=      | timestamp with time zone
 timestamp with time zone | =       | timestamp with time zone
 timestamp with time zone | >=      | timestamp with time zone
 timestamp with time zone | >       | timestamp with time zone
(15 rows)

There is no <@ operator among them, so a B-tree index cannot support that operator.
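To see where <@ is supported, the catalog query can be turned around to list the access methods whose operator families contain an operator named <@. This is only a sketch built on the system catalogs pg_amop, pg_am and pg_operator; the rows returned depend on the PostgreSQL version and on which extensions are installed.

```sql
-- List every index access method that has an operator named <@
-- in one of its operator families, with the operand types.
SELECT am.amname,
       ao.amoplefttype::regtype,
       ao.amoprighttype::regtype
FROM pg_amop AS ao
   JOIN pg_am AS am ON ao.amopmethod = am.oid
   JOIN pg_operator AS op ON ao.amopopr = op.oid
WHERE op.oprname = '<@'
ORDER BY am.amname;
```

If no row with amname = 'btree' appears, no btree index on any data type can be used for a <@ condition.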

A GIN index can support <@, but not with a constant on the right-hand side.

You will have to rewrite the query and use BETWEEN.
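A minimal sketch of such a rewrite, assuming the table foo and the index ix_foo_utc from the question. The two-argument form of tstzrange builds a range with an inclusive lower bound and an exclusive upper bound, so explicit >= and < comparisons reproduce its semantics exactly, whereas BETWEEN would also include the upper endpoint.

```sql
-- Same rows as: utc <@ tstzrange('2020-12-01', '2031-02-15')
-- Both >= and < are members of the btree operator family for
-- timestamptz, so the planner can consider ix_foo_utc here.
EXPLAIN
SELECT utc
FROM foo
WHERE utc >= '2020-12-01'
  AND utc <  '2031-02-15';
```

Whether the planner actually chooses the index still depends on its cost estimates; for a predicate that matches a large share of the table, a sequential scan may remain the cheaper plan.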

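Remark: there is no fundamental reason for this; it is simply how indexes work in PostgreSQL. It would even be possible to write an optimizer support function that makes this work, but your use case is so exotic that PostgreSQL does not want to spend optimizer time and development effort on it.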