Postgres not using partial timestamp index on interval queries (e.g., now() - interval '7 days')

Deb*_*ser 2 postgresql timestamp-with-timezone

I have a simple table that stores precipitation readings from online gauges. Here's the table definition:

    CREATE TABLE public.precip
    (
        gauge_id smallint,
        inches numeric(8, 2),
        reading_time timestamp with time zone
    );

    CREATE INDEX idx_precip3_id
        ON public.precip USING btree
        (gauge_id);

    CREATE INDEX idx_precip3_reading_time
        ON public.precip USING btree
        (reading_time);

    CREATE INDEX idx_precip_last_five_days
        ON public.precip USING btree
        (reading_time)
        TABLESPACE pg_default
        WHERE reading_time > '2017-02-26 00:00:00+00'::timestamp with time zone;

It's grown quite large: about 38 million records that go back 18 months. Queries rarely request rows that are more than 7 days old, so I created the partial index on the reading_time field so that Postgres can traverse a much smaller index. But it's not using the partial index on all queries. It does use the partial index on this query:

    explain analyze select * from precip where gauge_id = 208 and reading_time > '2017-02-27'

    Bitmap Heap Scan on precip  (cost=8371.94..12864.51 rows=1169 width=16) (actual time=82.216..162.127 rows=2046 loops=1)
      Recheck Cond: ((gauge_id = 208) AND (reading_time > '2017-02-27 00:00:00+00'::timestamp with time zone))
      ->  BitmapAnd  (cost=8371.94..8371.94 rows=1169 width=0) (actual time=82.183..82.183 rows=0 loops=1)
            ->  Bitmap Index Scan on idx_precip3_id  (cost=0.00..2235.98 rows=119922 width=0) (actual time=20.754..20.754 rows=125601 loops=1)
                  Index Cond: (gauge_id = 208)
            ->  Bitmap Index Scan on idx_precip_last_five_days  (cost=0.00..6135.13 rows=331560 width=0) (actual time=60.099..60.099 rows=520867 loops=1)
    Total runtime: 162.631 ms

But it does not use the partial index on the following query. Instead, it uses the full index on reading_time:

    explain analyze select * from precip where gauge_id = 208 and reading_time > now() - interval '7 days'

    Bitmap Heap Scan on precip  (cost=8460.10..13007.47 rows=1182 width=16) (actual time=154.286..228.752 rows=2067 loops=1)
      Recheck Cond: ((gauge_id = 208) AND (reading_time > (now() - '7 days'::interval)))
      ->  BitmapAnd  (cost=8460.10..8460.10 rows=1182 width=0) (actual time=153.799..153.799 rows=0 loops=1)
            ->  Bitmap Index Scan on idx_precip3_id  (cost=0.00..2235.98 rows=119922 width=0) (actual time=15.852..15.852 rows=125601 loops=1)
                  Index Cond: (gauge_id = 208)
            ->  Bitmap Index Scan on idx_precip3_reading_time  (cost=0.00..6223.28 rows=335295 width=0) (actual time=136.162..136.162 rows=522993 loops=1)
                  Index Cond: (reading_time > (now() - '7 days'::interval))
    Total runtime: 228.647 ms

Note that today is 3/5/2017, so these two queries are essentially requesting the same rows. But it seems like Postgres won't use the partial index unless the timestamp in the where clause is "hard coded". Is the query planner not evaluating now() - interval '7 days' before deciding which index to use? I posted the query plans as suggested by one of the first people to respond.

I've written several functions (stored procedures) that summarize rainfall in the last 6 hours, 12 hours .... 72 hours. They all use the interval approach in the query (e.g., reading_time > now() - interval '7 days'). I don't want to move this code into the application to send Postgres the hard-coded timestamp. That would create a lot of messy PHP code that shouldn't be necessary.

Suggestions on how to encourage Postgres to use the partial index instead? My plan is to redefine the date range on the partial index nightly (drop index --> create index), but that seems a bit silly if Postgres isn't going to use it.
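A minimal sketch of what that nightly rebuild could look like, assuming it runs from a scheduled job. The predicate of a partial index must be a constant, so the cutoff is computed first and inlined as a literal with format(); the 7-day window and the use of date_trunc() are assumptions for illustration, not part of the original setup:

```sql
-- Sketch only: rebuild the partial index with a fresh cutoff.
-- Assumption: a 7-day window truncated to midnight; adjust as needed.
DO $$
DECLARE
    v_cutoff timestamptz := date_trunc('day', now() - interval '7 days');
BEGIN
    DROP INDEX IF EXISTS public.idx_precip_last_five_days;

    -- %L inlines v_cutoff as a quoted literal, so the index predicate
    -- is a plain constant, as in the original definition.
    EXECUTE format(
        'CREATE INDEX idx_precip_last_five_days
             ON public.precip USING btree (reading_time)
          WHERE reading_time > %L::timestamptz',
        v_cutoff
    );
END;
$$;
```

A plain CREATE INDEX blocks writes to the table while it builds, and CREATE INDEX CONCURRENTLY cannot run inside a DO block, so a job that must not block inserts would need to issue the statements separately.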

Thanks,

Alex

poz*_*ozs 7

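Generally speaking, an index can be used when the indexed column is compared to constants (literal values), to calls of functions that are marked at least STABLE (meaning that within a single statement, multiple calls of the function with the same arguments produce the same result), or to combinations of those.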

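now() (which is an alias of current_timestamp) is marked STABLE, and timestamp_mi_interval() (which is the backing function of the operator <timestamp> - <interval>) is marked IMMUTABLE, which is even stricter than STABLE. (now(), current_timestamp and transaction_timestamp() mark the start of the transaction; statement_timestamp() marks the start of the statement and is still STABLE; but clock_timestamp() gives the timestamp as seen on a clock, so it is VOLATILE.)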
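The volatility labels mentioned above can be checked in the system catalog. This is a minimal sketch; timestamptz_mi_interval is not mentioned above and is included only as an assumption that it is the relevant function when the column is timestamp with time zone, as reading_time is here:

```sql
-- provolatile: 'i' = IMMUTABLE, 's' = STABLE, 'v' = VOLATILE
SELECT proname, provolatile
FROM pg_catalog.pg_proc
WHERE proname IN ('now', 'transaction_timestamp', 'statement_timestamp',
                  'clock_timestamp', 'timestamp_mi_interval',
                  'timestamptz_mi_interval')
ORDER BY proname;
```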

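So in theory, WHERE reading_time > now() - interval '7 days' should be able to use an index on the reading_time column. And it does. But since you defined a partial index, the planner needs to prove the following (quoting the PostgreSQL documentation on partial indexes):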

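However, keep in mind that the predicate must match the conditions used in the queries that are supposed to benefit from the index. To be more precise, a partial index can be used in a query only if the system can recognize that the WHERE condition of the query mathematically implies the predicate of the index. PostgreSQL does not have a sophisticated theorem prover that can recognize mathematically equivalent expressions that are written in different forms. (Not only is such a general theorem prover extremely difficult to create, it would probably be too slow to be of any real use.) The system can recognize simple inequality implications, for example "x < 1" implies "x < 2"; otherwise the predicate condition must exactly match part of the query's WHERE condition or the index will not be recognized as usable. Matching takes place at query planning time, not at run time.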

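And that is what is happening with your query, which has and reading_time > now() - interval '7 days'. By the time now() - interval '7 days' is evaluated, planning has already happened, so PostgreSQL cannot prove that the index predicate (reading_time > '2017-02-26 00:00:00+00') will be true. But when you use reading_time > '2017-02-27', it can prove that.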

You can "guide" the planner with a constant value, like this:

    where gauge_id = 208
    and   reading_time > '2017-02-26 00:00:00+00'
    and   reading_time > now() - interval '7 days'

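This way the planner realizes that it can use the partial index, because indexed_col > index_condition and indexed_col > something_else together imply that indexed_col will be greater than (at least) index_condition. It may be greater than something_else too, but that does not matter for deciding whether the index is usable.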
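If hard-coding the index cutoff in every query is impractical, one alternative (a different technique from the one above, offered only as a sketch) is to build the query text inside a PL/pgSQL function and run it with EXECUTE, so that the planner sees the computed cutoff as a literal. The function name precip_total_since, its parameters, and the use of sum(inches) are hypothetical and chosen only for illustration:

```sql
-- Hypothetical helper: total of "inches" for one gauge over a trailing window.
CREATE OR REPLACE FUNCTION precip_total_since(p_gauge_id integer, p_window interval)
RETURNS numeric
LANGUAGE plpgsql
AS $$
DECLARE
    v_total numeric;
BEGIN
    -- %L inlines the cutoff as a quoted literal; the statement is planned
    -- at EXECUTE time, so the planner compares a constant against the
    -- partial index predicate instead of an unevaluated now() expression.
    EXECUTE format(
        'SELECT coalesce(sum(inches), 0)
           FROM public.precip
          WHERE gauge_id = $1
            AND reading_time > %L::timestamptz',
        now() - p_window
    )
    INTO v_total
    USING p_gauge_id;

    RETURN v_total;
END;
$$;

-- Example call:
-- SELECT precip_total_since(208, interval '7 days');
```

The partial index can only be chosen when the inlined cutoff is at or after the cutoff in the index predicate; for older cutoffs the planner has to fall back to the full index on reading_time.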

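I'm not sure whether this is the answer you were looking for, though. IMHO, if you have a really large amount of data (and PostgreSQL 9.5+), a single BRIN index might fit your needs better.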
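A minimal sketch of such a BRIN index, assuming rows are mostly inserted in reading_time order so that physical row order correlates with the timestamp (BRIN is most effective under that condition). The index name is hypothetical:

```sql
-- Requires PostgreSQL 9.5 or later.
-- A BRIN index stores per-block-range summaries, so it stays very small
-- and needs no nightly rebuild.
CREATE INDEX idx_precip_reading_time_brin
    ON public.precip USING brin
    (reading_time);
```

Comparing EXPLAIN ANALYZE output for the interval query before and after creating it would show whether the planner prefers it on this data.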