Postgres functions: when does IMMUTABLE hurt performance?

bra*_*ahn 7 postgresql performance user-defined-functions

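The Postgres docs say:

For best optimization results, you should label your functions with the strictest volatility category that is valid for them.

However, I seem to have an example where this is not the case, and I'd like to understand what is going on. (Background: I'm running postgres 9.2.)

I often need to convert times expressed as integer seconds into dates. I wrote a function to do this: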

CREATE OR REPLACE FUNCTION 
  to_datestamp(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL;

Let's compare its performance against otherwise identical functions with the volatility set to IMMUTABLE and STABLE:

CREATE OR REPLACE FUNCTION 
  to_datestamp_immutable(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL IMMUTABLE;
CREATE OR REPLACE FUNCTION 
  to_datestamp_stable(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL STABLE;

To test this, I'll create a table of 10^6 random integers corresponding to times between 2010-01-01 and 2015-01-01:

CREATE TEMPORARY TABLE random_times AS
  SELECT 1262304000 + round(random() * 157766400) AS time_int 
  FROM generate_series(1, 1000000) x;

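Finally, I'll call the three functions on this table. On my particular box, the original takes ~6 seconds, the immutable version takes ~33 seconds, and the stable version takes ~6 seconds.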

EXPLAIN ANALYZE SELECT to_datestamp(time_int) FROM random_times;

Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8) 
  (actual time=0.150..5493.722 rows=1000000 loops=1)
Total runtime: 6258.827 ms


EXPLAIN ANALYZE SELECT to_datestamp_immutable(time_int) FROM random_times;

Seq Scan on random_times  (cost=0.00..250632.00 rows=946950 width=8) 
  (actual time=0.211..32209.964 rows=1000000 loops=1)
Total runtime: 33060.918 ms


EXPLAIN ANALYZE SELECT to_datestamp_stable(time_int) FROM random_times;
Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8)
  (actual time=0.086..5295.608 rows=1000000 loops=1)
Total runtime: 6063.498 ms

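What's going on here? For example, is postgres spending time caching results when that doesn't actually help, because the arguments passed to the function are unlikely to repeat?

(I'm running postgres 9.2.)

Thanks!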

UPDATE

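Thanks to Craig Ringer, this question has been discussed on the pgsql-performance mailing list. Highlights:

Tom Lane says

[ shrug... ] Using IMMUTABLE to lie about the mutability of a function (in this case, date_trunc) is a bad idea. It's likely to lead to wrong answers, never mind performance issues. In this particular case, I imagine the performance problem comes from having suppressed the option to inline the function body ... but you should be more worried about whether you aren't getting flat-out bogus answers in other cases.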
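
To make the "wrong answers" point concrete, here is a minimal sketch showing that the body of to_datestamp depends on the session's TimeZone setting, so the same input can yield different dates. The epoch value and the two time zone names are chosen purely for illustration; they do not come from the mailing list thread.

```sql
-- 1262304000 is 2010-01-01 00:00:00 UTC.
SET timezone = 'UTC';
SELECT date_trunc('day', to_timestamp(1262304000))::date;  -- 2010-01-01

-- Los Angeles is 8 hours behind UTC in January, so the same instant
-- falls on the previous calendar day there.
SET timezone = 'America/Los_Angeles';
SELECT date_trunc('day', to_timestamp(1262304000))::date;  -- 2009-12-31

RESET timezone;
```

A function whose result changes with a configuration setting is not immutable in the sense the docs describe, which is why an IMMUTABLE label on it can produce stale results, for example in an expression index.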

Pavel Stehule says

If I understand, a used IMMUTABLE flag disables inlining. What you see is SQL eval overflow. My rule is: don't use flags in SQL functions when it is possible.
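
One way to check whether a SQL function was inlined is to look at the Output line of EXPLAIN (VERBOSE). This is only a sketch against the random_times table and the functions defined in the question; the exact plan text may differ between versions.

```sql
-- If the function body was inlined, the Output line shows the expanded
-- expression (the date_trunc(...) and to_timestamp(...) calls).
-- If it was not inlined, the Output line shows a call to the function by name.
EXPLAIN (VERBOSE) SELECT to_datestamp_stable(time_int) FROM random_times;
EXPLAIN (VERBOSE) SELECT to_datestamp_immutable(time_int) FROM random_times;
```

The much higher estimated cost for the immutable version in the plans above (250632.00 versus 20996.62) is at least consistent with the planner charging a per-call cost for a function it could not inline.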

Clo*_*eto 3

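The problem is that to_timestamp returns a timestamp with time zone. If you replace the to_timestamp call with a "manual" calculation that does not involve a time zone, there is no difference in performance: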

create or replace function to_datestamp_stable(
    time_int double precision
) returns date as $$
  select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql stable;

explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..22.50 rows=1000 width=4) (actual time=96.962..433.562 rows=1000000 loops=1)
 Total runtime: 459.531 ms

create or replace function to_datestamp_immutable(
    time_int double precision
) returns date as $$
  select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql immutable;

explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..22.50 rows=1000 width=4) (actual time=94.188..433.492 rows=1000000 loops=1)
 Total runtime: 459.434 ms

The same functions using to_timestamp:

create or replace function to_datestamp_stable(
    time_int double precision
) returns date as $$
  select date_trunc('day', to_timestamp($1))::date;
$$ language sql stable;

explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..20.00 rows=1000 width=4) (actual time=91.924..3059.570 rows=1000000 loops=1)
 Total runtime: 3103.655 ms

create or replace function to_datestamp_immutable(
    time_int double precision
) returns date as $$
  select date_trunc('day', to_timestamp($1))::date;
$$ language sql immutable;

explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..262.50 rows=1000 width=4) (actual time=92.639..20083.920 rows=1000000 loops=1)
 Total runtime: 20149.311 ms
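
The difference between the two variants lines up with the declared volatility of the built-in functions involved. As far as I know, date_trunc on timestamp with time zone is marked stable (its result depends on the TimeZone setting), while the variant for timestamp without time zone is marked immutable. The query below is a sketch using the pg_proc system catalog that lets you confirm this on your own server rather than take it on faith.

```sql
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
SELECT p.oid::regprocedure AS signature,
       p.provolatile
FROM pg_proc p
WHERE p.proname IN ('to_timestamp', 'date_trunc')
ORDER BY p.proname, p.oid;
```

If a function declared IMMUTABLE has a body that calls anything reported as 's' or 'v' here, the label is stricter than the body justifies, and that is the situation in which the planner appears to decline to inline it.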