Why is log(greatest()) so slow?

Question

TmT*_*ron 3 postgresql performance query-performance

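We have some complex queries that are very slow. I managed to reduce them to a simple reproduction. The combination of greatest and log appears to be the cause, but I don't understand why.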

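Here is a complete sql-fiddle example that runs the queries. You can also view the execution plans for the queries (use the link at the bottom of the query results on the sql-fiddle page).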

So here is the slow query:

select count(value)
from (
         SELECT  log(greatest(1e-9, x)) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;

We just generate a series of 20k numbers and apply log(greatest()) to each of them. This query takes about 1.5 seconds.

I thought that computing the logarithm might be what takes so long, but the following query is fast (~5 ms):

select count(value)
from (
         SELECT  log(x) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;

As a test I swapped greatest and log. This is also fast (about 5 ms):

select count(value)
from (
         SELECT  greatest(1e-9, log(x)) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;

The query plans of all 3 queries are the same:

Aggregate (cost=22.51..22.52 rows=1 width=8)
-> Result (cost=0.00..5.01 rows=1000 width=4)

Can anyone explain why the first query is so slow? Maybe someone also knows a workaround?

More details

Slow platforms

I get similar results on all of these (the first query is much slower):

SQL Fiddle using pg 9.6
My local PC: Win10 64bit, pg 11.5 running in Docker
Remote server: Ubuntu 18.04 64bit, pg 11.5 running in Docker
rextester.com
- slow query ~3 sec
- fast queries ~0.5 sec

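Count

When I change count(value) to count(*) or count(1) in the first query, the query is fast.

But this does not help me, because the production query does not even include a count.
Still, I would like to know why it makes a difference in this case (there are no NULLs in the data).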

Answer 1

Dan*_*ité 6

You are calling two different log functions here: log(numeric, numeric) and log(double precision). The first is much slower than the second.
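
One way to see which data type the parser settles on, and therefore which log overload gets picked, is to ask for result types with pg_typeof. This is only a minimal sketch; the column aliases are made-up names for illustration. It relies on PostgreSQL initially typing a constant such as 1e-9 as numeric, which pulls greatest(), and with it log(), onto the numeric path.

```sql
-- Illustrative type check; the column aliases are hypothetical names.
SELECT pg_typeof(1e-9)                      AS bare_literal,   -- numeric
       pg_typeof(greatest(1e-9, 1))         AS slow_argument,  -- numeric
       pg_typeof(greatest(1e-9::float8, 1)) AS fast_argument;  -- double precision
```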

Notice the different function calls in the EXPLAIN (ANALYZE, VERBOSE) output below, run on PostgreSQL 11.5 (Linux Ubuntu):

Slow version:

explain (analyze, verbose) select count(value)
from (
         SELECT  log(greatest(1e-9, x)) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;
                                              QUERY PLAN                                               
-------------------------------------------------------------------------------------------------------
 Aggregate  (cost=25.02..25.03 rows=1 width=8) (actual time=1174.349..1174.349 rows=1 loops=1)
   Output: count(log('10'::numeric, GREATEST(0.000000001, ((generate_series(1, 20000, 1)))::numeric)))
   ->  ProjectSet  (cost=0.00..5.02 rows=1000 width=4) (actual time=0.004..1.310 rows=20000 loops=1)
         Output: generate_series(1, 20000, 1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
 Planning Time: 0.123 ms
 Execution Time: 1174.385 ms

Fast version:

explain (analyze, verbose) select count(value)
from (
         SELECT  log(greatest(1e-9::float, x)) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=25.02..25.03 rows=1 width=8) (actual time=6.693..6.693 rows=1 loops=1)
   Output: count(log(GREATEST('1e-09'::double precision, ((generate_series(1, 20000, 1)))::double precision)))
   ->  ProjectSet  (cost=0.00..5.02 rows=1000 width=4) (actual time=0.004..2.561 rows=20000 loops=1)
         Output: generate_series(1, 20000, 1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
 Planning Time: 0.096 ms
 Execution Time: 6.731 ms

greatest() is not to blame: consider the query that uses just log(x). If you cast x to numeric, it is just as slow with or without greatest().
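
That comparison could be reproduced with a sketch like the following, which drops greatest() altogether and changes only the cast applied to x. No timings are shown because they depend on the machine; based on the plans above, the numeric variant would be expected to be the slow one.

```sql
-- numeric path: log() runs on numeric input, with no greatest() involved
select count(value)
from (
         SELECT  log(x::numeric) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;

-- double precision path: same query, only the cast differs
select count(value)
from (
         SELECT  log(x::double precision) as value
         from (select generate_series(1, 20000, 1) as x) as d
     ) t;
```

If the extra precision of numeric is not needed, casting an argument of greatest() to double precision, as in the fast version above, keeps the whole expression on the faster path.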
