Manually executing a specific remote query with postgres_fdw

Question

zwo*_*wol 6 postgresql foreign-data postgresql-fdw

In 9.4b2, postgres_fdw doesn't know how to "push down" aggregate queries on remote tables, e.g.

> explain verbose select max(col1) from remote_tables.table1;
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 Aggregate  (cost=605587.30..605587.31 rows=1 width=4)
   Output: max(col1)
   ->  Foreign Scan on remote_tables.table1  (cost=100.00..565653.20 rows=15973640 width=4)
         Output: col1, col2, col3
         Remote SQL: SELECT col1 FROM public.table1

Clearly, it would be more efficient to send SELECT max(col1) FROM public.table1 to the remote server and pull just the one row back.

Is there a way to perform this optimization manually? I would be satisfied with something as low-level as (hypothetically speaking)

EXECUTE 'SELECT max(col1) FROM public.table1' ON remote RETURNING (col1 INTEGER);

although of course a higher-level construct would be preferred.

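I'm aware that I can do something like this with dblink, but that would involve rewriting a large body of code that already uses foreign tables, so I would prefer not to.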

EDIT: Here is the query plan for Erwin Brandstetter's suggestion:

=> explain verbose select col1 from remote_tables.table1 
-> order by col1 desc nulls last limit 1;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Limit  (cost=645521.40..645521.40 rows=1 width=4)
   Output: col1
   ->  Sort  (cost=645521.40..685455.50 rows=15973640 width=4)
         Output: col1
         Sort Key: table1.col1
         ->  Foreign Scan on remote_tables.table1  (cost=100.00..565653.20 rows=15973640 width=4)
               Output: col1
               Remote SQL: SELECT col1 FROM public.table1

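This is better in that it only fetches col1, but it still drags 16 million rows over the network, and now it sorts them as well. By contrast, the original query, run directly on the remote server, doesn't even have to scan the table, because the column has an index. (The core query planner isn't smart enough to do the same for the modified query when it is run on the remote server, but that is a minor point.)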

Answer 1

Erw*_*ter 0

Remote query optimization is very basic:

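postgres_fdw attempts to optimize remote queries to reduce the amount of data transferred from foreign servers. This is done by sending query WHERE clauses to the remote server for execution, and by not retrieving table columns that are not needed for the current query. [...]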

As you found out, my first idea, replacing the aggregate with the following, did not improve things much either:

SELECT col1
FROM   public.table1
ORDER  BY col1 DESC NULLS LAST
LIMIT  1;

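Currently (up to and including pg 9.4), only WHERE conditions that use nothing but immutable functions get pushed down. I found this exhaustive thread on pgsql-hackers discussing the state of FDW push-down.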
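
To see which part of a query is actually shipped, the same EXPLAIN VERBOSE technique from the question can be applied to a filtered query. This is only a sketch: the constant 1000 is an arbitrary, hypothetical value, and whether a given condition is pushed down depends on the server version and on the data types and operators involved.

```sql
-- Hypothetical filter on the question's foreign table; the value 1000 is arbitrary.
EXPLAIN VERBOSE
SELECT col1
FROM   remote_tables.table1
WHERE  col1 > 1000;
-- If the condition is shipped, the "Remote SQL:" line of the Foreign Scan node
-- contains a WHERE clause. If it is evaluated locally instead, the node shows
-- a "Filter:" line and the Remote SQL has no WHERE clause.
```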

Your best option seems to be dblink, as you already mentioned.
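
A minimal sketch of what that could look like for the aggregate in the question. It assumes that the dblink extension is available, that the foreign server is named remote (taken from the hypothetical EXECUTE statement in the question), and that a user mapping with working credentials exists for it. dblink can take a foreign server name in place of a connection string; if that does not work in your setup, pass an explicit connection string instead.

```sql
-- Assumption: dblink is not installed yet in this database.
CREATE EXTENSION IF NOT EXISTS dblink;

-- The aggregate runs entirely on the remote server; only one row comes back.
-- 'remote' is the assumed foreign server name, max_col1 is an illustrative alias.
SELECT t.max_col1
FROM   dblink('remote', 'SELECT max(col1) FROM public.table1')
       AS t(max_col1 integer);
```

The column definition list after AS is required because dblink returns a generic record set. Only the queries that need the aggregate have to be rewritten this way; everything else can keep using the foreign table.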
