PostgreSQL - max number of parameters in "IN" clause?

Question

In Postgres, you can specify an IN clause like this:

-- "user" is a reserved word in PostgreSQL, so the table name must be quoted
SELECT * FROM "user" WHERE id IN (1000, 1001, 1002)

Does anyone know the maximum number of parameters you can pass into IN?

Answer 1

Jor*_*nes 76

According to the source code located here, starting at line 850, PostgreSQL does not explicitly limit the number of arguments.

The following is a code comment from line 870:

/*
 * We try to generate a ScalarArrayOpExpr from IN/NOT IN, but this is only
 * possible if the inputs are all scalars (no RowExprs) and there is a
 * suitable array type available.  If not, we fall back to a boolean
 * condition tree with multiple copies of the lefthand expression.
 * Also, any IN-list items that contain Vars are handled as separate
 * boolean conditions, because that gives the planner more scope for
 * optimization on such clauses.
 *
 * First step: transform all the inputs, and detect whether any are
 * RowExprs or contain Vars.
 */

Answer 2

nim*_*mai 48

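This is not really an answer to the question as asked, but it might help others too.

At least I can tell that there is a technical limit of 32767 values (= Short.MAX_VALUE) that can be passed to the PostgreSQL backend when using PostgreSQL's JDBC driver 9.1.

This is a test of "delete from x where id in (... 100k values...)" with the PostgreSQL JDBC driver: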

Caused by: java.io.IOException: Tried to send an out-of-range integer as a 2-byte value: 100000
    at org.postgresql.core.PGStream.SendInteger2(PGStream.java:201)

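The OP asked about the DB engine limit, but I came here searching for the JDBC limit, and this is what I was looking for. So there is a limit, but it is fairly high. (9 upvotes)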
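
One way to stay under the limit reported above is to split the id list into batches and run one statement per batch. The following is a minimal sketch of that idea in Python, not the driver's own behaviour: `chunk_ids` and `delete_statement` are hypothetical helper names, the table name `x` comes from the test above, and actually executing the statements through a driver is left out.

```python
from typing import Iterator, List, Sequence

# Limit reported in the answer above: the parameter count is sent as a
# 2-byte value, so the driver rejected anything above 32767.
MAX_BIND_PARAMS = 32767


def chunk_ids(ids: Sequence[int], size: int = MAX_BIND_PARAMS) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def delete_statement(table: str, n_params: int) -> str:
    """Build DELETE text with n_params '?' placeholders (JDBC style).

    The table name is interpolated as-is, so it must come from trusted code.
    """
    placeholders = ", ".join("?" for _ in range(n_params))
    return f"DELETE FROM {table} WHERE id IN ({placeholders})"


# 100000 ids are split into four batches: 3 full ones and a remainder.
batch_sizes = [len(batch) for batch in chunk_ids(range(100_000))]
print(batch_sizes)
print(delete_statement("x", 3))
```

Each batch would then be bound to its own statement, ideally inside a single transaction so the delete stays atomic.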

Answer 3

小智 34

explain select * from test where id in (values (1), (2));

QUERY PLAN

 Seq Scan on test  (cost=0.00..1.38 rows=2 width=208)
   Filter: (id = ANY ('{1,2}'::bigint[]))

But if we try the second query:

explain select * from test where id = any (values (1), (2));

QUERY PLAN

Hash Semi Join  (cost=0.05..1.45 rows=2 width=208)
       Hash Cond: (test.id = "*VALUES*".column1)
       ->  Seq Scan on test  (cost=0.00..1.30 rows=30 width=208)
       ->  Hash  (cost=0.03..0.03 rows=2 width=4)
             ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)

We can see that Postgres treats the VALUES list as a temporary table (a Values Scan that is hashed) and joins against it.

Answer 4

Pra*_*ran 17

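There is no limit to the number of elements that you pass to an IN clause. If there are more elements, it treats them as an array, and then for each row scanned it checks whether the value is contained in the array. This approach does not scale well. Instead of using an IN clause, try an INNER JOIN with a temporary table. See http://www.xaprb.com/blog/2006/06/28/why-large-in-clauses-are-problematic/ for details. An INNER JOIN scales well because the query optimizer can use a hash join and other optimizations, whereas with an IN clause the optimizer cannot optimize the query. I have noticed a speedup of at least 2x with this change.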

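The link you refer to does not say which DBMS it is talking about. I can confirm that on Oracle DB temporary tables give a massive performance boost over queries that combine `OR` and `IN` clauses, because parsing and planning such queries carries a large overhead. However, I could not confirm the problem on Postgres 9.5; see [this answer](/sf/answers/3341084911/). (2 upvotes)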

Answer 5

blu*_*ubb 12

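As someone more experienced with Oracle DB, I was concerned about this limit too. I ran a performance test for a query with ~10'000 parameters in an IN-list, fetching the prime numbers up to 100'000 from a table holding the first 100'000 integers by actually listing all the prime numbers as query parameters.

My results indicate that you need not worry about overloading the query plan optimizer or getting plans without index usage, since it transforms the query to use = ANY({...}::integer[]), where it can leverage indices as expected: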

-- prepare statement, runs instantaneously:
PREPARE hugeplan (integer, integer, integer, ...) AS
SELECT *
FROM primes
WHERE n IN ($1, $2, $3, ..., $9592);

-- fetch the prime numbers:
EXECUTE hugeplan(2, 3, 5, ..., 99991);

-- EXPLAIN ANALYZE output for the EXECUTE:
"Index Scan using n_idx on primes  (cost=0.42..9750.77 rows=9592 width=5) (actual time=0.024..15.268 rows=9592 loops=1)"
"  Index Cond: (n = ANY ('{2,3,5,7, (...)"
"Execution time: 16.063 ms"

-- setup, should you care:
CREATE TABLE public.primes
(
  n integer NOT NULL,
  prime boolean,
  CONSTRAINT n_idx PRIMARY KEY (n)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE public.primes
  OWNER TO postgres;

INSERT INTO public.primes
SELECT generate_series(1,100000);
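
The elided parameter lists in the PREPARE and EXECUTE statements above are too long to type by hand. As a rough sketch of how they could be generated (this is an assumption about tooling, not what the answer's author actually used), the Python below sieves the primes up to 100000 and builds the statement text. `primes_up_to`, `prepare_statement`, and `execute_statement` are hypothetical helper names; the table `primes`, column `n`, and statement name `hugeplan` come from the listing above.

```python
import math
from typing import List, Sequence


def primes_up_to(limit: int) -> List[int]:
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            count = len(range(n * n, limit + 1, n))
            is_prime[n * n::n] = bytearray(count)
    return [n for n, flag in enumerate(is_prime) if flag]


def prepare_statement(name: str, n_params: int) -> str:
    """Build PREPARE text with n_params integer parameters $1..$n."""
    types = ", ".join(["integer"] * n_params)
    placeholders = ", ".join(f"${i}" for i in range(1, n_params + 1))
    return (f"PREPARE {name} ({types}) AS "
            f"SELECT * FROM primes WHERE n IN ({placeholders});")


def execute_statement(name: str, values: Sequence[int]) -> str:
    """Build EXECUTE text that passes the given integers as arguments."""
    return f"EXECUTE {name}({', '.join(str(v) for v in values)});"


primes = primes_up_to(100_000)
print(len(primes), primes[0], primes[-1])  # 9592 2 99991
print(prepare_statement("hugeplan", 3))
print(execute_statement("hugeplan", primes[:3]))
```

The count of 9592 and the last value 99991 match the `$9592` placeholder and the final EXECUTE argument in the listing.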

Answer 6

And*_*rew 7

Just tried it. The answer is -> out-of-range integer as a 2-byte value: 32768

Answer 7

Pat*_*and 0

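You might want to consider refactoring that query instead of adding an arbitrarily long list of ids... If the ids really do follow the pattern in your example, you could use a range: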

SELECT * FROM "user" WHERE id >= minValue AND id <= maxValue;
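
A range only returns the same rows as the IN-list when the ids have no gaps. The check below is a small Python sketch of that precondition; `contiguous_bounds` is a hypothetical helper name, and the sample ids are the ones from the question.

```python
from typing import Optional, Sequence, Tuple


def contiguous_bounds(ids: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return (min, max) if ids form a gap-free run of integers, else None."""
    unique = set(ids)
    if not unique:
        return None
    lo, hi = min(unique), max(unique)
    # A gap-free run of integers has exactly hi - lo + 1 distinct values.
    return (lo, hi) if hi - lo + 1 == len(unique) else None


print(contiguous_bounds([1000, 1001, 1002]))  # (1000, 1002)
print(contiguous_bounds([1000, 1002]))        # None
```

When the function returns bounds, they can be bound as minValue and maxValue; otherwise the list form is still needed.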

Another option is to add an inner select:

SELECT *
FROM "user"
WHERE id IN (
    SELECT userId
    FROM ForumThreads ft
    WHERE ft.id = X
);

Archived:	16 years, 7 months ago
Views:	82573
Last activity:	6 years, 5 months ago