Check if NULL exists in Postgres array

Mik*_*e T 11 sql arrays postgresql null postgresql-9.1

Similar to this question, how can I find out whether a NULL value exists in an array?

Here are some attempts.

SELECT num, ar, expected,
  ar @> ARRAY[NULL]::int[] AS test1,
  NULL = ANY (ar) AS test2,
  array_to_string(ar, ', ') <> array_to_string(ar, ', ', '(null)') AS test3
FROM (
  SELECT 1 AS num, '{1,2,NULL}'::int[] AS ar, true AS expected
  UNION SELECT 2, '{1,2,3}'::int[], false
) td ORDER BY num;

 num |     ar     | expected | test1 | test2 | test3
-----+------------+----------+-------+-------+-------
   1 | {1,2,NULL} | t        | f     |       | t
   2 | {1,2,3}    | f        | f     |       | f
(2 rows)

Only the trick with array_to_string shows the expected value. Is there a better way to test this?

Erw*_*ter 16

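If you know a single element that can never exist in your arrays, you can use this fast expression in Postgres 9.1 (or any version of Postgres). Say you have an array of positive numbers, so -1 cannot be in it: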

-1 = ANY(ar) IS NULL
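The expression works because of SQL's three-valued logic: `= ANY` returns true on a match, false when there is no match and no NULL element, and NULL when there is no match but at least one element is NULL. A minimal sketch of the possible outcomes, using illustrative literal arrays:

```sql
-- No match, no NULL element
SELECT -1 = ANY('{1,2,3}'::int[]);      -- false

-- No match, but a NULL element makes the result unknown
SELECT -1 = ANY('{1,2,NULL}'::int[]);   -- NULL

-- A match wins over the NULL element, so the shortcut misses this case
SELECT -1 = ANY('{-1,NULL,2}'::int[]);  -- true

-- The array itself is NULL: the result is NULL as well
SELECT -1 = ANY(NULL::int[]);           -- NULL
```

The last two cases are why the shortcut is only safe when the probe value cannot occur and the array itself is not NULL.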

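Related answer with a detailed explanation:

If you cannot be absolutely sure, you can fall back to one of the expensive but safe methods with unnest(). Like: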

(SELECT bool_or(x IS NULL) FROM unnest(ar) x)

Or:

EXISTS (SELECT 1 FROM unnest(ar) x WHERE x IS NULL)

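But you can have fast and safe with a CASE expression. Use an unlikely number and fall back to the safe method if it should exist. You may want to treat the case ar IS NULL separately. See the demo below.

Postgres 9.1 is getting old. Consider upgrading to a current version. In Postgres 9.3 or later you can test with the built-in functions array_remove() or array_replace().

Or you can use array_position() in Postgres 9.5 or later, as @Patrick provided. I added improved variants below.

Demo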

SELECT num, ar, expect
     , -1 = ANY(ar) IS NULL                                   AS t_1   --  50 ms
     , (SELECT bool_or(x IS NULL) FROM unnest(ar) x)          AS t_2   -- 754 ms
     , EXISTS (SELECT 1 FROM unnest(ar) x WHERE x IS NULL)    AS t_3   -- 521 ms
     , CASE -1 = ANY(ar)
         WHEN FALSE THEN FALSE
         WHEN TRUE THEN EXISTS (SELECT 1 FROM unnest(ar) x WHERE x IS NULL)
         ELSE NULLIF(ar IS NOT NULL, FALSE)  -- catch ar IS NULL       --  55 ms
      -- ELSE TRUE  -- simpler for columns defined NOT NULL            --  51 ms
       END                                                    AS t_91
     , array_replace(ar, NULL, 0) <> ar                       AS t_93a --  99 ms
     , array_remove(ar, NULL) <> ar                           AS t_93b --  96 ms
     , cardinality(array_remove(ar, NULL)) <> cardinality(ar) AS t_94  --  81 ms
     , COALESCE(array_position(ar, NULL::int), 0) > 0         AS t_95a --  49 ms
     , array_position(ar, NULL) IS NOT NULL                   AS t_95b --  45 ms
     , CASE WHEN ar IS NOT NULL
            THEN array_position(ar, NULL) IS NOT NULL END     AS t_95c --  48 ms
FROM  (
   VALUES (1, '{1,2,NULL}'::int[], true)     -- extended test case
        , (2, '{-1,NULL,2}'      , true)
        , (3, '{NULL}'           , true)
        , (4, '{1,2,3}'          , false)
        , (5, '{-1,2,3}'         , false)
        , (6, NULL               , null)
   ) t(num, ar, expect);

Result:

 num |     ar      | expect | t_1    | t_2  | t_3 | t_91 | t_93a | t_93b | t_94 | t_95a | t_95b | t_95c
-----+-------------+--------+--------+------+-----+------+-------+-------+------+-------+-------+-------
   1 | {1,2,NULL}  | t      | t      | t    | t   | t    | t     | t     | t    | t     | t     | t
   2 | {-1,NULL,2} | t      | f --!! | t    | t   | t    | t     | t     | t    | t     | t     | t
   3 | {NULL}      | t      | t      | t    | t   | t    | t     | t     | t    | t     | t     | t
   4 | {1,2,3}     | f      | f      | f    | f   | f    | f     | f     | f    | f     | f     | f
   5 | {-1,2,3}    | f      | f      | f    | f   | f    | f     | f     | f    | f     | f     | f
   6 | NULL        | NULL   | t --!! | NULL | f   | NULL | NULL  | NULL  | NULL | f     | f     | NULL

Note that array_remove() and array_position() are not allowed for multi-dimensional arrays. All expressions to the right of t_93a only work for 1-dimensional arrays.
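For multi-dimensional input, the unnest()-based tests remain an option, since unnest() expands every element regardless of dimensions. A minimal sketch with an illustrative 2-dimensional literal:

```sql
-- 2-dimensional test array with one NULL element (illustrative value)
SELECT EXISTS (
   SELECT 1
   FROM   unnest('{{1,2},{NULL,4}}'::int[]) x
   WHERE  x IS NULL
   ) AS has_null;  -- true
```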

More tests in this SQL Fiddle (for Postgres 9.3).

Benchmark setup

The added times are from a benchmark test with 200k rows in Postgres 9.5. This is my setup:

CREATE TEMP TABLE t AS
SELECT row_number() OVER() AS num
     , array_agg(elem) AS ar
     , bool_or(elem IS NULL) AS expected
FROM  (
   SELECT CASE WHEN random() > .95 THEN NULL ELSE g END AS elem  -- 5% NULL VALUES
        , count(*) FILTER (WHERE random() > .8)
                   OVER (ORDER BY g) AS grp  -- avg 5 element per array
   FROM   generate_series (1, 1000000) g  -- increase for big test case
   ) sub
GROUP  BY grp;

Function wrapper

For repeated use, I would create a function in Postgres 9.5 like this:

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
  RETURNS bool LANGUAGE sql IMMUTABLE AS
 'SELECT array_position($1, NULL) IS NOT NULL';

Using a polymorphic input type, this works for any array type, not just int[].

Make it IMMUTABLE to allow performance optimization and index expressions.
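To illustrate the point about indexes: functions used in an index definition must be IMMUTABLE. The following is only a sketch. The table `tbl`, its columns, and the index name are hypothetical, and it assumes f_array_has_null() has been created as shown above:

```sql
-- Hypothetical table, for illustration only
CREATE TABLE tbl (
   tbl_id serial PRIMARY KEY
 , ar     int[]
);

-- Partial index covering only rows whose array contains a NULL element.
-- This is only accepted because the function is declared IMMUTABLE.
CREATE INDEX tbl_ar_has_null_idx ON tbl (tbl_id)
WHERE  f_array_has_null(ar);

-- A query repeating the same predicate is a candidate for the partial index
SELECT tbl_id, ar
FROM   tbl
WHERE  f_array_has_null(ar);
```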

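But don't make it STRICT, which would disable "function inlining" and impair performance.

If you need to catch the case ar IS NULL, instead of making the function STRICT, use: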

CREATE OR REPLACE FUNCTION f_array_has_null (anyarray)
  RETURNS bool LANGUAGE sql IMMUTABLE AS
 'SELECT CASE WHEN $1 IS NOT NULL
              THEN array_position($1, NULL) IS NOT NULL END';

For Postgres 9.1, use the t_91 expression from above. The rest applies unchanged.
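One way this might look for Postgres 9.1 is sketched below. It wraps the t_91 expression from the demo, so it is restricted to int[] (the probe value -1 is a number) and the function name f_int_array_has_null is a hypothetical choice, not part of the original answer:

```sql
-- Sketch for int[] only: the probe value -1 is assumed to be unlikely, not impossible
CREATE OR REPLACE FUNCTION f_int_array_has_null (int[])
  RETURNS bool LANGUAGE sql IMMUTABLE AS
 'SELECT CASE -1 = ANY($1)
           WHEN FALSE THEN FALSE  -- no match and no NULL element
           WHEN TRUE  THEN EXISTS (SELECT 1 FROM unnest($1) x WHERE x IS NULL)  -- safe fallback
           ELSE NULLIF($1 IS NOT NULL, FALSE)  -- NULL input yields NULL
         END';
```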


Closely related: