Check whether empty strings exist in character-type columns

Mar*_*tus 6 postgresql database-design dynamic-sql plpgsql postgresql-9.1

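I have an application that (as part of its logic) trims strings and replaces empty strings with NULL before inserting them into the database. One way to make sure this is enforced would be to write a CHECK constraint on every table that has VARCHAR, TEXT (or similar) columns.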
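For illustration, such a CHECK constraint might look like the sketch below. The table and column names (`chk_demo`, `vc`) are hypothetical and not taken from the question. A NULL value passes the check because the expression evaluates to NULL rather than false.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TEMP TABLE chk_demo (
   id int PRIMARY KEY
 , vc varchar(50) CHECK (vc <> '')   -- rejects '', allows NULL
);

INSERT INTO chk_demo VALUES (1, NULL);   -- accepted
INSERT INTO chk_demo VALUES (2, 'x');    -- accepted
-- INSERT INTO chk_demo VALUES (3, '');  -- would fail: violates the check constraint
```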

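Assuming one cannot or does not want to do that, is there a way to write a simple, generic SQL query (taking table and column names from the database's metadata) that checks whether any text column in the database contains empty strings?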

Erw*_*ter 4

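Function for a single table

Returns all character-type columns of the given table, with a count of empty strings ('') per column and whether the column is defined NOT NULL:
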
    CREATE OR REPLACE FUNCTION f_tbl_empty_status(_tbl regclass)
      RETURNS TABLE (tbl text, col text, empty_ct bigint, not_null bool)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _typ      CONSTANT regtype[] := '{text, bpchar, varchar}';  -- base char types
       _sql      text;
       _col_arr  text[];
       _null_arr bool[];
    BEGIN
       -- Build command
       SELECT INTO _col_arr, _null_arr, _sql
              array_agg(s.col)
            , array_agg(s.attnotnull)
            , '
       SELECT $1
            , unnest($2)
            , unnest(ARRAY [count('
                     || string_agg(s.col, ' = '''' OR NULL), count(')
                                       || ' = '''' OR NULL)])
            , unnest($3)
       FROM   ' || _tbl
       FROM  (
          SELECT quote_ident(attname) AS col, attnotnull
          FROM   pg_attribute
          WHERE  attrelid = _tbl           -- valid, visible, legal table name
          AND    attnum >= 1               -- exclude tableoid & friends
          AND    NOT attisdropped          -- exclude dropped columns
       -- AND    NOT attnotnull            -- include columns defined NOT NULL
          AND    atttypid = ANY(_typ)      -- only character types
          ORDER  BY attnum
          ) AS s;

       -- Debug
       -- RAISE NOTICE '%', _sql;

       -- Execute
       IF _sql IS NULL THEN
          -- do nothing, nothing to return
       ELSE
          RETURN QUERY EXECUTE _sql
          USING  _tbl::text, _col_arr, _null_arr;
       END IF;
    END
    $func$;
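The function leans on two built-ins to keep identifiers safe: the `regclass` parameter type and `quote_ident()`. A quick way to see what they do, using an existing catalog table and the column names from the sample output:

```sql
SELECT quote_ident('txt')                    AS plain_col   -- txt
     , quote_ident('oDD name')               AS odd_col     -- "oDD name"
     , 'pg_catalog.pg_class'::regclass::text AS tbl;        -- pg_class

-- A string that does not resolve to an existing relation raises an error
-- at the cast, before it can be concatenated into the query string:
-- SELECT 'no_such_table; DROP TABLE x'::regclass;
```

`pg_catalog` is implicitly part of the search_path, which is why the text representation comes back without the schema prefix.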

Call:

    SELECT * FROM f_tbl_empty_status('tbl_name'); -- optionally schema-qualified

Returns:

    tbl   | col        | empty_ct | not_null
    ------+------------+----------+---------
    tbl1  | txt        | 0        | f
    tbl1  | vc         | 3        | f
    tbl1  | "oDD name" | 7        | f
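To try this out, a small test setup along the following lines should do. It assumes `f_tbl_empty_status()` has already been created as shown; the table layout and the rows are made up for the demo and do not reproduce the sample output above.

```sql
-- Made-up demo table; column names borrowed from the sample output.
CREATE TEMP TABLE tbl1 (
   id         int PRIMARY KEY        -- not a character type, so it is ignored
 , txt        text
 , vc         varchar(20)
 , "oDD name" text NOT NULL
);

INSERT INTO tbl1 (id, txt, vc, "oDD name") VALUES
   (1, 'foo', '' , '')
 , (2, NULL , '' , 'bar')
 , (3, 'baz', 'x', '');

SELECT * FROM f_tbl_empty_status('tbl1');
-- expected counts: txt = 0, vc = 2, "oDD name" = 2 (with not_null = t)
```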

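Works for Postgres 9.1 or later.

The output table name is schema-qualified if needed, according to the current search_path.

Output table and column names are escaped automatically if needed.

empty_ct is the number of rows in which the column value is an empty string.

not_null reports whether the column is defined NOT NULL (in which case you cannot convert empty strings to NULL!).

The input table name can optionally be schema-qualified; otherwise it is resolved according to the current search_path.

Your role needs the privileges to actually read the given table.

The function is highly optimized: it runs only a single scan on the given table to check all relevant columns.

Should be safe against SQL injection.

Uses parallel unnest() to simplify the otherwise somewhat complex code.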
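The single scan and the parallel unnest() can be seen in isolation by writing out by hand the kind of statement the function generates. The sketch below uses a CTE as a stand-in for a real table and literal arrays in place of the `$1`, `$2`, `$3` parameters. `count(expr OR NULL)` counts only rows where `expr` is true, because `false OR NULL` and `NULL OR NULL` both yield NULL, which `count()` ignores.

```sql
-- Stand-in for a real table: three rows, two character columns.
WITH tbl1(txt, vc) AS (
   VALUES ('a'::text, ''::text)
        , (''       , '')
        , (NULL     , 'x')
   )
SELECT 'tbl1'::text                           AS tbl
     , unnest(ARRAY['txt', 'vc'])             AS col       -- column names
     , unnest(ARRAY[count(txt = '' OR NULL)
                  , count(vc  = '' OR NULL)]) AS empty_ct  -- one aggregate per column
     , unnest(ARRAY[false, false])            AS not_null
FROM   tbl1;
-- two rows: (tbl1, txt, 1, f) and (tbl1, vc, 2, f)
```

All aggregates are computed in one pass over the table; the three unnest() calls then expand the equally long arrays side by side into one row per column.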

You may also be interested in this related answer, which actually replaces empty strings and comes with more explanation.
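As a rough idea of what such a replacement could look like for one column (this is a plain sketch, not the approach of the linked answer; `upd_demo` and `vc` are hypothetical names):

```sql
-- Hypothetical table / column, for illustration only.
CREATE TEMP TABLE upd_demo (id int PRIMARY KEY, vc varchar(20));
INSERT INTO upd_demo VALUES (1, ''), (2, 'x'), (3, NULL), (4, '');

UPDATE upd_demo
SET    vc = NULL
WHERE  vc = '';   -- touches only rows that actually hold ''

SELECT count(*) AS still_empty FROM upd_demo WHERE vc = '';  -- 0
```

For a column reported with not_null = t, the same UPDATE would fail with a not-null violation, which is why the function reports that flag.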

Also match strings consisting only of space characters

As you commented, trim(s.col, ' ') = '' does the job nicely. But here is a shortcut:

    s.col::char = ''

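How?
char is an alias for character(1), the rarely useful blank-padded type. Values are padded with space characters on the right up to the length specifier (1 in this case, but that is irrelevant). Trailing spaces are effectively insignificant for this type, so '' is the same as ' ' or '     '. Voilà. And yes, I tested: it is faster, too.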
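The comparison can be checked with a few literals. One caveat that seems worth verifying against your own data: as far as I can tell from the documented behaviour of explicit casts to character(n), the cast to char truncates the value to one character first, so a string that merely starts with a space also matches. Casting to bpchar without a length modifier appears to avoid that truncation; this variant is my addition, not part of the original answer.

```sql
SELECT ''::text::char     = '' AS empty_string    -- true
     , '   '::text::char  = '' AS only_spaces     -- true
     , 'abc'::text::char  = '' AS letters         -- false
     , E'\t'::text::char  = '' AS only_tab        -- false: a tab is not a space
     , ' x'::text::char   = '' AS leading_space   -- true: ' x' is cut down to ' ' first
     , ' x'::text::bpchar = '' AS via_bpchar;     -- false: no truncation
```

If the application already trims values before inserting, the leading-space case should not occur in practice.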

To find strings consisting only of space characters (not other whitespace!), add the cast to these lines of the function above:

                      || string_agg(s.col, '::char = '''' OR NULL), count(')
                                        || '::char = '''' OR NULL)])

Wrapper function to report on a whole schema

    CREATE OR REPLACE FUNCTION f_schema_empty_status(_sch text DEFAULT 'public')
      RETURNS TABLE (tbl text, col text, empty_ct bigint, not_null bool)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _tbl regclass;
    BEGIN
       FOR _tbl IN
          SELECT c.oid
          FROM   pg_class c
          JOIN   pg_namespace n ON n.oid = c.relnamespace
          WHERE  n.nspname = _sch  -- 'public' by default
       -- AND    c.relname LIKE 'my_pattern%'  -- optionally filter table names
          AND    c.relkind = 'r'  -- regular tables only
          ORDER  BY relname
       LOOP
       -- Debug
       -- RAISE NOTICE 'table: %', _tbl;

          RETURN QUERY
          SELECT * FROM f_tbl_empty_status(_tbl);
       END LOOP;
    END
    $func$;

Call:

    SELECT * FROM f_schema_empty_status();  -- defaults to 'public' without parameter
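Since the wrapper returns an ordinary set of rows, the result can be filtered like any table. A possible variant that lists only the offending columns, assuming both functions from this answer exist and using a hypothetical schema name `my_schema`:

```sql
-- Assumes f_tbl_empty_status() and f_schema_empty_status() have been created.
-- 'my_schema' is a hypothetical schema name.
SELECT tbl, col, empty_ct, not_null
FROM   f_schema_empty_status('my_schema')
WHERE  empty_ct > 0                 -- only columns that contain empty strings
ORDER  BY empty_ct DESC, tbl, col;
```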

Returns:

The same four columns as f_tbl_empty_status() (tbl, col, empty_ct, not_null), with one row per character-type column of every regular table in the schema.

db<>fiddle here
Old sqlfiddle