San*_*nda 93 postgresql grep string-matching
Is it possible to search every column of every table for a particular value in PostgreSQL?
A similar question exists for Oracle.
Mik*_*ll' 117
How about dumping the contents of the database, then using grep?
$ pg_dump --data-only --inserts -U postgres your-db-name > a.tmp
$ grep United a.tmp
INSERT INTO countries VALUES ('US', 'United States');
INSERT INTO countries VALUES ('GB', 'United Kingdom');
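The same utility, pg_dump, can include column names in the output. Just change --inserts to --column-inserts. That way you can search for specific column names, too. But if I were looking for column names, I'd probably dump the schema instead of the data.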
$ pg_dump --data-only --column-inserts -U postgres your-db-name > a.tmp
$ grep country_code a.tmp
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('US', 'United States');
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('GB', 'United Kingdom');
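As a sketch of how the grep step can be refined: -F matches the text literally rather than as a regular expression, -i ignores case, and -n reports the line number inside the dump. The sample file below is a made-up stand-in for pg_dump output (the airlines table is hypothetical), so the commands run without a database.

```shell
#!/bin/sh
# Stand-in for real pg_dump --inserts output; the rows are illustrative only.
dump=$(mktemp)
cat > "$dump" <<'EOF'
INSERT INTO countries VALUES ('US', 'United States');
INSERT INTO countries VALUES ('GB', 'United Kingdom');
INSERT INTO countries VALUES ('FR', 'France');
INSERT INTO airlines VALUES (1, 'United Airlines');
EOF

# -F: fixed string, -i: case-insensitive, -n: show line numbers in the dump.
matches=$(grep -F -i -n 'united' "$dump")
printf '%s\n' "$matches"

# Count matching INSERT statements per table (the third word of each statement).
per_table=$(grep -F -i 'united' "$dump" | awk '{ print $3 }' | sort | uniq -c)
printf '%s\n' "$per_table"

rm -f "$dump"
```

With a real dump, replace the here-document with the a.tmp file produced by pg_dump above.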
Dan*_*ité 59
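Here's a pl/pgsql function that locates records where any column contains a specific value. It takes as arguments the value to search for in text format, an array of table names to search in (defaults to all tables) and an array of schema names (defaults to all schemas).
It returns a table structure with the schema, the table name, the column name and the pseudo-column ctid (the non-durable physical location of the row in the table, see System Columns).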
CREATE OR REPLACE FUNCTION search_columns(
    needle text,
    haystack_tables name[] default '{}',
    haystack_schema name[] default '{}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
  -- Visit every column of every base table that passes the table/schema filters.
  FOR schemaname,tablename,columnname IN
      SELECT c.table_schema,c.table_name,c.column_name
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND (c.table_schema=ANY(haystack_schema) OR haystack_schema='{}')
        AND t.table_type='BASE TABLE'
  LOOP
    -- INTO keeps only the first matching row, so each column is reported at most once.
    EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
       schemaname,
       tablename,
       columnname,
       needle
    ) INTO rowctid;
    IF rowctid is not null THEN
      RETURN NEXT;
    END IF;
  END LOOP;
END;
$$ language plpgsql;
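Edit: this code is for PG 9.1 or newer. Also, you may want the version on github, which is based on the same principle but adds some speed and reporting improvements.
Examples of use in a test database:
Search in all tables (with the default arguments, every schema is searched):
select * from search_columns('foobar');
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | s3        | usename    | (0,11)
 public     | s2        | relname    | (7,29)
 public     | w         | body       | (0,2)
(3 rows)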
Search in a specific table:
select * from search_columns('foobar','{w}');
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | w         | body       | (0,2)
(1 row)
Search in a subset of tables obtained from a select:
select * from search_columns('foobar', array(select table_name::name from information_schema.tables where table_name like 's%'), array['public']);
 schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
 public     | s2        | relname    | (7,29)
 public     | s3        | usename    | (0,11)
(2 rows)
Get a result row from the corresponding base table using its ctid:
select * from public.w where ctid='(0,2)';
 title |  body  |         tsv
-------+--------+---------------------
 toto  | foobar | 'foobar':2 'toto':1
To test against a regular expression instead of strict equality, like grep does, this part of the query:
SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L
may be changed to:
SELECT ctid FROM %I.%I WHERE cast(%I as text) ~ %L
a_h*_*ame 10
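The only tool I know of that can do this is SQL Workbench/J: http://www.sql-workbench.net/
It is a Java/JDBC based tool that offers a special (proprietary) SQL "command" to search through all (or just selected) tables in a database: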
http://www.sql-workbench.eu/manual/wb-commands.html#command-search-data
http://www.sql-workbench.eu/wbgrepdata_png.html
Erw*_*ter 10
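"search every column of every table for a particular value"
This does not define how to match exactly.
Nor does it define what to return exactly.
Assuming that a row matches when its text representation contains the given value, and that the result should be the table name (regclass) and the tuple ID (ctid), because that's simplest.
Here is a dead simple, fast and slightly dirty way: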
CREATE OR REPLACE FUNCTION search_whole_db(_like_pattern text)
  RETURNS TABLE(_tbl regclass, _ctid tid) AS
$func$
BEGIN
   FOR _tbl IN
      SELECT c.oid::regclass
      FROM   pg_class c
      JOIN   pg_namespace n ON n.oid = relnamespace
      WHERE  c.relkind = 'r'                           -- only tables
      AND    n.nspname !~ '^(pg_|information_schema)'  -- exclude system schemas
      ORDER  BY n.nspname, c.relname
   LOOP
      RETURN QUERY EXECUTE format(
         'SELECT $1, ctid FROM %s t WHERE t::text ~~ %L'
       , _tbl, '%' || _like_pattern || '%')
      USING _tbl;
   END LOOP;
END
$func$  LANGUAGE plpgsql;
Call:
SELECT * FROM search_whole_db('mypattern');
Provide the search pattern without the enclosing % signs.
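Why slightly dirty?
If separators and decorators for the row in its text representation can be part of the search pattern, there can be false positives: the column separator (, by default), the parentheses () enclosing the whole row, the double quotes " enclosing some values, and \, which may be added as an escape character. Also, the text representation of some columns may depend on local settings, but that ambiguity is inherent to the question, not to my solution.
Each qualifying row is returned only once, even if it matches multiple times (as opposed to other answers here).
This searches the whole database except for system catalogs, so it will typically take a long time to finish. You might want to restrict it to certain schemas or tables (or even columns) as demonstrated in other answers, or add notices and a progress indicator, also demonstrated in another answer.
The regclass object identifier type is represented as the table name, schema-qualified where necessary to disambiguate according to the current search_path.
What is the ctid? It is the physical location of the row within its table, and it is not durable.
You might want to escape characters with special meaning in the search pattern.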
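For callers that invoke search_whole_db from a shell script, the escaping mentioned above can be done before the pattern reaches the database. The following is a minimal sketch, assuming the default LIKE escape character (backslash); escape_like is a hypothetical helper name, not part of PostgreSQL or psql.

```shell
#!/bin/sh
# escape_like (hypothetical helper): prefix \, % and _ with a backslash so that
# LIKE treats them as literal characters instead of wildcards or escapes.
escape_like() {
    printf '%s' "$1" | sed -e 's/[\\%_]/\\&/g'
}

escaped=$(escape_like '50%_off')
printf '%s\n' "$escaped"
```

The escaped string can then be supplied as the argument to search_whole_db, which still adds the enclosing % wildcards itself.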
小智 5
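In case anyone finds it helpful: here is @Daniel Vérité's function with an extra parameter that accepts the names of the columns to search. This reduces the processing time; at least in my tests it was reduced a lot.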
CREATE OR REPLACE FUNCTION search_columns(
    needle text,
    haystack_columns name[] default '{}',
    haystack_tables name[] default '{}',
    haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
  FOR schemaname,tablename,columnname IN
      SELECT c.table_schema,c.table_name,c.column_name
      FROM information_schema.columns c
        JOIN information_schema.tables t ON
          (t.table_name=c.table_name AND t.table_schema=c.table_schema)
      WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
        AND c.table_schema=ANY(haystack_schema)
        AND (c.column_name=ANY(haystack_columns) OR haystack_columns='{}')
        AND t.table_type='BASE TABLE'
  LOOP
    EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
       schemaname,
       tablename,
       columnname,
       needle
    ) INTO rowctid;
    IF rowctid is not null THEN
      RETURN NEXT;
    END IF;
  END LOOP;
END;
$$ language plpgsql;
Below is an example of how to use the search function created above.
SELECT * FROM search_columns('86192700'
    , array(SELECT DISTINCT a.column_name::name FROM information_schema.columns AS a
            INNER JOIN information_schema.tables as b ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
        WHERE
            a.column_name iLIKE '%cep%'
            AND b.table_type = 'BASE TABLE'
            AND b.table_schema = 'public'
    )
    , array(SELECT b.table_name::name FROM information_schema.columns AS a
            INNER JOIN information_schema.tables as b ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
        WHERE
            a.column_name iLIKE '%cep%'
            AND b.table_type = 'BASE TABLE'
            AND b.table_schema = 'public')
);
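Without storing a new procedure, you can use a code block and execute it to obtain a table of occurrences. You can filter the results by schema, table or column name.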
DO $$
DECLARE
  sql text := 'The constructed select statement';
  rec1 record;
  rec2 record;
BEGIN
  DROP TABLE IF EXISTS _x;
  CREATE TEMPORARY TABLE _x (
    schema_name text,
    table_name text,
    column_name text,
    found text
  );
  FOR rec1 IN
      SELECT table_schema, table_name, column_name
      FROM information_schema.columns
      WHERE table_name <> '_x'
        AND UPPER(column_name) LIKE UPPER('%%')  -- put a column-name pattern between the % signs
        AND table_schema <> 'pg_catalog'
        AND table_schema <> 'information_schema'
        AND data_type IN ('character varying', 'text', 'character', 'char', 'varchar')
  LOOP
    -- format() with %I and %L quotes the identifiers and the search literal safely.
    sql := format('SELECT %I AS "found" FROM %I.%I WHERE UPPER(%I) LIKE UPPER(%L)',
                  rec1."column_name", rec1."table_schema", rec1."table_name",
                  rec1."column_name", '%my_substring_to_find_goes_here%');
    RAISE NOTICE '%', sql;
    BEGIN
      FOR rec2 IN EXECUTE sql LOOP
        INSERT INTO _x VALUES (rec1."table_schema", rec1."table_name", rec1."column_name", rec2."found");
      END LOOP;
    EXCEPTION WHEN OTHERS THEN
      NULL;  -- skip columns that cannot be queried
    END;
  END LOOP;
END; $$;
SELECT * FROM _x;
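There is a way to achieve this without creating a function or using an external tool. Postgres' query_to_xml() function can dynamically run a query inside another query, which makes it possible to search for a text across many tables. This is based on my answer for retrieving the row count of all tables.
To search for the string foo in all tables in a schema, the following can be used: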
with found_rows as (
select format('%I.%I', table_schema, table_name) as table_name,
query_to_xml(format('select to_jsonb(t) as table_row
from %I.%I as t
where t::text like ''%%foo%%'' ', table_schema, table_name),
true, false, '') as table_rows
from information_schema.tables
where table_schema = 'public'
)
select table_name, x.table_row
from found_rows f
left join xmltable('//table/row'
passing table_rows
columns
table_row text path 'table_row') as x on true
Note that the use of xmltable requires Postgres 10 or newer. For older Postgres versions, the same can be done using xpath():
with found_rows as (
select format('%I.%I', table_schema, table_name) as table_name,
query_to_xml(format('select to_jsonb(t) as table_row
from %I.%I as t
where t::text like ''%%foo%%'' ', table_schema, table_name),
true, false, '') as table_rows
from information_schema.tables
where table_schema = 'public'
)
select table_name, r.data as table_row
from found_rows f
cross join unnest(xpath('/table/row/table_row/text()', table_rows)) as r(data)
The common table expression (WITH ...) is only used for convenience. It loops through all tables in the public schema. For each table, the following query is run through the query_to_xml() function:
select to_jsonb(t)
from some_table t
where t::text like '%foo%';
The where clause makes sure that the expensive generation of XML content is only done for rows that contain the search string. The query returns an XML document in which every matching row appears as a <row> element whose <table_row> child holds that row as JSON.
The complete row is converted to jsonb so that the result shows which value belongs to which column. The final result lists the schema-qualified table name next to each matching row, rendered as JSON.