Unable to use the output of Redshift catalog queries

Nic*_*las 5 amazon-redshift

I'm running into all sorts of problems when working with queries against the Redshift catalog tables.

To illustrate, the following works:

select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

And the following works as well:

create view works as 
select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

But the following fails:

create table fails as
select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

with:

[SQL]create table fails as
select "table_name"::text as "table"
from "information_schema"."tables"
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
[Err] ERROR:  Specified types or functions (one per INFO message) not supported on Redshift tables.

At http://docs.aws.amazon.com/redshift/latest/dg/c_join_PG.html I read:

If you write a join query that explicitly or implicitly references a column that has an unsupported data type, the query returns an error.

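Does this mean that in a CREATE TABLE based on a select from the catalog tables (even though I cast the odd field types to text), Redshift is performing joins and other odd things under the hood, and that I therefore cannot do this?

The CREATE TABLE failure is one manifestation of the problem. Another is that I cannot unload a view, or anything else based on a catalog query. For example, the following also fails with an error message similar to the one above.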

unload ('select * from "works"') to 's3://etc'

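So far, it looks like the only way I can work with this data is to issue the query from an external program and have that program manually write the result set back into a table. In other words, it cannot be done from within the database.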
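The external-program workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code actually used: the function name copy_query_result, its parameters, and the destination table are hypothetical, and it assumes a Python DB-API 2.0 driver (for example psycopg2, whose placeholder is %s). sqlite3 is imported only so the sketch can be tried locally without a cluster.

```python
import sqlite3  # local stand-in for trying the sketch; not a Redshift driver


def copy_query_result(src_conn, dst_conn, select_sql, dest_table, columns,
                      placeholder="%s"):
    """Run select_sql on src_conn and insert every returned row into dest_table.

    All names here are hypothetical. dest_table and columns are interpolated
    into the SQL text, so they must come from trusted code, never user input.
    """
    # Step 1: run the catalog query; the result set is pulled into the client.
    src_cur = src_conn.cursor()
    src_cur.execute(select_sql)
    rows = src_cur.fetchall()

    # Step 2: build a parameterised INSERT for the ordinary (non-catalog) table.
    col_list = ", ".join('"{}"'.format(c) for c in columns)
    marks = ", ".join([placeholder] * len(columns))
    insert_sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(
        dest_table, col_list, marks)

    # Step 3: write the rows back and commit.
    dst_cur = dst_conn.cursor()
    dst_cur.executemany(insert_sql, rows)
    dst_conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Local demonstration with SQLite, which uses "?" as its placeholder.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (table_name TEXT)")
    conn.executemany("INSERT INTO src VALUES (?)", [("orders",), ("users",)])
    conn.execute('CREATE TABLE table_list ("table" TEXT)')
    copied = copy_query_result(conn, conn, "SELECT table_name FROM src",
                               "table_list", ["table"], placeholder="?")
    print(copied, "rows copied")
```

Against Redshift, the same function would be called with two connections from the chosen driver (or the same connection twice), the catalog query as select_sql, and a destination table created beforehand.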

Does anyone have another solution?

小智 5

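I ran into a similar problem. I'm not sure about the details of the cause, but I found a workaround.

Instead of looking up values in information_schema, try looking up the relation and attribute names in the pg_catalog tables.

For example, the following query returns the column names of a specific table: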

SELECT attname::text FROM pg_attribute WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = '<your_table_name>') AND attname NOT IN ('insertxid', 'deletexid', 'oid', 'tableoid', 'xmin', 'cmin', 'xmax', 'cmax', 'ctid');

This query can be used in a CREATE TABLE statement:

CREATE TABLE consumer_person_dated_attr_types AS
SELECT attname::text FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class
    WHERE relname = '<your_table>')
  AND attname NOT IN ('insertxid', 'deletexid', 'oid', 'tableoid', 'xmin', 'cmin', 'xmax', 'cmax', 'ctid');

Similarly, the following query creates a table with one column for the table name and another for the schema name:

CREATE TABLE tmp_table_names AS
SELECT relname::text, nspname::text
FROM pg_class c
JOIN pg_namespace n
ON n.oid = c.relnamespace
WHERE nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema');

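Note that the catalog tables expose far more system-level detail than information_schema. For example, every table has internal system columns, and pg_attribute returns them, so if you only want the names of the columns defined in the DDL, you need to exclude the internal system columns. In addition to the standard PostgreSQL system columns (oid, tableoid, xmin, cmin, xmax, cmax, ctid), Redshift also returns deletexid and insertxid, so those should be excluded as well. The same applies to the query for the list of tables: many system schemas are returned.

I suspect this has to do with the data types of the columns. Many columns in information_schema have the data type 'sql_identifier', with JDBC type 'OTHER' (as shown in SQLWorkbenchJ), whereas the corresponding columns in the pg_catalog tables have the data type 'name' and JDBC type 'VARCHAR'.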