use*_*289 2 postgresql count postgresql-9.6
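I have a large table of vendor-supplied data (which I can't change much) with about 315 columns. I suspect that many of the columns are unused (or at least not used consistently).
I would like a query that gives me the count of values for each column in the table.
For example: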
CREATE TABLE foo AS VALUES
( null , 'xyz' , 'pdq' , null ),
( 'abc' , 'def' , 'ghj' , null ),
( 'hsh' , 'fff' , 'oko' , null );
So this would produce something like:
Col1 | 2
Col2 | 3
Col3 | 3
Col4 | 0
Edit: To clarify, I know I can use COUNT, but I was hoping there is a way to have the query loop through the system tables first, so that I don't have to write 315 count statements by hand. Thanks!
Something like:
FOR column_names IN SELECT * FROM information_schema.columns WHERE
table_schema = 'public' AND table_name = 'vendor'
LOOP
RAISE NOTICE 'doing %s', quote_ident(column_names.column_name);
SELECT count(column_names.column_name) from vendor
END LOOP;
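The loop sketched above can be made runnable roughly as follows. This is a minimal PL/pgSQL sketch, assuming the table is public.vendor as in the question; the variable names col and n are illustrative. Two details differ from the pseudocode: RAISE uses a bare % as its placeholder, and the count has to go through dynamic SQL with EXECUTE, because count(column_names.column_name) would count the name string for every row rather than the values in that column.

```sql
DO $$
DECLARE
    col record;   -- one row of information_schema.columns per iteration
    n   bigint;   -- non-null count for the current column
BEGIN
    FOR col IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_schema = 'public'
          AND table_name   = 'vendor'
        ORDER BY ordinal_position
    LOOP
        -- %I quotes each identifier safely; count(col) skips NULLs
        EXECUTE format('SELECT count(%I) FROM %I.%I',
                       col.column_name, 'public', 'vendor')
        INTO n;
        RAISE NOTICE '% | %', col.column_name, n;
    END LOOP;
END
$$;
```

Note that this scans the table once per column, so for 315 columns on a large table a single generated query is likely to be much cheaper.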
You can do the first part easily, like this:
SELECT FORMAT(
    E'SELECT %s\nFROM %I.%I.%I;'          -- query template
    , string_agg(                          -- generate the select list for query template
        FORMAT('count(DISTINCT %I) AS %I', column_name, column_name)
        , E',\n\t'
        ORDER BY ordinal_position          -- keep the table's column order
    ),
    table_catalog,                         -- not strictly required, but future safe
    table_schema,
    table_name
)
FROM information_schema.columns
WHERE table_name = 'foo'
GROUP BY table_catalog, table_schema, table_name;
This returns a query like this:
SELECT count(DISTINCT column1) AS column1,
	count(DISTINCT column2) AS column2,
	count(DISTINCT column3) AS column3,
	count(DISTINCT column4) AS column4
FROM ecarroll.public.foo;
That is almost what you want, except that the result comes back as a single row and needs to be pivoted:
 column1 | column2 | column3 | column4
---------+---------+---------+---------
       2 |       3 |       3 |       0
To do that, we can use unnest(ARRAY[cols]) AS col_name, so essentially we have to generate a query like this:
SELECT FORMAT(
    $$
    SELECT ordinality AS column_number, distinct_values  -- the col#, and count
    FROM (
        SELECT %s          -- This was the query we
        FROM %I.%I.%I      -- used previously
    ) AS t
    CROSS JOIN unnest(ARRAY[%s]) WITH ORDINALITY  -- Here we use unnest(array)
        AS distinct_values;                       -- to pivot the table
    $$,
    string_agg(
        FORMAT('count(DISTINCT %I) AS %I', column_name, column_name)
        , E',\n\t'
        ORDER BY ordinal_position  -- same order in both lists
    ),
    table_catalog,
    table_schema,
    table_name,
    string_agg(FORMAT('%I', column_name), ', ' ORDER BY ordinal_position)  -- quote identifiers here too
)
FROM information_schema.columns
WHERE table_name = 'foo'
GROUP BY table_catalog, table_schema, table_name;
This returns a query like this:
SELECT ordinality AS column_number, distinct_values
FROM (
    SELECT count(DISTINCT column1) AS column1,
	count(DISTINCT column2) AS column2,
	count(DISTINCT column3) AS column3,
	count(DISTINCT column4) AS column4
    FROM ecarroll.public.foo
) AS t
CROSS JOIN unnest(ARRAY[column1, column2, column3, column4]) WITH ORDINALITY
    AS distinct_values;
Then just run \gexec in psql, and you get:
 column_number | distinct_values
---------------+-----------------
             1 |               2
             2 |               3
             3 |               3
             4 |               0
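The result above reports column numbers and distinct-value counts. If what is wanted is the number of non-null values per column, labelled by column name as in the question, a variant of the same generator might look like the sketch below. It is an assumption-laden adaptation rather than part of the original answer: it swaps count(DISTINCT col) for count(col), passes the names and the counts to a two-argument unnest, and the aliases u, column_name and non_null_values are made up for illustration. It also assumes the table lives in the public schema.

```sql
SELECT format(
    $q$
    SELECT u.column_name, u.non_null_values
    FROM (
        SELECT %s
        FROM %I.%I
    ) AS t
    CROSS JOIN unnest(ARRAY[%s]::text[], ARRAY[%s]::bigint[])
        AS u(column_name, non_null_values);
    $q$,
    -- select list: count(col) counts non-null values only
    string_agg(format('count(%I) AS %I', column_name, column_name),
               ', ' ORDER BY ordinal_position),
    table_schema,
    table_name,
    -- column names as string literals, in the same order
    string_agg(format('%L', column_name), ', ' ORDER BY ordinal_position),
    -- the aggregated counts, referenced from the subquery
    string_agg(format('t.%I', column_name), ', ' ORDER BY ordinal_position)
)
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'foo'
GROUP BY table_schema, table_name
\gexec
```

Ending the statement with \gexec instead of a semicolon makes psql run the generated query directly. The table is still scanned only once, since all counts are computed in a single SELECT.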