How to find the size of a database, schema, and table in Redshift

use*_*784 42 amazon-web-services amazon-redshift

Team,

My Redshift version is:

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.735

How can I find the database size, tablespace, schema size, and table size?

The following do not work in Redshift (on the version above):

SELECT pg_database_size('db_name');
SELECT pg_size_pretty( pg_relation_size('table_name') );

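Is there an alternative like Oracle's (from DBA_SEGMENTS)?

For table size I have the query below, but I am not sure how to interpret MBYTES exactly. For the third row, MBYTES = 372. Does that mean 372 MB?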

select trim(pgdb.datname) as Database, trim(pgn.nspname) as Schema,
trim(a.name) as Table, b.mbytes, a.rows
from ( select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name ) as a
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
join pg_database as pgdb on pgdb.oid = a.db_id
join (select tbl, count(*) as mbytes
from stv_blocklist group by tbl) b on a.id=b.tbl
order by a.db_id, a.name;
   database    |    schema    |      table       | mbytes |   rows
---------------+--------------+------------------+--------+----------
      postgres | public       | company          |      8 |        1
      postgres | public       | table_data1_1    |      7 |        1
      postgres | proj_schema1 | table_data1      |    372 | 33867540
      postgres | public       | table_data1_2    |     40 |  2000001

(4 rows)

imc*_*nzl 60

The answers above do not always give the correct figure for the table space used. AWS Support provided this query instead:

SELECT   TRIM(pgdb.datname) AS Database,
         TRIM(a.name) AS Table,
         ((b.mbytes/part.total::decimal)*100)::decimal(5,2) AS pct_of_total,
         b.mbytes,
         b.unsorted_mbytes
FROM     stv_tbl_perm a
JOIN     pg_database AS pgdb
  ON     pgdb.oid = a.db_id
JOIN     ( SELECT   tbl,
                    SUM( DECODE(unsorted, 1, 1, 0)) AS unsorted_mbytes,
                    COUNT(*) AS mbytes
           FROM     stv_blocklist
           GROUP BY tbl ) AS b
       ON a.id = b.tbl
JOIN     ( SELECT SUM(capacity) AS total
           FROM   stv_partitions
           WHERE  part_begin = 0 ) AS part
      ON 1 = 1
WHERE    a.slice = 0
ORDER BY 4 desc, db_id, name;

  • Does this query filter on only one slice? `WHERE a.slice = 0` (4 upvotes)
  • @imcdnzl: What is unsorted_mbytes? When you calculate the total memory, do you need to add mbytes and unsorted_mbytes together? (2 upvotes)

mik*_*pdb 20

Yes, the mbytes value in your example is 372 MB. Here is what I have been using:

select
  cast(use2.usename as varchar(50)) as owner, 
  pgc.oid,
  trim(pgdb.datname) as Database,
  trim(pgn.nspname) as Schema,
  trim(a.name) as Table,
  b.mbytes,
  a.rows
from 
 (select db_id, id, name, sum(rows) as rows
  from stv_tbl_perm a
  group by db_id, id, name
  ) as a
 join pg_class as pgc on pgc.oid = a.id
 left join pg_user use2 on (pgc.relowner = use2.usesysid)
 join pg_namespace as pgn on pgn.oid = pgc.relnamespace 
    and pgn.nspowner > 1
 join pg_database as pgdb on pgdb.oid = a.db_id
 join 
   (select tbl, count(*) as mbytes
    from stv_blocklist
    group by tbl
   ) b on a.id = b.tbl
 order by mbytes desc, a.db_id, a.name; 

  • `pgn.nspowner > 1` filters out the public schema, which may be why @SandipPingle got no rows back either. (2 upvotes)
  • Best answer!! (2 upvotes)
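
The question also asks for schema size, which the per-table query above does not total up. A minimal sketch of a roll-up by schema follows. It assumes, as in the query above, that each row in stv_blocklist counts as one 1 MB block. The aliases db_name, schema_name, and mbytes are only illustrative, and the nspowner filter is left out so that the public schema is included.

```sql
-- Sketch: space used per schema, assuming one stv_blocklist row = one 1 MB block.
-- stv_tbl_perm has one row per table per slice, so DISTINCT collapses it to one row per table.
SELECT TRIM(pgdb.datname) AS db_name,
       TRIM(pgn.nspname)  AS schema_name,
       SUM(b.mbytes)      AS mbytes
FROM (SELECT DISTINCT db_id, id FROM stv_tbl_perm) AS a
JOIN pg_class     AS pgc  ON pgc.oid  = a.id
JOIN pg_namespace AS pgn  ON pgn.oid  = pgc.relnamespace
JOIN pg_database  AS pgdb ON pgdb.oid = a.db_id
JOIN (SELECT tbl, COUNT(*) AS mbytes
      FROM stv_blocklist
      GROUP BY tbl) AS b ON b.tbl = a.id
GROUP BY pgdb.datname, pgn.nspname
ORDER BY mbytes DESC;
```

Because of the join to pg_class, this probably only covers tables in the database you are connected to, the same as the per-table queries in this thread.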

gat*_*ado 11

I am not sure about grouping by database and schema, but here is a short and simple way to get the table sizes:

SELECT tbl, name, size_mb FROM
(
  SELECT tbl, count(*) AS size_mb
  FROM stv_blocklist
  GROUP BY tbl
)
LEFT JOIN
(select distinct id, name FROM stv_tbl_perm)
ON id = tbl
ORDER BY size_mb DESC
LIMIT 10;


ker*_*elp 7

You can check out this repository; I am sure you will find something useful there.

https://github.com/awslabs/amazon-redshift-utils

To answer your question, you can use this view: https://github.com/awslabs/amazon-redshift-utils/blob/master/src/AdminViews/v_space_used_per_tbl.sql

Then query it as needed. For example: select * from admin.v_space_used_per_tbl;


Anonymous 7

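A modified version of one of the other answers. It includes the database name, schema name, table name, total row count, size on disk, and unsorted size: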

-- sort by row count
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
    c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
    from stv_tbl_perm a
    join pg_class as pgtbl on pgtbl.oid = a.id
    join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
    join pg_database as pgdb on pgdb.oid = a.db_id
    join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
    join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
    join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
    where a.slice=0
    order by 4 desc, db_id, name;


-- sort by space used
select trim(pgdb.datname) as Database, trim(pgns.nspname) as Schema, trim(a.name) as Table,
    c.rows, ((b.mbytes/part.total::decimal)*100)::decimal(5,3) as pct_of_total, b.mbytes, b.unsorted_mbytes
    from stv_tbl_perm a
    join pg_class as pgtbl on pgtbl.oid = a.id
    join pg_namespace as pgns on pgns.oid = pgtbl.relnamespace
    join pg_database as pgdb on pgdb.oid = a.db_id
    join (select tbl, sum(decode(unsorted, 1, 1, 0)) as unsorted_mbytes, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
    join (select id, sum(rows) as rows from stv_tbl_perm group by id) c on a.id=c.id
    join (select sum(capacity) as total from stv_partitions where part_begin=0) as part on 1=1
    where a.slice=0
    order by 6 desc, db_id, name;


Vzz*_*arr 7

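SVV_TABLE_INFO is a Redshift system view that shows information about the user-defined tables (not other system tables) in a Redshift database. It is visible only to superusers.

To get the size of each table, run the following on your Redshift cluster: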

SELECT "table", size, tbl_rows 
FROM SVV_TABLE_INFO
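  • The table column is the table name.
  • The size column is the size of the table in MB.
  • The tbl_rows column is the total number of rows in the table, including rows that have been marked for deletion but not yet vacuumed.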

Source

See the Redshift documentation for SVV_TABLE_INFO for other interesting columns you can retrieve from it.
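
Since the question also asks about schema size, here is a minimal sketch that totals the same columns per schema. It assumes size is in MB as described above. The aliases size_mb and total_rows are illustrative. As far as I know, SVV_TABLE_INFO leaves out empty tables, so treat the totals as approximate.

```sql
-- Sketch: per-schema totals from SVV_TABLE_INFO (size assumed to be in MB).
-- "database" and "schema" are quoted, matching the quoting of "table" above.
SELECT "database",
       "schema",
       SUM(size)     AS size_mb,
       SUM(tbl_rows) AS total_rows
FROM svv_table_info
GROUP BY "database", "schema"
ORDER BY size_mb DESC;
```

Dropping "schema" from the SELECT list and the GROUP BY should give a single total for the database instead.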


Anonymous 5

This query is much simpler:

-- List the 30 largest tables in the cluster

SELECT 
 "schema"
,"table"  AS table_name
,ROUND((size/1024.0),2) AS "Size in Gigabytes"
,pct_used AS "Physical Disk Used by This Table"
FROM svv_table_info
ORDER BY pct_used DESC
LIMIT 30;