art*_*hur 6 index postgresql-9.6 index-bloat
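I am trying to understand the impact of deletes on table and index bloat after heavy inserts and deletes. The inserts and deletes follow a fairly strict pattern: records are first inserted sequentially, are rarely (or almost never) updated during the following period (typically five years), and are deleted once they pass that threshold (five years). This is a typical scenario in many enterprise software systems, where records must be retained for compliance reasons.
The database (PostgreSQL 9.6.5) runs with a fairly standard configuration (a few parameters have been increased for faster queries and maintenance processing).
I am trying to simulate and analyze the bloat of the table and its (typical) indexes. Conceptually: (1) create a table, (2) vacuum and analyze it, (3) insert records, (4) vacuum and analyze it again, (5) delete half of the records, (6) vacuum and analyze it, (7) vacuum and analyze it again, (8) insert as many records as were deleted, (9) vacuum and analyze it, (10) vacuum and analyze it again, and (11) check the indexes and the table for bloat.
The complete SQL code is as follows: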
set LC_MESSAGES = 'C';
create extension if not exists "uuid-ossp";
drop table if exists v1;
create table v1 (
id serial primary key,
id_uuid_v1 uuid default uuid_generate_v1(),
id_uuid_v4 uuid default uuid_generate_v4(),
t timestamp with time zone default clock_timestamp(),
name varchar
);
create index ix_v1_uuid on v1 (id_uuid_v1);
create index ix_v4_uuid on v1 (id_uuid_v4);
create index ix_v1_t on v1 (t);
vacuum (verbose, analyze, freeze) v1;
select pg_size_pretty(pg_relation_size('v1')), count(*)
from v1;
-- empty the table
-- truncate v1;
-- generate 100K records
INSERT INTO v1 (name)
SELECT x.id
FROM generate_series(1,100000) AS x(id);
vacuum (verbose, analyze, freeze) v1;
vacuum (verbose, analyze, freeze) v1;
-- delete half of the records
delete from v1
where id <= (select avg(id) from v1);
-- free the pages
vacuum (verbose, analyze, freeze) v1;
vacuum (verbose, analyze, freeze) v1;
--vacuum full v1;
-- insert as many new records as remain in the table (the other half)
with c as (
select count(*) c
from v1
)
INSERT INTO v1 (name)
SELECT generate_series(0,c.c-1)
from c;
-- update statistics
vacuum (verbose, analyze, freeze) v1;
select *
from v1
limit 10;
The bloat reported after this is:
For the table:
schemaname | tblname | bloat_ratio | bloat_size | bloat_size_pretty | real_size | real_size_pretty | ?column?
------------+---------+-------------+------------+-------------------+-----------+------------------+------------------------
public | v1 | 0.2 | 16384 | 16 kB | 8445952 | 8248 kB | VACUUM FULL public.v1;
For the indexes:
schemaname | tblname | idxname | bloat_ratio | bloat_size | bloat_size_pretty | real_size | real_size_pretty | ?column?
------------+---------+------------+-------------+------------+-------------------+-----------+------------------+----------------------------------
public | v1 | ix_v1_uuid | 31.9 | 1466368 | 1432 kB | 4603904 | 4496 kB | REINDEX INDEX public.ix_v1_uuid;
public | v1 | ix_v4_uuid | 29.7 | 1327104 | 1296 kB | 4464640 | 4360 kB | REINDEX INDEX public.ix_v4_uuid;
public | v1 | ix_v1_t | 0.4 | 8192 | 8192 bytes | 2260992 | 2208 kB | REINDEX INDEX public.ix_v1_t;
public | v1 | v1_pkey | 0.4 | 8192 | 8192 bytes | 2260992 | 2208 kB | REINDEX INDEX public.v1_pkey;
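The figures above are estimates derived from statistics. As a cross-check, the B-tree pages can be measured directly. The following is a minimal sketch, assuming the `pgstattuple` contrib extension is available on the server and that the session has the privileges to call it (superuser by default on 9.6); it is not part of the original test script.

```sql
-- Assumption: the pgstattuple contrib module is installed on this server.
create extension if not exists pgstattuple;

-- One row per index on v1, read from the index pages themselves.
-- pgstatindex() only supports B-tree indexes, which all indexes on v1 are.
select i.indexrelid::regclass as index_name,
       s.leaf_pages,
       s.deleted_pages,
       s.avg_leaf_density,
       s.leaf_fragmentation
from pg_index i
cross join lateral pgstatindex(i.indexrelid::regclass) as s
where i.indrelid = 'v1'::regclass
order by index_name::text;
```

A low `avg_leaf_density` means the leaf pages are only partly filled, which is the kind of free space the estimate above reports as bloat.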
My expectation was:
Update (reason for the expectation):
What on earth is going on??!
Additional information from vacuum verbose analyze:
Before anything else
INFO: vacuuming "public.v1"
INFO: index "v1_pkey" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_1701065"
INFO: index "pg_toast_1701065_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701065": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
Query returned successfully with no result in 37 msec.
After the insert
INFO: vacuuming "public.v1"
INFO: index "v1_pkey" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 100000 row versions in 388 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 100000 row versions in 534 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 0 removable, 100000 nonremovable row versions in 1031 out of 1031 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: vacuuming "pg_toast.pg_toast_1701082"
INFO: index "pg_toast_1701082_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701082": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 1031 of 1031 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
Query returned successfully with no result in 179 msec.
After the delete, first vacuum
INFO: vacuuming "public.v1"
INFO: scanned index "v1_pkey" to remove 50000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.00 sec
INFO: scanned index "ix_v1_uuid" to remove 50000 row versions
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec
INFO: scanned index "ix_v4_uuid" to remove 50000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec
INFO: scanned index "ix_v1_t" to remove 50000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec
INFO: "v1": removed 50000 row versions in 516 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec
INFO: index "v1_pkey" now contains 50000 row versions in 276 pages
DETAIL: 50000 index row versions were removed.
136 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 50000 row versions in 388 pages
DETAIL: 50000 index row versions were removed.
191 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 50000 row versions in 534 pages
DETAIL: 50000 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 50000 row versions in 276 pages
DETAIL: 50000 index row versions were removed.
136 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 50000 removable, 142 nonremovable row versions in 517 out of 1031 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.04u sec elapsed 0.05 sec.
INFO: vacuuming "pg_toast.pg_toast_1701082"
INFO: index "pg_toast_1701082_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701082": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 1031 of 1031 pages, containing 50000 live rows and 0 dead rows; 30000 rows in sample, 50000 estimated total rows
Query returned successfully with no result in 116 msec.
Second vacuum
INFO: vacuuming "public.v1"
INFO: index "v1_pkey" now contains 50000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
136 index pages have been deleted, 136 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 50000 row versions in 388 pages
DETAIL: 0 index row versions were removed.
191 index pages have been deleted, 191 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 50000 row versions in 534 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 50000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
136 index pages have been deleted, 136 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 0 removable, 90 nonremovable row versions in 1 out of 1031 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: vacuuming "pg_toast.pg_toast_1701082"
INFO: index "pg_toast_1701082_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701082": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 1031 of 1031 pages, containing 50000 live rows and 0 dead rows; 30000 rows in sample, 50000 estimated total rows
Query returned successfully with no result in 272 msec.
After the second insert, first vacuum
INFO: vacuuming "public.v1"
INFO: index "v1_pkey" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 100000 row versions in 562 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 100000 row versions in 545 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 0 removable, 50142 nonremovable row versions in 517 out of 1031 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_1701082"
INFO: index "pg_toast_1701082_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701082": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 1031 of 1031 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
Query returned successfully with no result in 180 msec.
Second vacuum
INFO: vacuuming "public.v1"
INFO: index "v1_pkey" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_uuid" now contains 100000 row versions in 562 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v4_uuid" now contains 100000 row versions in 545 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_v1_t" now contains 100000 row versions in 276 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "v1": found 0 removable, 90 nonremovable row versions in 1 out of 1031 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_1701082"
INFO: index "pg_toast_1701082_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_1701082": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.v1"
INFO: "v1": scanned 1031 of 1031 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
Query returned successfully with no result in 158 msec.
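To make the change in index size easier to read, the page counts reported in the logs above (after the first insert, and after the delete plus re-insert) can be put side by side. This is only a sketch over numbers copied by hand from that output; the column names are made up for illustration.

```sql
-- Page counts copied from the VACUUM VERBOSE output above:
-- after the first insert vs. after the delete and re-insert.
select idxname,
       pages_first_insert,
       pages_after_reinsert,
       round(100.0 * (pages_after_reinsert - pages_first_insert)
             / pages_first_insert, 1) as growth_pct
from (values
        ('v1_pkey',    276, 276),
        ('ix_v1_uuid', 388, 562),
        ('ix_v4_uuid', 534, 545),
        ('ix_v1_t',    276, 276)
     ) as p(idxname, pages_first_insert, pages_after_reinsert)
order by growth_pct desc;
```

According to the logs, the primary key and timestamp indexes stayed at 276 pages across the whole cycle, while both UUID indexes ended up larger than they were after the first insert.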
Sample data
id | id_uuid_v1 | id_uuid_v4 | t | name
--------+--------------------------------------+--------------------------------------+-------------------------------+------
100001 | 21515a0a-f07c-11e7-9a90-509a4c44f517 | 18632922-249f-439f-9b54-6cb29f8c97a2 | 2018-01-03 12:49:18.368387+01 | 0
100002 | 2152bad0-f07c-11e7-9b55-509a4c44f517 | 0ba51425-7225-416f-a79f-b2db0ed23155 | 2018-01-03 12:49:18.377475+01 | 1
100003 | 2152bad1-f07c-11e7-a7a0-509a4c44f517 | 486fb3ae-204f-4c8a-8171-7089b2c9d946 | 2018-01-03 12:49:18.377699+01 | 2
100004 | 2152ce26-f07c-11e7-8172-509a4c44f517 | 83ec70ea-69bd-45a7-800c-eac558521e19 | 2018-01-03 12:49:18.3779+01 | 3
100005 | 2152ce27-f07c-11e7-800d-509a4c44f517 | cc5f356f-589d-4bcd-b627-b731e496c56d | 2018-01-03 12:49:18.378173+01 | 4
100006 | 2152e1b8-f07c-11e7-b628-509a4c44f517 | 495e4216-76ca-4806-90d8-d0cc1d8ea53f | 2018-01-03 12:49:18.378373+01 | 5
100007 | 2152e1b9-f07c-11e7-90d9-509a4c44f517 | aaa824bf-34e1-4ec9-a589-b86ad77d9df0 | 2018-01-03 12:49:18.37857+01 | 6
100008 | 2152f54a-f07c-11e7-a58a-509a4c44f517 | 0474fd4a-6f2c-4717-a59d-d6c6337eabee | 2018-01-03 12:49:18.378765+01 | 7
100009 | 2152f54b-f07c-11e7-a59e-509a4c44f517 | 5f416afa-7a4f-4692-bbe1-991a99cc8e25 | 2018-01-03 12:49:18.378958+01 | 8
100010 | 2152f54c-f07c-11e7-bbe2-509a4c44f517 | f32e5a17-40b1-4ab0-af41-a3557ff4fb3e | 2018-01-03 12:49:18.379249+01 | 9
Code to check the indexes:
CREATE OR REPLACE VIEW public.bloat_inx AS
SELECT o.schemaname::text AS schemaname,
o.tblname::text AS tblname,
o.idxname::text AS idxname,
round(o.bloat_ratio::numeric, 1) AS bloat_ratio,
o.bloat_size::numeric AS bloat_size,
pg_size_pretty(o.bloat_size::bigint) AS bloat_size_pretty,
o.relpages * 8 * 1024::bigint AS real_size,
pg_size_pretty(o.relpages * 8 * 1024::bigint) AS real_size_pretty,
((('REINDEX INDEX '::text || o.schemaname::text) || '.'::text) || o.idxname::text) || ';'::text
FROM ( SELECT current_database() AS current_database,
sub.nspname AS schemaname,
sub.tblname,
sub.idxname,
sub.bs * (sub.relpages::double precision - sub.est_pages)::bigint::numeric AS extra_size,
100::double precision * (sub.relpages::double precision - sub.est_pages) / sub.relpages::double precision AS extra_ratio,
sub.fillfactor,
sub.bs::double precision * (sub.relpages::double precision - sub.est_pages_ff) AS bloat_size,
100::double precision * (sub.relpages::double precision - sub.est_pages_ff) / sub.relpages::double precision AS bloat_ratio,
sub.relpages,
sub.is_na
FROM ( SELECT COALESCE(1::double precision + ceil(s2.reltuples / floor((s2.bs - s2.pageopqdata::numeric - s2.pagehdr::numeric)::double precision / (4::numeric + s2.nulldatahdrwidth)::double precision)), 0::double precision) AS est_pages,
COALESCE(1::double precision + ceil(s2.reltuples / floor(((s2.bs - s2.pageopqdata::numeric - s2.pagehdr::numeric) * s2.fillfactor::numeric)::double precision / (100::double precision * (4::numeric + s2.nulldatahdrwidth)::double precision))), 0::double precision) AS est_pages_ff,
s2.bs,
s2.nspname,
s2.table_oid,
s2.tblname,
s2.idxname,
s2.relpages,
s2.fillfactor,
s2.is_na
FROM ( SELECT s1.maxalign,
s1.bs,
s1.nspname,
s1.tblname,
s1.idxname,
s1.reltuples,
s1.relpages,
s1.relam,
s1.table_oid,
s1.fillfactor,
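UUID indexes become much more efficient when you use a sequential UUID variant. For some time there have been a number of sequential UUID variants that are more or less compatible with RFC 4122 (such as ULID, KSUID, and XID). There are now also proposals to extend RFC 4122 with UUIDv6, UUIDv7, and UUIDv8, all three of which can be laid out to sort byte-wise by their timestamp portion.
My favorite UUIDv7 implementation is pg_uuidv7. UUIDv7 is based on a Unix timestamp but carries a random block large enough to avoid collisions (in the pg_uuidv7 implementation, this block may be random). If you want a comparison, I recently surveyed the state of sequential UUIDs in PostgreSQL as of Q2 2023.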
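To try this against the test from the question, a time-ordered UUID column can be indexed and loaded the same way. This is a rough sketch, assuming the pg_uuidv7 extension is installed and supported on your server version (it may not be on 9.6); the table `v7_demo` and index `ix_v7_uuid` are hypothetical names chosen for illustration.

```sql
-- Assumption: the pg_uuidv7 extension is installed on the server.
create extension if not exists pg_uuidv7;

-- Hypothetical table mirroring v1, with a time-ordered UUID column.
create table v7_demo (
    id serial primary key,
    id_uuid_v7 uuid default uuid_generate_v7(),
    t timestamp with time zone default clock_timestamp(),
    name varchar
);
create index ix_v7_uuid on v7_demo (id_uuid_v7);

-- Same load pattern as in the question.
insert into v7_demo (name)
select x.id::varchar
from generate_series(1, 100000) as x(id);

vacuum (verbose, analyze) v7_demo;
```

Running the same delete, vacuum, and re-insert cycle on this table and comparing the page counts of `ix_v7_uuid` with those of `ix_v1_uuid` and `ix_v4_uuid` should show whether the ordering helps for this workload.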