zam*_*6ak 2 postgresql performance size disk-space postgresql-performance
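We have a large query which, when a customer runs it "for the first time, early in the morning ..."
So I found pg_prewarm, which I would like to use to load into PG's buffer cache a certain number of the most recently accessed rows (inserted, updated, or deleted) from a few of the tables used in the above query.
Also, I need to make sure the "prewarming" does not exceed PG's cache (which I believe is the shared_buffers setting, or am I wrong?). To prewarm the last 1000 pages of a single table, I can do this: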
SELECT pg_prewarm(
   'mytable',
   -- "pre warm" last 1000 pages
   first_block := (
      SELECT pg_relation_size('mytable') / current_setting('block_size')::int4 - 1000
   )
);
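Question 1: Does this approach make sense?
The catch is that pg_prewarm can only load a given number of pages, so I need to work out "how many live rows there are per page of a given table":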
-- show some settings
SELECT current_setting('block_size')::int4 AS page_size_bytes; -- 8192
SHOW shared_buffers; -- 512 MB
-- https://www.postgresql.org/docs/current/static/pgstattuple.html
--CREATE EXTENSION pgstattuple;
-- find out live row size and live rows per page
SELECT 'mytable' AS table_name
     , pg_size_pretty(tuple_len / tuple_count) AS live_row_size
     , 8192.00 / (tuple_len / tuple_count) AS live_rows_per_page
     , *
FROM   pgstattuple('public.mytable');
--"table_name","live_row_size","live_rows_per_page","table_len","tuple_count","tuple_len","tuple_percent","dead_tuple_count","dead_tuple_len","dead_tuple_percent","free_space","free_percent"
--"studies","1286 bytes",6.3701399688958009,652697600,462123,594431269,91.07,0,0,0,52329672,8.02
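Question 2: Is my query above correct? Is this the right way to get "live rows per page"?
Question #3: The live_row_size I get from the query above differs from the results mentioned in this answer (by Erwin). Am I doing something wrong?
Based on live_rows_per_page I can then modify the pg_prewarm call to load the last XXXX pages, enough to contain 10,000 rows (6.37 x 10,000):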
SELECT pg_prewarm(
   'mytable',
   -- "pre warm" pages of the last 10,000 rows for 'mytable'
   first_block := (
      SELECT pg_relation_size('mytable') / current_setting('block_size')::int4 - 63700
   )
);
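A quick sanity check of the page arithmetic may help here. This is only a sketch: it reuses 'mytable', the 10,000-row target and the 6.37 rows-per-page estimate from above, and the column aliases are made up for illustration. Note that at about 6.37 rows per page, 10,000 rows span roughly 10,000 / 6.37 pages, whereas 6.37 x 10,000 = 63,700 would be most of a table of this size.

```sql
-- Sketch only: compare the pages needed with the table size and the cache size.
SELECT ceil(10000 / 6.37)::int AS pages_for_10000_rows              -- 1570
     , pg_relation_size('mytable')
       / current_setting('block_size')::int AS pages_in_table       -- main fork only
     , (SELECT setting::bigint
        FROM   pg_settings
        WHERE  name = 'shared_buffers') AS shared_buffers_pages;    -- reported in blocks
```

With shared_buffers at 512 MB and 8192-byte pages, the last column should come out as 65536, which gives an upper bound for the total number of pages to prewarm across all tables.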
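Update #1
Regarding question #3, when I run the query I get the following for mytable
... the output of pgstattuple is different, possibly because it does not list the same items, but I'm not sure ...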
"what","bytes/ct","bytes_pretty","bytes_per_row"
"core_relation_size",652697600,"622 MB",1412
"visibility_map",16384,"16 kB",0
"free_space_map",180224,"176 kB",0
"table_size_incl_toast",1101955072,"1051 MB",2384
"indexes_size",508289024,"485 MB",1099
"total_size_incl_toast_and_indexes",1610244096,"1536 MB",3484
"live_rows_in_text_representation",1138946462,"1086 MB",2464
"------------------------------",<NULL>,"<NULL>",<NULL>
"row_count",462123,"<NULL>",<NULL>
"live_tuples",0,"<NULL>",<NULL>
"dead_tuples",3,"<NULL>",<NULL>
"The live_row_size I get from the query above differs from the results mentioned in this answer (by Erwin). Am I doing something wrong?"
You have:
SELECT pg_size_pretty(tuple_len / tuple_count) AS live_row_size FROM pgstattuple('public.mytable');
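tuple_len
... Total length of live tuples in bytes
tuple_count
... Number of live tuples
You compute the average length of live tuples. It seems you expect Postgres to fetch only live tuples when reading data pages into RAM, but that is not how it works. Postgres simply reads whole pages into RAM, including any dead tuples they may contain.
So your next expression does not compute "live rows per page":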
8192.00 / (tuple_len / tuple_count) AS live_rows_per_page
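It computes the "hypothetical maximum of live rows per data page if the page held nothing but live tuples", which is nice to know but otherwise useless for your task.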
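For comparison, here is a minimal sketch of the observed average instead: divide the live tuple count by the number of pages the table actually occupies. It reuses public.mytable from the question and reads the block size from current_setting; the column aliases are made up for illustration, and this is a whole-table average, so the last pages of the table may well look different.

```sql
-- Sketch: average live rows per page as stored on disk, not a theoretical maximum.
-- The page count includes free space, dead tuples, page headers and item pointers.
SELECT table_len / current_setting('block_size')::int AS pages
     , tuple_count AS live_rows
     , round(tuple_count::numeric
             / NULLIF(table_len / current_setting('block_size')::int, 0), 2)
       AS avg_live_rows_per_page
FROM   pgstattuple('public.mytable');
```

With the figures posted in the question (table_len 652697600, tuple_count 462123) this works out to 79675 pages and about 5.8 live rows per page, compared with the 6.37 from the original expression.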
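Why are the sizes different? In the referenced answer, I led with:
"This will demonstrate that various ways of measuring 'row size' lead to very different results. It all depends on what exactly you want to measure."
Your findings seem to confirm that, though I am not sure what exactly you are comparing.
The referenced answer was a bit outdated. I improved some details and added numbers from pgstattuple for comparison. You need to install the module once per database (you obviously have it, but other readers may not):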
CREATE EXTENSION pgstattuple;
Then:
SELECT l.what, l.nr AS "bytes/ct"
     , CASE WHEN is_size THEN pg_size_pretty(nr) END AS bytes_pretty
     , CASE WHEN is_size THEN nr
          / CASE part WHEN 1 THEN NULLIF(x.ct, 0)
                      WHEN 2 THEN NULLIF(st.tuple_count, 0)
                      WHEN 3 THEN NULLIF(st.dead_tuple_count, 0)
                      WHEN 4 THEN NULLIF(st.tuple_count + st.dead_tuple_count, 0) END
       END AS bytes_per_row
FROM  (
   SELECT min(tableoid)        AS tbl      -- same as 'public.tbl'::regclass::oid
        , count(*)             AS ct
        , sum(length(t::text)) AS txt_len  -- length in characters
   FROM   public.big t                     -- provide table name *once*
   ) x
,  LATERAL pgstattuple(tbl) st             -- also get numbers from pgstattuple
,  LATERAL (
   VALUES
     (1, false, 'row_count'                         , ct)
   , (1, false, ' ------ DB_obj_size_func -------'  , NULL)
   , (1, true , 'core_relation_size'                , pg_relation_size(tbl))
   , (1, true , 'visibility_map'                    , pg_relation_size(tbl, 'vm'))
   , (1, true , 'free_space_map'                    , pg_relation_size(tbl, 'fsm'))
   , (1, true , 'table_size_incl_toast'             , pg_table_size(tbl))
   , (1, true , 'indexes_size'                      , pg_indexes_size(tbl))
   , (1, true , 'total_size_incl_toast_and_indexes' , pg_total_relation_size(tbl))
   , (1, true , 'live_rows_in_text_representation'  , txt_len)
   , (2, false, ' ------ pgstattuple ------------'  , NULL)
   , (2, false, 'live_tuples'                       , st.tuple_count)
   , (2, false, 'dead_tuples'                       , st.dead_tuple_count)
   , (2, true , 'total table / live tuples'         , st.table_len)
   , (2, true , 'live table / live tuples'          , st.tuple_len)
   , (3, true , 'dead table / dead tuples'          , st.dead_tuple_len)
   , (4, true , 'total table / all tuples'          , st.table_len)
   ) l(part, is_size, what, nr);
If your query has no side effects, the most efficient way should be to simply execute it, which pulls the relevant data pages into RAM.
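One way to do that and see the effect at the same time, as a rough sketch: wrap the query in EXPLAIN (ANALYZE, BUFFERS), which actually executes the statement and reports buffer usage. The SELECT below is only a placeholder (an assumption, not the query from the question); substitute the real large query.

```sql
-- Sketch: run the real query once, e.g. from a scheduled job before business hours.
-- EXPLAIN (ANALYZE, BUFFERS) executes the statement and shows, per plan node,
-- "shared hit" (pages already in shared_buffers) versus "read" (pages fetched
-- from outside shared_buffers).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)            -- placeholder: put the actual large query here
FROM   public.mytable;
```

Running it a second time right afterwards should show more "hit" and fewer "read" pages if the first run warmed the cache, subject to everything fitting into shared_buffers.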