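PostgreSQL query performance varies with the WHERE clause value

I am running into performance problems with a query, depending on which user_id I use in the WHERE clause.

This question describes a very similar problem, but not an identical one:

PostgreSQL filter/aggregate performance fluctuates depending on the condition value

Here is my query: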

select inlineview.user_search_id, inlineview.user_id, 
       to_char(timezone('UTC', to_timestamp(date_part('epoch', 
       inlineview.last_access_date))),'YYYY-MM-DD"T"HH24:MI:SS.MSZ') 
       last_access_date,
       to_char(timezone('UTC', to_timestamp(date_part('epoch', 
       inlineview.sys_creation_date))),'YYYY-MM-DD"T"HH24:MI:SS.MSZ') 
       sys_creation_date, b.product_id,
       to_char(timezone('UTC', to_timestamp(date_part('epoch', 
       b.last_prov_change))),'YYYY-MM-DD"T"HH24:MI:SS.MSZ') 
      last_prov_change,b.product_type,b.tlc_id,b.prod_history_id,
      coalesce(d.address_id,-1) address_id,coalesce(d.locality_name,'') 
      locality_name,
      coalesce(d.post_code_zone,'') post_code_zone,
      coalesce(d.road_name_concat,'') 
      road_name_concat,coalesce(d.street_num_concat,'') street_num_concat,
      coalesce(d.address_name,'') address_name,
      coalesce(d.town_name,'') town_name
from
    (select a.user_search_id, 
            a.user_id,a.last_access_date,
            a.sys_creation_date,a.product_id
     from linetest.user_search a
    where a.user_id = '818901'
    order by a.last_access_date desc limit 10) inlineview, 
 linetest.prod_history b left outer join linetest.address d on d.tlc_id = b.tlc_id 
 where b.product_id = inlineview.product_id
   and b.prod_history_id = (select c.prod_history_id 
                            from linetest.prod_history c
                            where …

postgresql performance index execution-plan postgresql-performance

brg*_*eek · 2020-01-08 · Score: 2 · Answers: 1 · Views: 1845

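Occasional/intermittent slow (10+ second) UPDATE queries on a PostgreSQL table with a GIN index

Setup

I am running PostgreSQL 9.4.15 on an SSD-based, quad-core virtual private server (VPS) under Debian Linux (8). The table in question has about 2 million records.

Records are inserted frequently and updated even more frequently (constantly, at least once every few seconds). As far as I can tell, I have all the appropriate indexes in place for these operations to run quickly, and the vast majority of the time they do complete immediately (in milliseconds).

Problem

However, every hour or so, one of the UPDATE queries takes an excessive amount of time, 10 seconds or more. When this happens, it usually looks like a "batch" of "blocked" queries that all finish at almost the same moment. It is as if one of the queries, or some other background operation (e.g., vacuum), is holding them up.

Schema

The table, items, has many columns, but I believe the following are the only ones potentially relevant to the problem:

id INTEGER NOT NULL (primary key)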
search_vector TSVECTOR
last_checkup_at TIMESTAMP WITHOUT TIME ZONE

These are the relevant indexes:

items_pkey PRIMARY KEY, btree (id)
items_search_vector_idx gin (search_vector)
items_last_checkup_at_idx btree (last_checkup_at)

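Possible culprits

Finally, after putting together a small script that dumps the contents of pg_stat_activity (the list of all active Postgres connections/queries) whenever a "connection leak" warning appears in my log file, I have narrowed down the possible culprit queries/columns (assuming the problem is not external, such as a misbehaving VPS). Roughly speaking, these queries seem to show up again and again: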

UPDATE items SET last_checkup_at = $1 WHERE items.id = 123245
UPDATE items SET search_vector = [..] WHERE items.id = 78901

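These are slightly paraphrased, but I really doubt anything relevant is missing. Other queries (on other tables) occasionally show up too, but those usually seem to have just been "unlucky" enough to get caught up in the stall.

Now, even the first query (setting last_checkup_at …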

postgresql performance index gin-index postgresql-performance

Chr*_* W. · 2020-01-08 · Score: 2 · Answers: 1 · Views: 1009

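Improving the speed of the COPY command in Postgres

I would like to understand whether increasing work_mem helps speed up the COPY command.

Does COPY make extensive use of work_mem or maint_work_mem?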

postgresql performance copy postgresql-performance

Ram*_*mya · 2020-01-08 · Score: 2 · Answers: 1 · Views: 4984

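Finding a row by primary key when you don't know which table it is in?

I am the (informal) DBA responsible for a proprietary OODBMS. Management wants us to migrate to Postgres to reduce licensing costs. The move should be gradual, so we should also keep the same data structures as in the OODBMS. Fortunately, with support for arrays and table inheritance, we can create exactly the same schema in Postgres. All tables will use a bigint as the primary key, and all of them inherit (indirectly) from the same base table.

The big question: our bigint keys are (and must be) unique across all tables, and we must be able to quickly load a set of rows by primary key without knowing which table they are in. The rows will be spread over any and all tables. The main concern here is speed, not disk or memory usage.

In other words, what we need is a unique index spanning all tables. AFAIK, this is not possible in Postgres. What is the "next best" option? I am open to any solution, including using or even coding some "Postgres extension".

To give some hints about the actual DB: we are talking about 300 tables, 130M rows, and about 300GB in size (the OODBMS size).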
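One conceivable fallback for the requirement above is a central registry table that maps every key to the table that owns it, so a lookup becomes "ask the registry, then query only the tables it names". The sketch below is an illustration under assumptions, not a recommendation from the question: it uses Python's built-in sqlite3 module as a stand-in for Postgres, and the table names (customers, invoices, key_registry) and the fetch_by_ids helper are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical table names; used as an allow-list before building SQL.
KNOWN_TABLES = {"customers", "invoices"}


def build_demo():
    """Create an in-memory database with two data tables and a key registry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL);
        -- Assumption: one registry row per key; its primary key is what
        -- keeps ids unique across all data tables in this sketch.
        CREATE TABLE key_registry (
            id INTEGER PRIMARY KEY,
            table_name TEXT NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (4, 'Grace');
        INSERT INTO invoices VALUES (2, 10.5), (3, 99.0);
        INSERT INTO key_registry VALUES
            (1, 'customers'), (2, 'invoices'),
            (3, 'invoices'), (4, 'customers');
        """
    )
    return conn


def fetch_by_ids(conn, ids):
    """Return {id: (table_name, row)} for every id found in the registry."""
    ids = list(ids)
    if not ids:
        return {}
    marks = ",".join("?" * len(ids))
    by_table = defaultdict(list)
    # Step 1: a single indexed lookup tells us which table owns each key.
    for row_id, table in conn.execute(
        f"SELECT id, table_name FROM key_registry WHERE id IN ({marks})", ids
    ):
        by_table[table].append(row_id)
    found = {}
    # Step 2: one query per table that actually holds requested keys.
    for table, table_ids in by_table.items():
        if table not in KNOWN_TABLES:
            raise ValueError(f"unexpected table name: {table!r}")
        marks = ",".join("?" * len(table_ids))
        for row in conn.execute(
            f"SELECT * FROM {table} WHERE id IN ({marks})", table_ids
        ):
            found[row[0]] = (table, row)  # id is the first column here
    return found


if __name__ == "__main__":
    print(fetch_by_ids(build_demo(), [1, 3, 99]))
```

In a real Postgres deployment the registry would have to be kept in sync with the data tables (by the application or by triggers, for example), which costs extra writes; whether that trade-off is acceptable for 300 tables and 130M rows is something only measurement can answer.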

performance postgresql-performance

Seb*_*iot · 2020-01-08 · Score: 2 · Answers: 1 · Views: 294

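Does PostgreSQL create a new tuple for a no-change UPDATE statement?

I need to fix some data problems in a PostgreSQL database, mostly trimming whitespace from text columns. I am doing this with statements like the following: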

UPDATE app.products
SET "description" = TRIM(BOTH FROM "description")

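Issuing this on a test database returns UPDATE 2000. The test table contains 2000 rows, of which only 1 would actually change as a result of the whitespace trim. Does this cause Postgres to create new, unnecessary tuples? Would adding a WHERE clause be beneficial? Most of the target columns are not indexed.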

UPDATE app.products
SET "description" = TRIM(BOTH FROM "description")
WHERE "description" LIKE ' %' OR "description" LIKE '% '
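The difference between the two statements can be reproduced in miniature. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module as a stand-in, with a hypothetical products table of 1999 clean rows and 1 padded row. It only shows how many rows each statement reports as updated; it says nothing about how PostgreSQL stores tuples internally.

```python
import sqlite3

# The two statements from the question, without the schema prefix.
UNFILTERED = "UPDATE products SET description = TRIM(description)"
FILTERED = (
    "UPDATE products SET description = TRIM(description) "
    "WHERE description LIKE ' %' OR description LIKE '% '"
)


def rows_touched(statement, descriptions):
    """Load descriptions into a fresh table and return the UPDATE row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO products (description) VALUES (?)",
        [(d,) for d in descriptions],
    )
    touched = conn.execute(statement).rowcount
    conn.close()
    return touched


if __name__ == "__main__":
    # Hypothetical data mirroring the question: 2000 rows, 1 needs trimming.
    data = ["widget"] * 1999 + ["  padded  "]
    print("unfiltered:", rows_touched(UNFILTERED, data))
    print("filtered:  ", rows_touched(FILTERED, data))
```

The unfiltered statement reports every row because every row matches, whether or not its value changes; the filtered one reports only the row that has leading or trailing spaces.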

postgresql postgresql-performance

Gri*_*cey · lucky-day · Score: 2 · Answers: 1 · Views: 124

PostgreSQL 10: Bitmap Heap Scan with exact heap blocks

I have the following query:

select ro.*
from courier c1
    join courier c2 on c2.real_physical_courier_1c_id = c1.real_physical_courier_1c_id
    join restaurant_order ro on ro.courier_id = c2.id
    left join jsonb_array_elements(items) jae on true
    left join jsonb_array_elements(jae->'options') ji on true
    inner join catalogue c on c.id in ((jae->'id')::int, (ji->'id')::int)
    join restaurant r on r.id = ro.restaurant_id
where c1.id = '7b35cdab-b423-472a-bde1-d6699f6cefd3' and ro.status in (70, 73)
group by ro.order_id, r.id ;

Below is the part of the query plan that takes about 95% of the time:

->  Parallel Bitmap Heap Scan on restaurant_order ro  (cost=23.87..2357.58 rows=1244 width=1257) (actual time=11.931..38.163 rows=98 loops=2)" …

postgresql execution-plan postgresql-performance

Vad*_*hin · lucky-day · Score: 2 · Answers: 1 · Views: 973

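What happens to dirty data if a PostgreSQL database or server shuts down abnormally?

Suppose my PostgreSQL instance suddenly stops abnormally while there is dirty data that has not been saved in RAM or in the WAL. What happens in that case?

Where can I recover this data from? If the server also goes down, what happens to the dirty data in that case?

Is there any separate file that stores dirty data for recovery?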

postgresql database-design postgresql-performance postgresql-13 postgresql-14

She*_*sib · 2022-07-07 · Score: 2 · Answers: 1 · Views: 274

Improving query performance for a filtered LEFT OUTER JOIN on large tables

I am trying to optimize a query that joins two large tables (40MM+ rows) in PostgreSQL 15.4.

SELECT files.id, ARRAY_AGG(b.status)
FROM files
LEFT OUTER JOIN processing_tasks b
    ON (files.id = b.file_id AND b.job_id = 113)
WHERE files.round_id = 591
GROUP BY files.id;

Two explain (analyze) plans for the exact same query are at:

https://explain.depesz.com/s/cUXB takes 87 seconds and uses a Parallel Seq Scan for processing_tasks.job_id (the default plan)
https://explain.depesz.com/s/j39G takes 4 seconds and uses a Bitmap Index Scan on processing_tasks.job_id (with set local enable_seqscan = OFF)

In files, 908,275 / 39,000,105 (2.3%) of the tuples have round_id=591; this is static.
In processing_tasks, 4,026,364 / 60,780,802 (6.6%) of the tuples have job_id=113, and as rows are inserted this value will become increasingly common, possibly reaching … of the table
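The two percentages quoted above follow directly from the tuple counts. A minimal check of that arithmetic, where the selectivity helper is a hypothetical name introduced only for this illustration:

```python
def selectivity(matching: int, total: int) -> float:
    """Fraction of a table's tuples that satisfy a filter."""
    return matching / total


# Tuple counts as quoted in the question.
files_share = selectivity(908_275, 39_000_105)    # round_id = 591
tasks_share = selectivity(4_026_364, 60_780_802)  # job_id = 113

if __name__ == "__main__":
    print(f"files.round_id = 591: {files_share:.1%}")
    print(f"processing_tasks.job_id = 113: {tasks_share:.1%}")
```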

postgresql optimization query-performance postgresql-performance postgresql-15

Jef*_*f G · 2023-12-08 · Score: 2 · Answers: 1 · Views: 143

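Optimizing a Postgres query

I have a one-to-one relationship from users to an addresses table. A user can have one search address and one verified address.

I have two indexes on the addresses table:

an index on the state field
an index on user_id

I am trying to fetch addresses only for certain users, and only those whose state is not manual_verification.

Here is my query: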

SELECT users.id 
FROM "users" INNER JOIN addresses 
     ON  addresses.user_id = users.id 
     and addresses.type = 'VerifiedAddress' 
WHERE ("users".deleted_at IS NULL) 
  AND (users.id in (11144,10569,21519,783,15671,21726,17787,11665,
                    19579,12226,1324,9413,5461,20981,12906) 
  and addresses.state != 'manual_verification')

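Explain output for the query above: http://explain.depesz.com/s/rTj

It takes 37 ms, sometimes more, depending on the number of users.

I think this is a decent query, but our team needs to look into it, and I am looking for some optimization tips. I mean, I select a single field, and there are indexes on user_id (addresses) and state (addresses).

Is there anything else I can do or try?

Update

I found that this query runs much faster: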

 SELECT "addresses"."user_id" 
    FROM "addresses" 
    WHERE "addresses"."type" IN ('VerifiedAddress') 
    AND (user_id in (9681,23824,23760,20098,962,14730,12294,9552,534,
                     553,5837,6768,6583,956,24179) and state != 'manual_verification')

Explain output for this query: http://explain.depesz.com/s/nHrr

postgresql performance index postgresql-performance

Gan*_*row · 2020-01-08 · Score: 1 · Answers: 1 · Views: 248

PostgreSQL equivalent of SQLite pragmas

If I want to get the best performance out of PostgreSQL, what are the equivalents of the following SQLite pragmas?

pragma synchronous = OFF;
pragma journal_mode = OFF;
pragma count_changes = OFF;
pragma temp_store = MEMORY;

postgresql sqlite performance postgresql-performance

sky*_*yde · 2020-01-08 · Score: 1 · Answers: 1 · Views: 1630

Tag statistics

postgresql-performance ×10

postgresql ×9

performance ×6

index ×3

execution-plan ×2

copy ×1

database-design ×1

gin-index ×1

optimization ×1

postgresql-13 ×1

postgresql-14 ×1

postgresql-15 ×1

query-performance ×1

sqlite ×1

Tag: postgresql-performance