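Tag: postgresql-performance

Analyzing the performance of an INSERT in Postgres

I have a statement that inserts a batch of rows into a Postgres database (the content and location are not important for this question), but it is not as fast as I would like. I can run an EXPLAIN on it to see what it is doing, and I get something like this: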

 Insert on dpdb.datapoints  (cost=0.00..6917.76 rows=44184 width=1786) (actual time=15558.623..15558.623 rows=0 loops=1)
   Buffers: shared hit=34670391 read=98370 dirtied=48658 written=39875
   I/O Timings: read=704.525 write=242.915
   ->  Seq Scan on public.fred  (cost=0.00..6917.76 rows=44184 width=1786) (actual time=0.018..197.853 rows=44184 loops=1)
         Output: nextval('datapoints_id_seq'::regclass), fred.company_id, fred.tag, ... lots more columns ...
         Buffers: shared hit=44186 read=6253 dirtied=1
         I/O Timings: read=29.176
 Planning time: 0.110 ms
 Trigger RI_ConstraintTrigger_c_14845718 for constraint datapoints_tag_source_fkey: time=236.677 calls=44184
 Trigger RI_ConstraintTrigger_c_14845723 for constraint datapoints_sheet_type_fkey: time=536.367 calls=44184
 Trigger RI_ConstraintTrigger_c_14845728 for constraint datapoints_subcontext_fkey: time=178.200 calls=44184
 Trigger RI_ConstraintTrigger_c_14845733 for …

Tags: postgresql, postgresql-performance
Ric*_*don | 2020-10-03 | Score: 6 | Answers: 1 | Views: 3077

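How can I improve or speed up a Postgres query that uses pg_trgm?

Are there any other steps I can take to speed up query execution?

I have a table with more than 100 million rows, and I need to search it for matching strings. I looked at two options:

Comparing text with to_tsvector @@ (to_tsquery or plainto_tsquery). This works very fast (under 1 second across all the data), but has some problems finding similar text.
Comparing text with pg_trgm similarity. This works well for text comparison, but performs poorly on large amounts of data.

I found that I can use an index to improve performance. For my GiST index, I tried increasing siglen from a small value up to 2024, but for some reason Postgres uses 512 and nothing higher.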

CREATE INDEX trgm_idx_512_gg ON table USING GIST (name gist_trgm_ops(siglen=512));

Query:

SELECT name, similarity(name, 'ноутбук MSI GF63 Thin 10SC 086XKR 9S7 16R512 086') as sm
FROM table
WHERE name % 'ноутбук MSI GF63 Thin 10SC 086XKR 9S7 16R512 086'

EXPLAIN output:

Bitmap Heap Scan on …

Tags: postgresql, postgresql-performance, pg-trgm
Dmi*_*ich | 2023-01-16 | Score: 6 | Answers: 1 | Views: 1607

Faster search for records where the first character of a field does not match [A-Za-z]?

I currently have the following:

User (id, fname, lname, deleted_at, guest)

I can query a list of users by the initial of their fname, like this:

User Load (9.6ms)  SELECT "users".* FROM "users" WHERE (users.deleted_at IS NULL) AND (lower(left(fname, 1)) = 's') ORDER BY fname ASC LIMIT 25 OFFSET 0

This is fast thanks to the following index:

  CREATE INDEX users_multi_idx
  ON users (lower(left(fname, 1)), fname)
  WHERE deleted_at IS NULL;

What I want to do now is query for all users whose fname does not start with a letter A-Z. I do it like this:

SELECT "users".* FROM "users" WHERE (users.deleted_at IS NULL) AND (lower(left(fname, 1)) ~ E'^[^a-zA-Z].*') ORDER BY fname ASC LIMIT 25 OFFSET 0

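The problem is that this query is very slow and does not seem to use the index that speeds up the first query. Any suggestions on how to elegantly make the second query (non a-z) faster?

I'm using Postgres 9.1 with Rails 3.2.

Thanks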

Tags: postgresql, ruby-on-rails, ruby-on-rails-3, postgresql-performance
AnA*_*ice | 2012-10-17 | Score: 5 | Answers: 1 | Views: 225

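Postgres: optimizing queries by datetime

I have a table with a datetime field "updated_at". Many of my queries filter on this field with a range condition, for example rows where updated_at > some date.

I have already added an index on updated_at, but most of my queries are still very slow, even when I limit the number of rows returned.

What else can I do to optimize queries that filter on a datetime field?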

Tags: sql, postgresql, performance, postgresql-performance
Hen*_*hiu | 2013-05-20 | Score: 5 | Answers: 2 | Views: 7119

Index to find records where the foreign key does not exist

table products
id primary_key

table transactions
product_id foreign_key references products

The following SQL query is very slow:

SELECT products.* 
FROM   products 
       LEFT JOIN transactions 
              ON ( products.id = transactions.product_id ) 
WHERE  transactions.product_id IS NULL;

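Out of 100 million product records, there may be only about 100 products with no corresponding transaction.

I suspect the query is so slow because it does a full table scan to find the product records with no matching foreign key.

I would like to create a partial index like this: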

CREATE INDEX products_with_no_transactions_index 
ON (Left JOIN TABLE 
    BETWEEN products AND transactions) 
WHERE transactions.product_id IS NULL;

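Is the above possible, and how would I go about it?

Note: some characteristics of this data set:

Transactions are never deleted, only added.
Products are never deleted, but are added at a rate of hundreds per minute (obviously, this is a made-up example standing in for a much more complex real use case). A small percentage of them are temporarily orphaned.
I need to query frequently (up to once per minute) and always need to know the current set of orphaned products.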
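
For reference, the LEFT JOIN ... IS NULL query above is an anti-join, and it can also be written with NOT EXISTS. The following is only a minimal sketch, not a tuned solution. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL (an assumption made purely so the example is self-contained), a tiny made-up data set, and a hypothetical index name, transactions_product_id_idx. It shows that the two formulations return the same orphaned products; planner behavior and timings in PostgreSQL will differ.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE transactions (product_id INTEGER REFERENCES products(id));
    -- Hypothetical index name; lets the existence check use an index lookup.
    CREATE INDEX transactions_product_id_idx ON transactions (product_id);
""")

# Made-up data: products 1..10, where products 4 and 9 have no transactions.
conn.executemany("INSERT INTO products (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO transactions (product_id) VALUES (?)",
                 [(i,) for i in range(1, 11) if i not in (4, 9)])

# Formulation from the question: outer join, then keep unmatched rows.
left_join_sql = """
    SELECT products.id
    FROM products
    LEFT JOIN transactions ON products.id = transactions.product_id
    WHERE transactions.product_id IS NULL
    ORDER BY products.id
"""

# Same result expressed as an explicit anti-join.
not_exists_sql = """
    SELECT p.id
    FROM products AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM transactions AS t WHERE t.product_id = p.id
    )
    ORDER BY p.id
"""

via_left_join = [row[0] for row in conn.execute(left_join_sql)]
via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
print(via_left_join, via_not_exists)  # [4, 9] [4, 9]
```

Either form still has to examine every product, so on its own it does not remove the scan over products that the question is worried about.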

Tags: sql, postgresql, indexing, materialized-views, postgresql-performance
sam*_*mol | 2014-01-02 | Score: 5 | Answers: 1 | Views: 3233

Performance of DELETE using NOT IN (SELECT ...)

I have these two tables and want to delete from ms_author every author that is not in author.

author (1.6 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI | true  |
| name  | text        | YES  |     |       |
+-------+-------------+------+-----+-------+

ms_author (120 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI |       |
| name  | text        | YES  |     | true  |
+-------+-------------+------+-----+-------+

Here is my query:

DELETE
FROM ms_author AS m
WHERE …

Tags: sql, postgresql, postgresql-performance, sql-delete
Seb*_*bas | 2015-12-15 | Score: 5 | Answers: 1 | Views: 2108

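Optimizing a row-exclusion query

I am designing a mostly read-only database containing 300,000 documents with about 50,000 distinct tags, where each document has 15 tags on average. For now, the only query I care about is selecting all documents that have none of the tags in a given set. I am only interested in the document_id column (no other columns in the result).

My schema is basically: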

CREATE TABLE documents (
    document_id  SERIAL  PRIMARY KEY,
    title        TEXT
);

CREATE TABLE tags (
    tag_id  SERIAL  PRIMARY KEY,
    name    TEXT    UNIQUE
);

CREATE TABLE documents_tags (
    document_id    INTEGER  REFERENCES documents,
    tag_id         INTEGER  REFERENCES tags,

    PRIMARY KEY (document_id, tag_id)
);

I can write this query in Python by precomputing the set of documents for each tag, which reduces the problem to a few fast set operations:

In [17]: %timeit all_docs - (tags_to_docs[12345] | tags_to_docs[7654])
100 loops, best of 3: 13.7 ms per loop
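
The precomputation behind that timing could look roughly like the sketch below. The names all_docs and tags_to_docs come from the snippet above. Everything else is assumed for illustration: the helper functions build_index and docs_without_tags, the sample rows, and loading (document_id, tag_id) pairs from documents_tags.

```python
from collections import defaultdict


def build_index(document_ids, document_tag_pairs):
    """Precompute the set of all documents and a tag -> documents mapping."""
    all_docs = set(document_ids)
    tags_to_docs = defaultdict(set)
    for document_id, tag_id in document_tag_pairs:
        tags_to_docs[tag_id].add(document_id)
    return all_docs, tags_to_docs


def docs_without_tags(all_docs, tags_to_docs, excluded_tags):
    """Return the documents that carry none of the excluded tags."""
    tagged = set()
    for tag_id in excluded_tags:
        # .get avoids creating empty entries for tags that were never used.
        tagged |= tags_to_docs.get(tag_id, set())
    return all_docs - tagged


# Made-up sample rows standing in for the documents and documents_tags tables.
document_ids = [1, 2, 3, 4, 5]
document_tag_pairs = [(1, 12345), (2, 7654), (3, 12345), (3, 7654), (4, 111)]

all_docs, tags_to_docs = build_index(document_ids, document_tag_pairs)
result = docs_without_tags(all_docs, tags_to_docs, [12345, 7654])
print(sorted(result))  # [4, 5]
```

The set difference is the same operation as all_docs - (tags_to_docs[12345] | tags_to_docs[7654]) in the timed expression.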

However, translating the set operations to Postgres is not nearly as fast:

stuff=# SELECT document_id AS id FROM documents WHERE document_id NOT IN (
stuff(#     SELECT documents_tags.document_id …

Tags: sql, postgresql, indexing, performance, postgresql-performance
use*_*188 | 2016-08-07 | Score: 5 | Answers: 1 | Views: 127

Slow nested loop left join with an index scan running 130k times

I am really struggling to optimize this query:

SELECT wins / (wins + COUNT(loosers.match_id) + 0.) winrate, wins + COUNT(loosers.match_id) matches, winners.winning_champion_one_id, winners.winning_champion_two_id, winners.winning_champion_three_id, winners.winning_champion_four_id, winners.winning_champion_five_id
FROM
(
   SELECT COUNT(match_id) wins, winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id FROM matches
   WHERE
      157 IN (winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id)
   GROUP BY winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id
) winners
LEFT OUTER JOIN matches loosers ON
  winners.winning_champion_one_id = loosers.loosing_champion_one_id AND
  winners.winning_champion_two_id = loosers.loosing_champion_two_id AND
  winners.winning_champion_three_id = loosers.loosing_champion_three_id AND
  winners.winning_champion_four_id = loosers.loosing_champion_four_id AND
  winners.winning_champion_five_id = loosers.loosing_champion_five_id
GROUP BY winners.wins, winners.winning_champion_one_id, winners.winning_champion_two_id, winners.winning_champion_three_id, winners.winning_champion_four_id, …

Tags: postgresql, indexing, database-performance, postgresql-performance, postgresql-9.6
Fer*_*yan | 2017-04-23 | Score: 5 | Answers: 2 | Views: 5314

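PostgreSQL SELECT on a view with a WHERE clause is too slow

The problem is the difference between the execution time against the view and the execution time against the table.

Why is there such a large difference between the two queries?

Query 2 used to run faster with a LIMIT clause, but now it is very slow (for example, SELECT * FROM ... LIMIT 1000).

Note: the function I use in the VIEW does not do any heavy work. Its language is plpgsql.

Query 1: execution time = 106.726 ms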

   EXPLAIN ANALYZE SELECT * FROM dbo.ST_Foto    where     StationId='Sample Guid'  
        and  Tarih <='2018-09-12'   -- Table

EXPLAIN ANALYZE for query 1:

  "Seq Scan on st_foto  (cost=0.00..4559.68 rows=103878 width=215) (actual time=0.036..81.046 rows=103938 loops=1)"
"  Filter: ((tarih <= '2018-09-12 00:00:00'::timestamp without time zone) AND (stationid = 'ac5a2189-f931-47c0-9845-d0ff8eac7cb7'::uuid))"
"  Rows Removed by Filter: 20174"
"Planning time: 0.359 ms"
"Execution time: 106.726 ms"

Query 2: execution time = 30562.223 ms

  EXPLAIN ANALYZE SELECT * FROM dbo.VWST_Foto    where     StationId='Sample Guid'  
            and  Tarih …

Tags: postgresql, view, stored-functions, postgresql-performance
Cem*_*ğut | 2018-09-12 | Score: 5 | Answers: 0 | Views: 334

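Why is a bitmap scan faster than an index scan when fetching a large percentage of a table in PostgreSQL?

The author of bitmap scans describes the difference between a bitmap heap scan and an index scan:

A plain indexscan fetches one tuple-pointer at a time from the index, and immediately visits that tuple in the table. A bitmap scan fetches all the tuple-pointers from the index in one go, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-location order. The bitmap scan improves locality of reference to the table, but at the cost of more bookkeeping overhead to manage the "bitmap" data structure --- and at the cost that the data is no longer retrieved in index order, which doesn't matter for your query but would matter if you said ORDER BY.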
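
The locality effect described in the quote can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not how PostgreSQL implements it: tuple pointers are modeled as (page, offset) pairs, the index order is assumed to be uncorrelated with physical order, the "bitmap" is approximated by sorting the pointers, and caching and prefetching are ignored. The helper page_switches and all constants are made up for the illustration.

```python
import random


def page_switches(tuple_pointers):
    """Count how often consecutive heap visits land on a different page."""
    switches = 0
    previous_page = None
    for page, _offset in tuple_pointers:
        if page != previous_page:
            switches += 1
            previous_page = page
    return switches


random.seed(0)
PAGES, TUPLES_PER_PAGE = 100, 50

# Every tuple in the toy table, identified by its physical location.
heap = [(page, offset) for page in range(PAGES) for offset in range(TUPLES_PER_PAGE)]

# Assumption: index key order is unrelated to physical order.
index_order = heap[:]
random.shuffle(index_order)

# A predicate that matches half of the table, returned in index order.
matching = index_order[: len(heap) // 2]

# Plain index scan: visit tuples in the order the index returns them.
index_scan_visits = page_switches(matching)

# Bitmap-style scan: collect all pointers first, then visit in physical order.
bitmap_scan_visits = page_switches(sorted(matching))

print("index scan page switches:", index_scan_visits)
print("bitmap scan page switches:", bitmap_scan_visits)
```

In this model the bitmap-style scan touches each matching page exactly once, while the index-order scan jumps between pages on almost every tuple.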

Question: