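Good layout for 3D point data for spatial queries in Postgres?

As another question lays out, I deal with a lot (> 10,000,000) of point entries in 3D space. These points are defined as follows: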

CREATE TYPE float3d AS (
  x real,
  y real,
  z real);

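If I am not mistaken, storing one of these points takes 3*8 bytes + 8 bytes of padding (MAXALIGN is 8). Is there a better way to store this kind of data? In the question mentioned above, it was pointed out that composite types involve considerable overhead.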
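
As a rough cross-check of the size estimate above, the following Python sketch packs three single-precision floats the way a bare x/y/z payload would be laid out. It assumes that PostgreSQL's `real` is the 4-byte single-precision type, and it deliberately ignores the composite-type header and any alignment padding, so the result is only a lower bound on the raw coordinate data, not the on-disk row size. The helper names are hypothetical.

```python
import struct

# Assumption: PostgreSQL `real` is a 4-byte IEEE 754 single-precision float.
POINT_FORMAT = "<3f"  # three little-endian 4-byte floats: x, y, z


def raw_point_size() -> int:
    """Bytes needed for the bare x/y/z payload (no header, no padding)."""
    return struct.calcsize(POINT_FORMAT)


def raw_payload_megabytes(n_points: int) -> float:
    """Raw coordinate payload for n_points, in units of 10^6 bytes."""
    return n_points * raw_point_size() / 1_000_000


print(raw_point_size())                   # 12
print(raw_payload_megabytes(10_000_000))  # 120.0
```

Comparing this lower bound with the actual per-row size reported by the database would show how much of the storage goes to overhead rather than coordinates.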

I often run spatial queries like this:

  SELECT t1.id, t1.parent_id, (t1.location).x, (t1.location).y, (t1.location).z,
         t1.confidence, t1.radius, t1.skeleton_id, t1.user_id,
         t2.id, t2.parent_id, (t2.location).x, (t2.location).y, (t2.location).z,
         t2.confidence, t2.radius, t2.skeleton_id, t2.user_id
  FROM treenode t1
       INNER JOIN treenode t2 ON
         (   (t1.id = t2.parent_id OR t1.parent_id = t2.id)
          OR (t1.parent_id IS NULL AND t1.id = t2.id))
        WHERE (t1.LOCATION).z = 41000.0
          AND (t1.LOCATION).x > 2822.6
          AND (t1.LOCATION).x …

postgresql datatypes spatial composite-types

tom*_*mka

2017 04-13

Score: 5, Answers: 1, Views: 2284

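Improving UPDATE performance on a large table

I am using Postgres 9.5 on Amazon RDS (2 vCPU, 8 GB RAM).
I use pganalyze to monitor performance.
I have about 200,000 records in the database.

In my dashboard, I see the following queries with average execution times of 28 seconds and 11 seconds: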

UPDATE calls SET ... WHERE calls.uuid = ?   telephonist 28035.41    0.01    100%    0.03%

UPDATE calls SET sip_error = ? WHERE calls.uuid = ? telephonist 11629.89    0.44    100%    0.69%

I already ran VACUUM, which found and removed 7,670 dead rows.
Any ideas on how to improve UPDATE performance? This is the query:

UPDATE calls SET X=Y WHERE calls.uuid = 'Z'

How can I improve the query above? Could I add another field? For example:

UPDATE calls SET X=Y WHERE calls.uuid = 'Z' AND calls.campaign = 'W'

The uuid column is not indexed.
https://www.tutorialspoint.com/postgresql/postgresql_indexes.htm suggests that indexes are not recommended for …

postgresql performance index-tuning update amazon-rds postgresql-performance

cal*_*ian

2020 01-08

Score: 5, Answers: 1, Views: about 20,000

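Composite index on a 43-million-row PostgreSQL table

This question is related to one I asked earlier: Column order (and query order) in a composite index in PostgreSQL

I figured I could sharpen and narrow my question here rather than overload that one. Given the following query (and EXPLAIN ANALYZE), does the composite index I am creating help?

The first query runs with only the simple indexes (GIST on outline) and (BTREE on pid).

The query is: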

EXPLAIN ANALYZE SELECT DISTINCT ON (path) oid, pid, product_name, type, path, size 
FROM portal.inventory AS inv 
WHERE ST_Intersects(st_geogfromtext('SRID=4326;POLYGON((21.51947021484375 51.55059814453125, 18.9129638671875 51.55059814453125, 18.9129638671875 48.8287353515625, 21.51947021484375 48.8287353515625, 21.51947021484375 51.55059814453125))'), inv.outline) 
AND (inv.pid in (20010,20046))

——

The results are below (it is faster, but maybe that is just because the database is warm).

"Unique  (cost=581.76..581.76 rows=1 width=89) (actual time=110.436..110.655 rows=249 loops=1)"
"  ->  Sort  (cost=581.76..581.76 rows=1 width=89) (actual time=110.434..110.477 rows=1377 loops=1)"
"        Sort Key: path"
"        Sort Method: quicksort  Memory: 242kB"
"        ->  Bitmap Heap Scan on inventory …

postgresql index index-tuning

Dr.*_*YSG

2017 04-13

Score: 4, Answers: 1, Views: 535

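Optimizing a query on two large tables

I have a very important query in my system that takes too long to execute because of the large volume of data in the tables. I am a junior DBA and need the best possible optimization for it. Each table has about 80 million rows.

The tables are:

tb_pd: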

   Column            |  Type   | Modifiers | Storage | Stats target | Description 
---------------------+---------+-----------+---------+--------------+-------------
 pd_id               | integer | not null  | plain   |              | 
 st_id               | integer |           | plain   |              | 
 status_id           | integer |           | plain   |              | 
 next_execution_date | bigint  |           | plain   |              | 
 priority            | integer |           | plain   |              | 
 is_active           | integer |           | plain   |              | 
Indexes:
    "pk_pd" PRIMARY KEY, btree (pd_id)
    "idx_pd_order" btree (priority, next_execution_date) …

postgresql optimization index-tuning query-performance

Iva*_*Paz

2020 01-08

Score: 4, Answers: 1, Views: 5466

Extremely slow query on an indexed column in Postgres

I have a query on an indexed column that is very slow. Given the query

SELECT * 
FROM orders 
WHERE shop_id = 3828 
ORDER BY updated_at desc 
LIMIT 1

EXPLAIN ANALYZE returns:

    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..594.45 rows=1 width=175) (actual time=202106.830..202106.831 rows=1 loops=1)
   ->  Index Scan Backward using index_orders_on_updated_at on orders  (cost=0.43..267901.54 rows=451 width=175) (actual time=202106.827..202106.827 rows=1 loops=1)
         Filter: (shop_id = 3828)
         Rows Removed by Filter: 1604818
 Planning time: 98.579 ms
 Execution time: 202127.514 ms
(6 rows)

The table description is:

                                         Table "public.orders"
       Column       |            Type             |                           Modifiers
--------------------+-----------------------------+---------------------------------------------------------------
 id                 | integer                     | not null default nextval('orders_id_seq'::regclass) …

postgresql performance order-by index-tuning amazon-rds

dav*_*ids

2019 12-30

Score: 4, Answers: 2, Views: 2478

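Speeding up count queries on millions of rows

Imagine a database full of products. A product belongs to exactly one collection and is created by a user. Rough size of the database:

Products: 52,000,000
Collections: 9,000,000
Users: about 9,000,000

I am trying to retrieve the number of products + collections a user owns, as well as the number of products in each collection (this information should be generated every x days and indexed in ElasticSearch).

For the user query, I am currently doing something like this: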

      SELECT
        users.*,
        (SELECT
          count(*)
        FROM
          products product
        WHERE
          product.user_id = user.id
        ) AS product_count,
        (SELECT
          count(*)
        FROM
          collections collection
        WHERE
          collection.user_id = user.id
        ) AS collection_count
      FROM
        users user

All *_id fields are indexed. Output of EXPLAIN (ANALYZE, VERBOSE), with sensitive information removed:

 Limit  (cost=0.00..156500.97 rows=100 width=41) (actual time=0.064..28345.363 rows=100 loops=1)
   Output: (...), ((SubPlan 1)), ((SubPlan 2))
   ->  Seq Scan on public.users user  (cost=0.00..14549429167.11 rows=9296702 width=41) (actual time=0.064..28345.241 rows=100 loops=1)
         Output: (...), (SubPlan 1), (SubPlan 2)
         SubPlan 1 …

postgresql performance index count postgresql-performance

dvc*_*crn

2020 01-08

Score: 4, Answers: 1, Views: 3691

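Optimizing slow performance of a simple SELECT query

I have an app called "Links" where 1) users congregate in groups and add others, and 2) post content for each other in said groups. Groups are defined by the links_group table in my postgresql 9.6.5 DB, while the replies posted in them are defined by the links_reply table. Overall, the DB's performance is great.

However, one SELECT query on the links_reply table consistently shows up in slow_log. It takes longer than 500 ms and is about 10 times slower than what I see in most other postgresql operations.

I use the Django ORM to generate queries. Here is the ORM call: replies = Reply.objects.select_related('writer__userprofile').filter(which_group=group).order_by('-submitted_on')[:25]. Essentially, this selects the latest 25 replies for a given group object. It also selects the associated user and userprofile objects.

Here is an example of the corresponding SQL from my slow log: LOG: duration: 8476.309 ms statement: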

SELECT

    "links_reply"."id",             "links_reply"."text", 
    "links_reply"."which_group_id", "links_reply"."writer_id",
    "links_reply"."submitted_on",   "links_reply"."image",
    "links_reply"."device",         "links_reply"."category", 

    "auth_user"."id",               "auth_user"."username", 

    "links_userprofile"."id",       "links_userprofile"."user_id",
    "links_userprofile"."score",    "links_userprofile"."avatar" 

FROM 

    "links_reply" 
    INNER JOIN "auth_user" 
        ON ("links_reply"."writer_id" = "auth_user"."id") 
    LEFT OUTER JOIN "links_userprofile" 
        ON ("auth_user"."id" = "links_userprofile"."user_id") 
WHERE …

postgresql performance upgrade postgresql-9.6 query-performance

Has*_*aig

2020 01-08

Score: 4, Answers: 1, Views: about 10,000

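Non-deterministic performance of a select query, from 1s to 60s, on a table with 1 billion rows

I am trying to investigate why the performance of this query is so non-deterministic. It can take anywhere from 1 second to 60 seconds and above. The nature of the query is to select a "time window" and fetch all rows within that time window.

Here is the query in question, running on a table of about 1 billion rows: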

SELECT CAST(extract(EPOCH from ts)*1000000 as bigint) as ts
    , ticks
    , quantity
    , side
FROM order_book
WHERE ts >= TO_TIMESTAMP(1618882633073383/1000000.0)
    AND ts < TO_TIMESTAMP(1618969033073383/1000000.0)
    AND zx_prod_id = 0
ORDER BY ts ASC, del desc;
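
To make the size of the "time window" in the query above concrete, here is a small Python sketch that decodes the two literals passed to TO_TIMESTAMP. It assumes the integers are microseconds since the Unix epoch (which the division by 1000000.0 suggests) and that the window is half-open, matching the `>=` and `<` comparisons. The helper name `from_micros` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_micros(us: int) -> datetime:
    """Convert integer microseconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(microseconds=us)


# The two literals from the WHERE clause of the query.
start_us = 1618882633073383
end_us = 1618969033073383

window_start = from_micros(start_us)  # inclusive lower bound (ts >= ...)
window_end = from_micros(end_us)      # exclusive upper bound (ts < ...)

print(window_start.isoformat())   # 2021-04-20T01:37:13.073383+00:00
print(window_end - window_start)  # 1 day, 0:00:00
```

Under these assumptions, each execution of the query covers exactly 24 hours of data for one zx_prod_id.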

This is how the table is created:

CREATE TABLE public.order_book
(
    ts timestamp with time zone NOT NULL,
    zx_prod_id smallint NOT NULL,
    ticks integer NOT NULL,
    quantity integer NOT NULL,
    side boolean NOT NULL,
    del boolean NOT NULL
)

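The values inside TO_TIMESTAMP keep sliding forward as I walk through the whole table. Here is the EXPLAIN ANALYZE output for the same query on two different time windows: …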

postgresql cache explain timescaledb postgresql-performance

val*_*mit

2021 05-04

Score: 4, Answers: 1, Views: 91

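Transaction strategy for a UNIQUE constraint?

How can I use a transaction-based strategy to ensure that the same place cannot be booked more than once on the same day?

It was suggested to me that this would differ for each isolation level. Could you add an example for each of them (read committed, repeatable read, and serializable)? I would like to understand each one.

Here are the tables and test data: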

CREATE TABLE place (
  place_id INT                   PRIMARY KEY,
  Name     CHARACTER VARYING(50) NOT NULL,
  Type     CHARACTER VARYING(50) NOT NULL
);

CREATE TABLE visit (
  visit_id SERIAL PRIMARY KEY,
  place_id INT NOT NULL,
  place_dt TIMESTAMP NOT NULL,

  FOREIGN KEY (place_id) REFERENCES place(place_id)
);

INSERT INTO place(place_id, Name, Type
) VALUES
    (1, 'Denali', 'mountain'),
    (2, 'Brindley', 'mountain'),
    (3, 'St. Louis Cathedral', 'church')
;

INSERT INTO visit(place_id, place_dt
) VALUES
    (1, '2019-01-02 10:00'), …

postgresql concurrency transaction isolation-level

Pak*_*Lui

2021 06-05

Score: 4, Answers: 2, Views: 250

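Query performance on a static, large PostgreSQL table

I have tried to describe this in as much detail as possible. Sorry for the length!

Background

I created the following partitioned table, protein_snp_assoc, on a PostgreSQL (version 12.13) database: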

CREATE TABLE protein_snp_assoc (
  protein_id    int not null,
  snp_id        int not null,
  beta          double precision,
  se            double precision,
  logp          double precision
) PARTITION BY RANGE (snp_id);

I then created 51 partitions from the following template, each containing about 150 million rows (7.65 billion rows in total):

CREATE TABLE IF NOT EXISTS protein_snp_assoc_(x) PARTITION OF protein_snp_assoc
  FOR VALUES FROM (y) TO (z);

where x ranges from 1 to 51, and y, z define intervals of length 150,000 each. For example, the first two partitions and the last one are:

protein_snp_assoc_1 FOR VALUES FROM (1) TO (150001),
protein_snp_assoc_2 FOR VALUES FROM (150001) TO (300001), ...
protein_snp_assoc_51 …
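
The partition template can be expanded mechanically. The Python sketch below generates the CREATE TABLE statements from the template, under the assumption that every partition follows the same pattern as the first two shown (lower bound inclusive, upper bound exclusive, 150,000 snp_id values per partition). The function names are hypothetical, and the bounds for later partitions are derived from that assumption rather than taken from the post.

```python
PARENT = "protein_snp_assoc"
N_PARTITIONS = 51
INTERVAL = 150_000  # snp_id values covered by each partition


def partition_bounds(x: int) -> tuple[int, int]:
    """Return (inclusive lower, exclusive upper) snp_id bounds of partition x (1-based)."""
    lower = (x - 1) * INTERVAL + 1
    return lower, lower + INTERVAL


def partition_ddl(x: int) -> str:
    """Fill in the CREATE TABLE template for partition x."""
    lower, upper = partition_bounds(x)
    return (
        f"CREATE TABLE IF NOT EXISTS {PARENT}_{x} PARTITION OF {PARENT}\n"
        f"  FOR VALUES FROM ({lower}) TO ({upper});"
    )


statements = [partition_ddl(x) for x in range(1, N_PARTITIONS + 1)]

print(statements[0])
print(statements[1])
```

With about 150 million rows per partition and 150,000 snp_id values per partition, this layout implies roughly 1,000 rows per snp_id.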

postgresql database-design read-only-database query-performance postgresql-performance

jom*_*mmi

2023 03-13

Score: 4, Answers: 1, Views: 901

Tag statistics

postgresql ×10

index-tuning ×4

performance ×4

postgresql-performance ×4

query-performance ×3

amazon-rds ×2

index ×2

cache ×1

composite-types ×1

concurrency ×1

count ×1

database-design ×1

datatypes ×1

explain ×1

isolation-level ×1

optimization ×1

order-by ×1

postgresql-9.6 ×1

read-only-database ×1

spatial ×1

timescaledb ×1

transaction ×1

update ×1

upgrade ×1
