Related questions and solutions (0)

Optimize Postgres query on a timestamp range

I have the following table and index definitions:

CREATE TABLE ticket
(
  wid bigint NOT NULL DEFAULT nextval('tickets_id_seq'::regclass),
  eid bigint,
  created timestamp with time zone NOT NULL DEFAULT now(),
  status integer NOT NULL DEFAULT 0,
  argsxml text,
  moduleid character varying(255),
  source_id bigint,
  file_type_id bigint,
  file_name character varying(255),
  status_reason character varying(255),
  ...
)

I created an index on the created timestamp as follows:

CREATE INDEX ticket_1_idx
  ON ticket
  USING btree
  (created );

Here is my query:

select * from ticket 
where created between '2012-12-19 00:00:00' and  '2012-12-20 00:00:00'

This worked fine until the number of records started to grow (about 5 million), and now it takes forever to return.

EXPLAIN ANALYZE reveals this:

"Index Scan using ticket_1_idx on ticket  (cost=0.00..10202.64 rows=52543 …
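
One detail worth checking in the query above: BETWEEN includes both endpoints, so rows stamped exactly 2012-12-20 00:00:00 are returned as well. A minimal sketch of a half-open range that covers exactly one day follows. The column list is an assumption for illustration only; the question does not say which columns the application reads.

```sql
-- Half-open range: includes 2012-12-19 00:00:00, excludes 2012-12-20 00:00:00.
-- Because created is timestamp with time zone, the literals are
-- interpreted in the session's time zone.
SELECT wid, eid, created, status   -- assumed column subset, not from the question
FROM   ticket
WHERE  created >= '2012-12-19 00:00:00'
AND    created <  '2012-12-20 00:00:00';
```

This form can still use ticket_1_idx. It is a correctness refinement and does not by itself account for the slowdown described.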

postgresql indexing query-optimization database-partitioning postgresql-performance

Score: 9 | Answers: 1 | Views: 10k

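Fastest way to do DISTINCT and format in PostgreSQL

I have 3.5 million rows in table acs_objects, and I need to retrieve the distinct values of column creation_date formatted as a year.

My first attempt: 180~200 sec (15 rows fetched)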

SELECT DISTINCT to_char(creation_date,'YYYY') FROM acs_objects

My second attempt: 35~40 sec (15 rows fetched)

SELECT DISTINCT to_char(creation_date,'YYYY')
FROM (SELECT DISTINCT creation_date FROM acs_objects) AS distinct_date

Is there any way to make it faster? - "I need to use this on an ADP website"
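
One possible approach, assuming a btree index on acs_objects (creation_date) exists (the question does not say so), is to emulate a loose index scan with a recursive CTE: find the earliest date, jump to the first date in a later year, and repeat until nothing is left. A sketch under that assumption:

```sql
-- Assumes an index on acs_objects (creation_date); without one,
-- each min() lookup below falls back to scanning the table.
WITH RECURSIVE years AS (
   -- Start at the beginning of the earliest year present.
   SELECT date_trunc('year', min(creation_date)) AS y
   FROM   acs_objects

   UNION ALL

   -- Jump to the first row at or after the start of the next year.
   -- min() yields NULL when no such row exists, which ends the recursion.
   SELECT (SELECT date_trunc('year', min(creation_date))
           FROM   acs_objects
           WHERE  creation_date >= years.y + interval '1 year')
   FROM   years
   WHERE  years.y IS NOT NULL
)
SELECT to_char(y, 'YYYY') AS creation_year
FROM   years
WHERE  y IS NOT NULL;
```

With 15 distinct years, this performs on the order of 16 index probes instead of reading all 3.5 million rows.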

sql postgresql aggregate distinct postgresql-performance

Score: 9 | Answers: 4 | Views: 1508

Using something like TOP with GROUP BY

With table1 as shown below:

+--------+-------+-------+------------+-------+
| flight |  orig |  dest |  passenger |  bags |
+--------+-------+-------+------------+-------+
|   1111 |  sfo  |  chi  |  david     |     3 |
|   1112 |  sfo  |  dal  |  david     |     7 |
|   1112 |  sfo  |  dal  |  kim       |    10 |
|   1113 |  lax  |  san  |  ameera    |     5 |
|   1114 |  lax  |  lfr  |  tim       |     6 |
|   1114 |  lax  |  lfr  |  jake      |     8 | …

sql postgresql aggregate greatest-n-per-group

Score: 9 | Answers: 1 | Views: 166

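PostGIS nearest-neighbor query

I want to retrieve all points within a given range of another set of points. For example, find all shops within 500 meters of any subway station.

I wrote this query, which is quite slow, and I would like to optimize it:

SELECT DISTINCT ON(locations.id) locations.id FROM locations, pois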
WHERE pois.poi_kind = 'subway'
AND ST_DWithin(locations.coordinates, pois.coordinates, 500, false);

I'm using the latest versions of Postgres and PostGIS (Postgres 9.5, PostGIS 2.2.1).

Here is the table metadata:

                                         Table "public.locations"
       Column       |            Type             |                       Modifiers
--------------------+-----------------------------+--------------------------------------------------------
 id                 | integer                     | not null default nextval('locations_id_seq'::regclass)
 coordinates        | geometry                    |
Indexes:
    "locations_coordinates_index" gist (coordinates)


                                      Table "public.pois"
   Column    |            Type             |                     Modifiers
-------------+-----------------------------+---------------------------------------------------
 id          | integer                     | not null default nextval('pois_id_seq'::regclass)
 coordinates | geometry                    |
 poi_kind_id | integer                     |
Indexes:
    "pois_pkey" PRIMARY KEY, btree (id)
    "pois_coordinates_index" gist (coordinates)
    "pois_poi_kind_id_index" …

sql postgresql postgis nearest-neighbor query-performance

Score: 8 | Answers: 1 | Views: 698

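Performance of max() vs ORDER BY DESC + LIMIT 1

I was troubleshooting a few slow SQL queries today and don't quite understand the performance difference below:

When trying to extract the max(timestamp) from a data table based on some condition, using MAX() is slower than ORDER BY timestamp DESC LIMIT 1 if a matching row exists, but considerably faster if no matching row is found.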

SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 4
ORDER BY timestamp DESC
LIMIT 1;
(0 rows)  
Time: 1314.544 ms

SELECT timestamp
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id = 5
ORDER BY timestamp DESC
LIMIT 1;
(1 row)  
Time: 10.890 ms

SELECT MAX(timestamp)
FROM data JOIN sensors ON ( sensors.id = data.sensor_id )
WHERE sensors.station_id …

sql postgresql aggregate max sql-limit

Score: 7 | Answers: 1 | Views: 9839

SELECT DISTINCT is slower than expected on my PostgreSQL table

Here is my table schema:

CREATE TABLE tickers (
    product_id TEXT NOT NULL,
    trade_id INT NOT NULL,
    sequence BIGINT NOT NULL,
    time TIMESTAMPTZ,
    price NUMERIC NOT NULL,
    side TEXT NOT NULL,
    last_size NUMERIC NOT NULL,
    best_bid NUMERIC NOT NULL,
    best_ask NUMERIC NOT NULL,
    PRIMARY KEY (product_id, trade_id)
);

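My application subscribes to Coinbase Pro's websocket on the "ticker" channel and inserts a row into the tickers table whenever it receives a message.

The table now has nearly two million rows.

I assumed that running SELECT DISTINCT product_id FROM tickers would be fast, but it takes around 500 to 600 milliseconds. Here is the output from EXPLAIN ANALYZE: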

HashAggregate  (cost=47938.97..47939.38 rows=40 width=8) (actual time=583.105..583.110 rows=40 loops=1)
  Group Key: product_id
  ->  Seq Scan …
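
With only about 40 distinct values in nearly two million rows, one technique sometimes used is to emulate a loose index scan with a recursive CTE, so that each distinct product_id costs roughly one probe of the primary key index on (product_id, trade_id). A minimal sketch, not a measured result:

```sql
WITH RECURSIVE distinct_products AS (
   -- Smallest product_id, found via the primary key index.
   (SELECT product_id FROM tickers ORDER BY product_id LIMIT 1)

   UNION ALL

   -- Next product_id strictly greater than the previous one.
   -- The subquery yields NULL when there is none, which ends the recursion.
   SELECT (SELECT t.product_id
           FROM   tickers t
           WHERE  t.product_id > d.product_id
           ORDER  BY t.product_id
           LIMIT  1)
   FROM   distinct_products d
   WHERE  d.product_id IS NOT NULL
)
SELECT product_id
FROM   distinct_products
WHERE  product_id IS NOT NULL;
```

The benefit shrinks as the number of distinct values grows; when most rows carry different values, the plain SELECT DISTINCT is likely the better choice.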

sql postgresql query-optimization database-performance postgresql-performance

Score: 7 | Answers: 1 | Views: 478

Optimize groupwise maximum query

select * 
from records 
where id in ( select max(id) from records group by option_id )

This query works fine even on millions of rows. But as can be seen from the result of the EXPLAIN statement:

                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=30218.84..31781.62 rows=620158 width=44) (actual time=1439.251..1443.458 rows=1057 loops=1)
->  HashAggregate  (cost=30218.41..30220.41 rows=200 width=4) (actual time=1439.203..1439.503 rows=1057 loops=1)
     ->  HashAggregate  (cost=30196.72..30206.36 rows=964 width=8) (actual time=1438.523..1438.807 rows=1057 loops=1)
           ->  Seq Scan on records records_1  (cost=0.00..23995.15 rows=1240315 width=8) (actual time=0.103..527.914 rows=1240315 loops=1)
->  Index Scan using records_pkey on records  (cost=0.43..7.80 rows=1 width=44) (actual time=0.002..0.003 rows=1 loops=1057)
     Index Cond: (id = (max(records_1.id)))
Total …
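
A common PostgreSQL-specific way to write this greatest-per-group pattern is DISTINCT ON. The sketch below assumes id is unique (the plan's records_pkey suggests it is the primary key); whether it runs faster than the original depends on the data distribution and on available indexes.

```sql
-- One row per option_id: the row with the largest id in each group.
SELECT DISTINCT ON (option_id) *
FROM   records
ORDER  BY option_id, id DESC;
```

An index on (option_id, id DESC) is the kind of index this form can use; no such index appears in the question, so treat it as hypothetical.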

sql postgresql query-optimization greatest-n-per-group groupwise-maximum

Score: 6 | Answers: 1 | Views: 5435

PostgreSQL - slow query joining on a VIEW

I'm trying to do a simple join between a table (players) and a view (player_main_colors):

SELECT P.*, C.main_color FROM players P
    LEFT OUTER JOIN player_main_colors C USING (player_id)
    WHERE P.user_id=1;

This query takes about 40 ms.

Here I use a nested SELECT on the VIEW instead of the JOIN:

SELECT player_id, main_color FROM player_main_colors
    WHERE player_id IN (
        SELECT player_id FROM players WHERE user_id=1);

This query also takes about 40 ms.

When I split the query into 2 parts, it becomes fast, as I would expect:

SELECT player_id FROM players WHERE user_id=1;

SELECT player_id, main_color FROM player_main_colors
    where player_id in (584, 9337, 11669, 12096, 13651,
        13852, 9575, 23388, 14339, 500, 24963, 25630,
        8974, 13048, 11904, 10537, 20362, 9216, 4747, 25045);

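These queries take about 0.5 ms each.

So why are the queries above with the JOIN or the sub-SELECT so horribly slow, and how can I fix it?

Here are some details about my tables and the view: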

CREATE TABLE users (
    user_id INTEGER PRIMARY KEY, …

postgresql performance query-optimization greatest-n-per-group postgresql-performance

Score: 6 | Answers: 2 | Views: 4546

PostgreSQL not using index on a filtered multiple sort query

I have a pretty simple table

CREATE TABLE approved_posts (
  project_id INTEGER,
  feed_id INTEGER,
  post_id INTEGER,
  approved_time TIMESTAMP NOT NULL,
  post_time TIMESTAMP NOT NULL,
  PRIMARY KEY (project_id, feed_id, post_id)
)

And I'm trying to optimize this query:

SELECT *
FROM approved_posts
WHERE feed_id IN (?, ?, ?)
AND project_id = ?
ORDER BY approved_time DESC, post_time DESC
LIMIT 1;

The query optimizer is fetching every single approved_post that matches the predicate, sorting all 100k results, and returning the top one …
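
One index shape that can let the planner read rows already in the requested sort order and stop at the first match is sketched below. The index name is hypothetical, and whether the planner actually chooses it depends on the data distribution and table statistics.

```sql
-- Equality column first, then the ORDER BY columns in matching direction.
-- feed_id is left to be filtered while walking the index in sorted order.
CREATE INDEX approved_posts_project_time_idx   -- hypothetical name
    ON approved_posts (project_id, approved_time DESC, post_time DESC);
```

If the requested feed_id values are rare within a project, the scan may still visit many index entries before finding a match, so this is worth verifying with EXPLAIN ANALYZE.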

sql sorting postgresql indexing postgresql-performance

Score: 6 | Answers: 1 | Views: 3811

Conditional lead/lag function in PostgreSQL?

I have a table like this:

Name   activity  time

user1  A1        12:00
user1  E3        12:01
user1  A2        12:02
user2  A1        10:05
user2  A2        10:06
user2  A3        10:07
user2  M6        10:07
user2  B1        10:08
user3  A1        14:15
user3  B2        14:20
user3  D1        14:25
user3  D2        14:30

Now, I need a result like this:

Name   activity  next_activity

user1  A2        NULL
user2  A3        B1
user3  A1        B2

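For each user, I want to find the last activity from group A and the type of activity from group B that comes next (a group B activity always takes place after the group A activity). Other types of activities are not of interest to me. I tried using the lead() function, but it didn't work.

How can I solve my problem?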
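
One way this could be approached is to drop the uninteresting activities before applying lead(), so the "next" row can only be an A or B activity, and then keep the latest A row per user. This is a sketch under assumptions: the table name activity_log is hypothetical (the question does not name the table), and group membership is assumed to be encoded by the first letter of activity.

```sql
SELECT DISTINCT ON (name)
       name,
       activity,
       -- Report the following row only if it belongs to group B.
       CASE WHEN next_activity LIKE 'B%' THEN next_activity END AS next_activity
FROM  (
   SELECT name, activity, "time",
          lead(activity) OVER (PARTITION BY name ORDER BY "time") AS next_activity
   FROM   activity_log              -- hypothetical table name
   WHERE  activity ~ '^[AB]'        -- keep only group A and group B rows
   ) sub
WHERE  activity LIKE 'A%'           -- candidates: group A rows
ORDER  BY name, "time" DESC;        -- DISTINCT ON keeps the latest A row per user
```

Against the sample rows shown, this keeps A2 for user1 (nothing follows), A3 for user2 (followed by B1 once M6 is filtered out), and A1 for user3 (followed by B2).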

sql postgresql greatest-n-per-group window-functions

Score: 6 | Answers: 2 | Views: 9220