PostgreSQL query is very slow on my large table

Question

bar*_*oma 6 postgresql performance vmware postgresql-9.5 query-performance

My database version is PostgreSQL 9.5.

create table if not exists request_log
(
    id               bigserial not null constraint app_requests_pkey  primary key,
    request_date     timestamp not null,
    ip               varchar(50),
    start_time       timestamp,
    application_name varchar(200),
    request_path     text,
    display_url      text,
    username         varchar(50)
);

I have a table that holds information about incoming HTTP requests. The id column is the primary key and is indexed. The table has no relations to other tables.

The table has 72320081 rows. When I run a count query to get the number of rows, select count(id) from request_log;, it takes 3-5 minutes.

The output of explain(analyze, buffers, format text) for this query is:

Aggregate  (cost=3447214.71..3447214.72 rows=1 width=0) (actual time=135575.947..135575.947 rows=1 loops=1)
  Buffers: shared hit=96 read=2551303
  ->  Seq Scan on request_log  (cost=0.00..3268051.57 rows=71665257 width=0) (actual time=2.517..129032.408 rows=72320081 loops=1)
        Buffers: shared hit=96 read=2551303
Planning time: 0.067 ms
Execution time: 135575.988 ms

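This performance is unacceptable for me. Because of it, I cannot build reports from this table in my web application.

My server hardware resources are:

OS: Linux Ubuntu Server 16, running on VMware
CPU: 4 cores
RAM: 6 GB
Disk: 120 GB

I run the query at night, when there are no users on the database, but it is still slow. How can I solve this?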

Answer 1

Lau*_*lbe 5

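Counting rows is slow because every row of the table has to be visited.
Counting id is even slower, because PostgreSQL first has to check whether id is NULL (NULL values are not counted).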
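
As a minimal illustration of the point above: because id is declared not null in the question's table definition, count(*) returns the same number as count(id) while skipping the per-row NULL check. How much time this saves on this particular server is not something the thread measures, so treat it as a cheap first step rather than a fix.

```sql
-- count(*) counts rows without inspecting any column value.
-- id is NOT NULL in request_log, so the result equals count(id).
SELECT count(*) FROM request_log;

-- Re-check the plan and timing the same way the question did.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT count(*) FROM request_log;
```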

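There are several options to speed this up:

Use a more recent version of PostgreSQL.
Then you can get parallel query, which makes the execution more expensive overall, but faster.

Use the index on id and keep the table well vacuumed.
Then you can get an index-only scan.

Use an extra table with a counter that is updated by a trigger on every data modification statement on the large table.

See my blog post for an in-depth discussion.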
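
The three options above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the blog post: the names request_log_count, request_log_count_trig and request_log_count_mod are made up for this example, the parallel setting only exists from PostgreSQL 9.6 on (so it requires the upgrade), and the worker count of 4 is simply the number of cores mentioned in the question.

```sql
-- Option 1 (PostgreSQL 9.6 or later only): allow parallel workers.
-- On 9.5 this parameter does not exist and the statement fails.
SET max_parallel_workers_per_gather = 4;

-- Option 2: vacuum so the visibility map is current, which is what
-- lets the planner choose an index-only scan on the primary key index.
VACUUM (ANALYZE) request_log;
EXPLAIN SELECT count(*) FROM request_log;

-- Option 3: keep a running count in a one-row side table.
BEGIN;

-- Block concurrent writes while the initial count is taken,
-- so no insert or delete slips in before the trigger exists.
LOCK TABLE request_log IN SHARE ROW EXCLUSIVE MODE;

-- Hypothetical counter table holding a single row.
CREATE TABLE request_log_count (n bigint NOT NULL);
INSERT INTO request_log_count SELECT count(*) FROM request_log;

-- Hypothetical trigger function: adjust the counter per row.
CREATE FUNCTION request_log_count_trig() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE request_log_count SET n = n + 1;
        RETURN NEW;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE request_log_count SET n = n - 1;
        RETURN OLD;
    END IF;
    RETURN NULL;
END;
$$;

CREATE TRIGGER request_log_count_mod
    AFTER INSERT OR DELETE ON request_log
    FOR EACH ROW EXECUTE PROCEDURE request_log_count_trig();

COMMIT;

-- Reading the count is now a single-row lookup.
SELECT n FROM request_log_count;
```

Two caveats on the counter sketch: every insert or delete on request_log now also updates the same counter row, so concurrent writers queue up behind each other, and TRUNCATE is not covered by a row-level trigger and would need separate handling.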
