How to improve the performance of getting the last timestamp in TimescaleDB

Question


SELECT timeseries_id, "timestamp" FROM enhydris_timeseriesrecord WHERE timeseries_id=6661 ORDER BY "timestamp" DESC LIMIT 1;

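(The table contains about 66 million records, of which about 0.5 million have timeseries_id=6661.)

This query takes about 1-2 seconds to run, which I find too slow.

If it were using a simple btree index, it should find what it is looking for after about 30 iterations. From what I can see when I run EXPLAIN ANALYZE on the query, it does use the index, but it has to do so in every chunk, and there are apparently 1374 chunks.

How can I make the query faster?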

                 Table "public.enhydris_timeseriesrecord"
    Column     |           Type           | Collation | Nullable | Default 
---------------+--------------------------+-----------+----------+---------
 timeseries_id | integer                  |           | not null | 
 timestamp     | timestamp with time zone |           | not null | 
 value         | double precision         |           |          | 
 flags         | character varying(237)   |           | not null | 
Indexes:
    "enhydris_timeseriesrecord_pk" PRIMARY KEY, btree (timeseries_id, "timestamp")
    "enhydris_timeseriesrecord_timeseries_id_idx" btree (timeseries_id)
    "enhydris_timeseriesrecord_timestamp_idx" btree ("timestamp" DESC)
    "enhydris_timeseriesrecord_timestamp_timeseries_id_idx" btree ("timestamp", timeseries_id)
Foreign-key constraints:
    "enhydris_timeseriesrecord_timeseries_fk" FOREIGN KEY (timeseries_id) REFERENCES enhydris_timeseries(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
    ts_insert_blocker BEFORE INSERT ON enhydris_timeseriesrecord FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
Number of child tables: 1374 (Use \d+ to list them.)


Update: the EXPLAIN plan

Answer 1

Bla*_*ski 5

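The database has to go to the sub-index of every chunk and look up the latest timestamp for timeseries_id=x. The database is using the indexes correctly (as you can see from the EXPLAIN output): it does an index scan, not a full scan, on each sub-index in each chunk. So it performs >1000 index scans. No chunks can be pruned, because the planner cannot know which chunks contain entries for that specific timeseries_id.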
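
To make the per-chunk behaviour visible, the query can be run under EXPLAIN, and a time bound can be added so that the planner has something to exclude chunks on. This is only a sketch: the one-year window is an assumed value that is not part of the answer above, and the second query returns a row only if the series actually has data inside that window.

```sql
-- Original query: the plan should show one index scan per chunk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT timeseries_id, "timestamp"
FROM enhydris_timeseriesrecord
WHERE timeseries_id = 6661
ORDER BY "timestamp" DESC
LIMIT 1;

-- Assumption: timeseries 6661 has at least one record in the last year.
-- The extra predicate on "timestamp" gives the planner a time range, so
-- chunks lying entirely outside it can be excluded instead of scanned.
EXPLAIN (ANALYZE, BUFFERS)
SELECT timeseries_id, "timestamp"
FROM enhydris_timeseriesrecord
WHERE timeseries_id = 6661
  AND "timestamp" > now() - INTERVAL '1 year'
ORDER BY "timestamp" DESC
LIMIT 1;
```

Comparing the number of chunk scans in the two plans shows how much of the 1-2 seconds is spent visiting chunks that cannot contain the answer.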

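Also, you have about 1300 chunks for only 66m records, which is roughly 50k rows per chunk. That is far too few rows per chunk. The Timescale docs give the following recommendation:

The key property of choosing the time interval is that the chunk (including indexes) belonging to the most recent interval (or chunks if using space partitions) fit into memory. As such, we typically recommend setting the interval so that these chunk(s) comprise no more than 25% of main memory.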

https://docs.timescale.com/latest/using-timescaledb/hypertables#best-practices

Reducing the number of chunks will significantly improve query performance.
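
A minimal sketch of how the chunk count might be inspected and reduced. The one-year interval and the table name enhydris_timeseriesrecord_new are assumptions made for illustration; the interval should really be chosen from the memory guideline quoted above. Note that set_chunk_time_interval only affects chunks created afterwards, so existing data keeps its current chunks unless it is copied into a new hypertable.

```sql
-- How many chunks does the hypertable have right now?
SELECT count(*) FROM show_chunks('enhydris_timeseriesrecord');

-- Use a larger interval for chunks created from now on.
-- Existing chunks are NOT rewritten by this call.
SELECT set_chunk_time_interval('enhydris_timeseriesrecord', INTERVAL '1 year');

-- To shrink the number of existing chunks, one option is to copy the
-- data into a new hypertable (hypothetical name) with the larger interval.
CREATE TABLE enhydris_timeseriesrecord_new
    (LIKE enhydris_timeseriesrecord
     INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);

SELECT create_hypertable(
    'enhydris_timeseriesrecord_new',
    'timestamp',
    chunk_time_interval => INTERVAL '1 year'
);

INSERT INTO enhydris_timeseriesrecord_new
SELECT * FROM enhydris_timeseriesrecord;
```

After verifying the copy, the old table could be dropped and the new one renamed. The foreign key to enhydris_timeseries is not copied by LIKE and would have to be recreated by hand.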

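Additionally, you might get even better query performance by using TimescaleDB compression, which will further reduce the number of chunks that need to be scanned; you could segment by timeseries_id (https://docs.timescale.com/latest/api#compression). Alternatively, you could define a continuous aggregate that holds the last item per timeseries_id (https://docs.timescale.com/latest/api#continuous-aggregates).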
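
Both suggestions could look roughly like the following. This sketch assumes TimescaleDB 2.x syntax (1.x releases use different names for the policy function and for creating the view). The 30-day and one-day intervals, the policy offsets, and the view name enhydris_timeseries_last_daily are assumptions made for illustration.

```sql
-- Option 1: enable compression, segmented by timeseries_id.
ALTER TABLE enhydris_timeseriesrecord SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'timeseries_id'
);

-- Compress chunks once they are older than 30 days (assumed value).
SELECT add_compression_policy('enhydris_timeseriesrecord', INTERVAL '30 days');

-- Option 2: a continuous aggregate holding the latest timestamp
-- per timeseries_id and per day (hypothetical view name).
CREATE MATERIALIZED VIEW enhydris_timeseries_last_daily
WITH (timescaledb.continuous) AS
SELECT timeseries_id,
       time_bucket(INTERVAL '1 day', "timestamp") AS bucket,
       max("timestamp") AS last_timestamp
FROM enhydris_timeseriesrecord
GROUP BY timeseries_id, bucket;

-- Keep the aggregate up to date (offsets are assumed values).
SELECT add_continuous_aggregate_policy(
    'enhydris_timeseries_last_daily',
    start_offset      => INTERVAL '3 days',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);

-- The last timestamp is then read from the much smaller aggregate.
SELECT max(last_timestamp)
FROM enhydris_timeseries_last_daily
WHERE timeseries_id = 6661;
```

The aggregate trades freshness for speed: records newer than the last refresh may not be reflected, depending on the version and on whether real-time aggregation is enabled.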

Asked:	5 years, 8 months ago
Viewed:	1372 times
Last activity:	5 years, 8 months ago