How to speed up queries on geolocation data

xan*_*ngr 5 postgresql index spatial gist-index

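I have a table with 10,301,390 rows of GPS records, cities, countries, and IP address blocks. I have the user's current location as latitude and longitude. I wrote this query: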

SELECT
  *, point(45.1013021, 46.3021011) <@> point(latitude, longitude) :: point AS distance
FROM
  locs
WHERE
  (
    point(45.1013021, 46.3021011) <@> point(latitude, longitude)
  ) < 10 -- radius
ORDER BY
  distance LIMIT 1;

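This query returns what I want, but it is slow: it takes 2 to 3 seconds to get one record for a given latitude and longitude.

I tried a B-tree index on the latitude and longitude columns, and I also tried GIST(point(latitude, longitude)), but the query is still slow.

How can I speed up this query?

Update:

The slowness seems to be caused by the ORDER BY, but I want the row with the shortest distance, so the problem remains.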

joa*_*olo 10

You could consider a GiST index built on the function ll_to_earth. Such an index allows fast "nearby" searches.

CREATE INDEX 
   ON locs USING gist (ll_to_earth(lat, lng));
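The statement above uses the column names lat and lng, while the table in the question uses latitude and longitude. A minimal sketch adapted to the question's columns might look like the following; the index name locs_ll_to_earth_idx is hypothetical, and the sketch assumes both columns are numeric types that ll_to_earth accepts. The planner only considers an expression index when the query repeats the same expression, so the WHERE clause would need to call ll_to_earth on the same columns.

```sql
-- Hypothetical index name; column names taken from the question's table.
CREATE INDEX locs_ll_to_earth_idx
    ON locs USING gist (ll_to_earth(latitude, longitude));

-- Refresh planner statistics after building the index.
ANALYZE locs;
```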

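Once you have this index, your query needs to be written differently.

Your (lat, lng) pair needs to be converted to the earth type and compared against the indexed values (which are of the same type). The query needs two conditions: one that gives an "approximate" result and one that gives the "exact" result. The first one can use the index created above: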

SELECT
    *
FROM
    locs
WHERE
    /* First condition: search for points at an approximate distance,
       i.e. inside a bounding 'box' rather than a 'circle'.
       This first condition will use the index.
       (45.1013021, 46.3021011) = (lat, lng) of search center.
       25000 = search radius (in m)
    */
    earth_box(ll_to_earth(45.1013021, 46.3021011), 25000) @> ll_to_earth(lat, lng) 

    /* This second condition (which is slower) will "refine"
       the previous search, to include only the points within the
       circle.
    */
    AND earth_distance(ll_to_earth(45.1013021, 46.3021011), 
             ll_to_earth(lat, lng)) < 25000 ;
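The query above returns every row inside the radius, whereas the question asks for the single closest row together with its distance. One way to get that, as a sketch that keeps the same column names (lat, lng) and the same 25000 m radius, is to expose earth_distance as an output column and sort on it. The alias distance_m is an assumption made for illustration. Because the box condition has already cut the candidate set down to a handful of rows, the sort should be cheap. Note also that earth_distance returns meters, while the <@> operator used in the question returns statute miles and expects point(longitude, latitude), so the two radii are not directly comparable.

```sql
SELECT
    l.*,
    -- Great-circle distance in meters from the search center.
    earth_distance(ll_to_earth(45.1013021, 46.3021011),
                   ll_to_earth(l.lat, l.lng)) AS distance_m
FROM
    locs AS l
WHERE
    -- Index-assisted bounding-box filter.
    earth_box(ll_to_earth(45.1013021, 46.3021011), 25000) @> ll_to_earth(l.lat, l.lng)
    -- Exact filter on the remaining candidates.
    AND earth_distance(ll_to_earth(45.1013021, 46.3021011),
                       ll_to_earth(l.lat, l.lng)) < 25000
ORDER BY
    distance_m
LIMIT 1;
```

If no row lies within the radius, this returns nothing, so the caller may want to retry with a larger radius.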

To use this code, you need two extensions (included in most PostgreSQL distributions):

CREATE EXTENSION IF NOT EXISTS cube ;
CREATE EXTENSION IF NOT EXISTS earthdistance;
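To confirm that both extensions are actually installed in the current database, a quick look at the pg_extension system catalog should be enough. This is only a sanity check, not something the rest of the answer depends on.

```sql
-- Lists the two extensions if they are installed in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('cube', 'earthdistance')
ORDER BY extname;
```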

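Here is their documentation:

  • cube. You should look at the description of the @> operator. The next module requires this one.
  • earthdistance. Here you will find information about earth_box and earth_distance. This module assumes that the Earth is perfectly spherical, which is a good enough approximation for most applications.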
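To get a feel for the spherical model and the units involved, a small sanity check such as the following can help. It reads the radius the module assumes via earth() and measures one degree of longitude along the equator, which should come out at roughly 111 km. The column aliases are made up for illustration.

```sql
SELECT
    -- Radius of the sphere assumed by earthdistance, in meters.
    earth() AS earth_radius_m,
    -- Distance between (lat 0, lng 0) and (lat 0, lng 1), in meters.
    earth_distance(ll_to_earth(0, 0), ll_to_earth(0, 1)) AS one_degree_m;
```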

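A test on a table of 2.2 million rows taken from the Free World Cities Database gave me the following result for the previous query (the data is not exactly the same as yours):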

"ru","andra-ata","Andra-Ata","24",,44.9509,46.3327
"ru","andratinskiy","Andratinskiy","24",,44.9509,46.3327
"ru","chernozemelskaya","Chernozemelskaya","24",,44.9821,46.0622
"ru","gayduk","Gayduk","24",,44.9578,46.5244
"ru","imeni beriya","Imeni Beriya","24",,45.0208,46.3906
"ru","imeni kirova","Imeni Kirova","24",,45.2836,46.4847
"ru","kumskiy","Kumskiy","24",,44.9821,46.0622
"ru","kumskoy","Kumskoy","24",,44.9821,46.0622
"ru","lopas","Lopas","17",,44.937,46.1833
"ru","pyatogo dekabrya","Pyatogo Dekabrya","24",,45.1858,46.1656
"ru","svetlyy erek","Svetlyy Erek","24",,45.0079,46.4408
"ru","ulan tuk","Ulan Tuk","24",,45.1542,46.1097

To give an order-of-magnitude idea of the timing: pgAdmin III reports that this answer took 22 ms to retrieve. (PostgreSQL 9.6.1 with out-of-the-box parameters, on a Mac running Mac OS 10.12, Core i7, SSD.)
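To check the timing on your own data, and to confirm that the GiST index is really being used, you could wrap the query in EXPLAIN. The sketch below assumes the same table, column names, and radius as the query above. In the output, an index scan or bitmap index scan on the ll_to_earth index (rather than a Seq Scan on locs) indicates that the box condition is index-assisted.

```sql
-- ANALYZE actually runs the query and reports real timings;
-- BUFFERS adds shared-buffer hit/read counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    *
FROM
    locs
WHERE
    earth_box(ll_to_earth(45.1013021, 46.3021011), 25000) @> ll_to_earth(lat, lng)
    AND earth_distance(ll_to_earth(45.1013021, 46.3021011),
                       ll_to_earth(lat, lng)) < 25000;
```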