Optimizing ST_Intersects in PostgreSQL (PostGIS)

Sac*_*ina 6 django postgresql postgis query-optimization postgresql-9.3

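The query below takes about 15 minutes to return results. I would like to know why. Is it the data? Or the number of vertices in the geometries? When I tried the query against a different table (loaded from a small shapefile), it ran fast.

Here is the query (thanks to Patrick):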

WITH hi AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali
  FROM evidensapp_polystructures ps
  JOIN evidensapp_seniangcbr fh ON fh.hazard = 'High'
                                 AND ST_Intersects(fh.geom, ps.geom)
), med AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali
  FROM evidensapp_polystructures ps
  JOIN evidensapp_seniangcbr fh ON fh.hazard = 'Medium'
                                 AND ST_Intersects(fh.geom, ps.geom)
  EXCEPT SELECT * FROM hi
), low AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali
  FROM evidensapp_polystructures ps
  JOIN evidensapp_seniangcbr fh ON fh.hazard = 'Low'
                                 AND ST_Intersects(fh.geom, ps.geom)
  EXCEPT SELECT * FROM hi
  EXCEPT SELECT * FROM med
)
SELECT brgy_locat AS barangay, municipali AS municipality, high, medium, low
FROM (SELECT brgy_locat, municipali, count(*) AS high
      FROM hi
      GROUP BY 1, 2) cnt_hi
FULL JOIN (SELECT brgy_locat, municipali, count(*) AS medium
      FROM med
      GROUP BY 1, 2) cnt_med USING (brgy_locat, municipali)
FULL JOIN (SELECT brgy_locat, municipali, count(*) AS low
      FROM low
      GROUP BY 1, 2) cnt_low USING (brgy_locat, municipali);

PostgreSQL 9.3, PostGIS 2.1.5

Polystructures: contains 9847 rows:

CREATE TABLE evidensapp_polystructures (
  id serial NOT NULL PRIMARY KEY,
  bldg_name character varying(100) NOT NULL,
  bldg_type character varying(50) NOT NULL,
  brgy_locat character varying(50) NOT NULL,
  municipali character varying(50) NOT NULL,
  province character varying(50) NOT NULL,
  geom geometry(MultiPolygon,32651)
);

CREATE INDEX evidensapp_polystructures_geom_id
  ON evidensapp_polystructures USING gist (geom);
ALTER TABLE evidensapp_polystructures CLUSTER ON evidensapp_polystructures_geom_id;

SeniangCBR: only 6 rows; shapefile size (in case it matters): 52,060 KB

CREATE TABLE evidensapp_seniangcbr (
  id serial NOT NULL PRIMARY KEY,
  hazard character varying(16) NOT NULL,
  geom geometry(MultiPolygon,32651)
);

CREATE INDEX evidensapp_seniangcbr_geom_id ON evidensapp_seniangcbr USING gist (geom);
ALTER TABLE evidensapp_seniangcbr CLUSTER ON evidensapp_seniangcbr_geom_id;

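All data was loaded into the database automatically with the LayerMapping utility, since I am using Django (GeoDjango).

The EXPLAIN ANALYZE output is linked here.

I don't have a server at the moment; I am running the query on my PC.

  • Processor: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz (8 CPUs), ~3.6GHz
  • Memory: 8192MB RAM
  • OS: Windows 7 64-bit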

Erw*_*ter 1

Similar to what I suggested and explained under your related question, I would use UNION ALL instead of FULL JOIN in the outer SELECT:

WITH hi AS (
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr     fh
   JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
   WHERE  fh.hazard = 'High'
   GROUP  BY 1, 2, 3
   )
, med AS (
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr     fh
   JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
   LEFT   JOIN hi USING (brgy_locat, municipali)
   WHERE  fh.hazard = 'Medium'
   AND    hi.brgy_locat IS NULL
   GROUP  BY 1, 2, 3
   )
TABLE hi
UNION ALL
TABLE med
UNION ALL
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr     fh
   JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
   LEFT   JOIN hi  USING (brgy_locat, municipali)
   LEFT   JOIN med USING (brgy_locat, municipali)
   WHERE  fh.hazard = 'Low'
   AND    hi.brgy_locat IS NULL
   AND    med.brgy_locat IS NULL
   GROUP BY 1, 2, 3;

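This considers only the highest hazard level for each set of rows with the same (brgy_locat, municipali). The result contains only rows that actually intersect some row of the hazard level in question in evidensapp_seniangcbr. Likewise, the counts only cover rows that actually intersect. There may be more rows with the same (brgy_locat, municipali) in evidensapp_polystructures that simply do not intersect the same hazard level and are therefore ignored.

Pick one of the standard techniques to exclude, at the lower levels, rows for which a match has already been found at a higher hazard level.

LEFT JOIN / IS NULL should be able to use an index on id and perform well here. It is certainly faster than the EXCEPT approach, which compares whole rows and cannot use an index.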
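For comparison, NOT EXISTS is another of those standard exclusion techniques. The following is only a sketch of how the medium level could be written that way, reusing the table and column names from the question; it has not been run against your data, and whether it beats LEFT JOIN / IS NULL depends on the plan PostgreSQL picks.

```sql
-- Sketch: medium-level counts, excluding (brgy_locat, municipali) groups
-- already present at the high level, using NOT EXISTS as the anti-join.
WITH hi AS (
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr     fh
   JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
   WHERE  fh.hazard = 'High'
   GROUP  BY 1, 2, 3
   )
SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
FROM   evidensapp_seniangcbr     fh
JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
WHERE  fh.hazard = 'Medium'
AND    NOT EXISTS (
   SELECT 1
   FROM   hi
   WHERE  hi.brgy_locat = ps.brgy_locat
   AND    hi.municipali = ps.municipali
   )
GROUP  BY 1, 2, 3;
```

Since brgy_locat and municipali are declared NOT NULL, both forms should exclude the same groups.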

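Index

There is no need to add a bounding_box geometry column to the table, as another answer suggests. Modern versions of PostGIS use (index-supported) bounding box comparisons automatically. The PostGIS documentation:

This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries.

In fact, we already see index scans in the EXPLAIN output you posted.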
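To check on your own installation whether the GiST index is picked up for a single hazard level, something like the following could be run. It is a sketch that reuses the names from the question; the plan and timings will depend on your data and settings.

```sql
-- Sketch: inspect the plan and actual timing of the spatial join alone.
-- BUFFERS additionally reports shared-buffer hits and reads per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM   evidensapp_seniangcbr     fh
JOIN   evidensapp_polystructures ps ON ST_Intersects(fh.geom, ps.geom)
WHERE  fh.hazard = 'High';
```

If the plan shows an index scan on evidensapp_polystructures_geom_id but the node is still slow, the time is probably going into the exact intersection test on the large hazard polygons rather than into the bounding box prefilter.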

Your existing GiST index evidensapp_polystructures_geom_id should help speed up the query.
Aside: the index should probably be named evidensapp_polystructures_geom_idx.

Also, create an index on (brgy_locat, municipali) if you don't have one yet:

CREATE INDEX foo_idx ON evidensapp_polystructures (brgy_locat, municipali);

Alternative with LATERAL joins

Since there are only 6 rows in evidensapp_seniangcbr, LATERAL joins may be faster:

WITH hi AS (
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr fh
        , LATERAL (
      SELECT ps.brgy_locat, ps.municipali
      FROM   evidensapp_polystructures ps
      WHERE  ST_Intersects(fh.geom, ps.geom)
      ) ps
   WHERE  fh.hazard = 'High'
   GROUP  BY 1, 2, 3
   )
, med AS (
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr fh
        , LATERAL (
      SELECT ps.brgy_locat, ps.municipali
      FROM   evidensapp_polystructures ps
      LEFT   JOIN hi USING (brgy_locat, municipali)
      WHERE  hi.brgy_locat IS NULL
      AND    ST_Intersects(fh.geom, ps.geom)
      ) ps
   WHERE  fh.hazard = 'Medium'
   GROUP  BY 1, 2, 3
   )
TABLE hi
UNION ALL
TABLE med
UNION ALL
   SELECT ps.brgy_locat, ps.municipali, fh.hazard, count(*) AS ct
   FROM   evidensapp_seniangcbr fh
        , LATERAL (
      SELECT ps.id, ps.brgy_locat, ps.municipali
      FROM   evidensapp_polystructures ps
      LEFT   JOIN hi  USING (brgy_locat, municipali)
      LEFT   JOIN med USING (brgy_locat, municipali)
      WHERE  hi.brgy_locat IS NULL
      AND    med.brgy_locat IS NULL
      AND    ST_Intersects(fh.geom, ps.geom)
      ) ps
   WHERE  fh.hazard = 'Low'
   GROUP  BY 1, 2, 3;

About LATERAL joins: