Improving a DbGeography query

Daw*_*wan 7 performance index sql-server optimization spatial query-performance

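I'm new to database administration, and I'm trying to optimize a search query.

I have a query that looks like this. In some cases it takes 5-15 seconds to execute, and it also drives CPU usage to 100%: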

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326); 

SELECT TOP (1)
     [Result].[PointId] AS [PointId], 
     [Result].[PointName] AS [PointName], 
     [Result].[LegendTypeId] AS [LegendTypeId], 
     [Result].[GeoPoint] AS [GeoPoint]
FROM ( 
    SELECT 
        [Extent1].[GeoPoint].STDistance(@point) AS distance, 
        [Extent1].[PointId] AS [PointId], 
        [Extent1].[PointName] AS [PointName], 
        [Extent1].[LegendTypeId] AS [LegendTypeId], 
        [Extent1].[GeoPoint] AS [GeoPoint]
    FROM [dbo].[GeographyPoint] AS [Extent1]
    WHERE 18 = [Extent1].[LegendTypeId] 
)  AS [Result]
ORDER By [Result].distance ASC

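The table has a clustered index on the PK and a spatial index on the geography-type column.

When I execute the query above, it performs a scan operation.

So I created a nonclustered index on the LegendTypeId column: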

CREATE NONCLUSTERED INDEX [GeographyPoint_LegendType_NonClustered] ON [dbo].[GeographyPoint]
(
    [LegendTypeId] ASC
)
INCLUDE (   [PointId],
    [PointName],
    [GeoPoint]) 
    WITH (PAD_INDEX = OFF, 
    STATISTICS_NORECOMPUTE = OFF,
    SORT_IN_TEMPDB = OFF, 
    DROP_EXISTING = OFF,
    ONLINE = OFF,
    ALLOW_ROW_LOCKS = ON, 
    ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

And I changed the query to:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT({0} {1})', 4326); 

 SELECT TOP (1) 
     [GeoPoint].STDistance(@point) AS distance, 
     [PointId], 
     [PointName],
     [LegendTypeId], 
     [GeoPoint]
     FROM [GeographyPoint]
 WHERE 18 = [LegendTypeId]
 ORDER By distance ASC

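Now SQL Server performs a seek instead of a scan.

I assumed this would make the query more efficient, but when I deployed it to production I still got the same result (high CPU usage, and an average query execution time of 10 seconds).

Note: no data is ever inserted, updated, or deleted in this table; it is only searched/read.

  1. Am I doing something wrong?

  2. How can I fix this?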

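Edit

Index Seek details:

[image]

Edit 2:

I changed the query to use the "nearest neighbor" approach from this link: https://msdn.microsoft.com/en-us/library/ff929109.aspx. This is the result now. This query also takes 3-5 seconds to search, similar to the second query (but I have not tested it in production):

[image]

Spatial index settings: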

CREATE SPATIAL INDEX [SPATIAL_Point] ON [dbo].[GeographyPoint]
(
    [GeoPoint]
) USING GEOGRAPHY_GRID
WITH (GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
    CELLS_PER_OBJECT = 16,
    PAD_INDEX = OFF,
    STATISTICS_NORECOMPUTE = OFF,
    SORT_IN_TEMPDB = OFF,
    DROP_EXISTING = OFF,
    ONLINE = OFF,
    ALLOW_ROW_LOCKS = ON,
    ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Edit 3:

Following @MickyT's instructions, I dropped the index on [LegendTypeId] and executed the following query:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

SELECT TOP (1) 

    [PointId],
    [PointName],
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(SPATIAL_Point))
WHERE 
   [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

The statistics for this query are:

[image]

Then I executed this query again:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

 SELECT TOP (1) 
     [GeoPoint].STDistance(@point) AS distance, 
     [PointId], 
     [PointName],
     [LegendTypeId], 
     [GeoPoint]
     FROM [GeographyPoint] --WITH(INDEX(SPATIAL_Point))
 WHERE 18 = [LegendTypeId]
 ORDER By distance ASC

The statistics for this query are:

[image]

Mic*_*kyT 2

I used the following setup to run some tests.

CREATE TABLE GeographyPoint (
    ID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
    GeoPoint GEOGRAPHY NOT NULL,
    LegendTypeID INTEGER NOT NULL
    );

-- Tally is assumed to be an existing numbers table with at least 1,000,000 rows.
INSERT INTO GeographyPoint (GeoPoint, LegendTypeID)
SELECT TOP 1000000 
    Geography::Point(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,4326),
    CAST(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 25 AS INTEGER)
FROM Tally;

CREATE INDEX GP_IDX1 ON GeographyPoint(LegendTypeID) INCLUDE (ID, GeoPoint);
CREATE SPATIAL INDEX GP_SIDX ON GeographyPoint(GeoPoint) USING GEOGRAPHY_AUTO_GRID;

This gives a table of 1,000,000 random points spread over a 2 degree by 2 degree area.
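
The setup script reads from a table named Tally whose definition is not shown. As a minimal sketch, assuming Tally is just a numbers table used to generate rows, one way to build it is shown below. The column name N and the cross join of sys.all_objects are illustrative assumptions, not part of the original setup.

```sql
-- Hypothetical helper: a numbers table with 1,000,000 rows.
-- The table name matches the setup script; the column name N is an assumption.
CREATE TABLE Tally (
    N INTEGER NOT NULL PRIMARY KEY
    );

-- Cross joining a system catalog view with itself is a common way
-- to produce enough rows to number; TOP caps the count at 1,000,000.
INSERT INTO Tally (N)
SELECT TOP (1000000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
```

Only the row count of Tally matters to the setup script, because its SELECT list never references a Tally column.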
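
After trying a few different options, the best performance I could get was by forcing it to use the spatial index. There are a couple of ways to do this: drop the index on LegendTypeID, or use a hint.

You will need to decide which one is best for your situation. Personally, I don't like using index hints, and I would drop the other index if no other queries need it.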
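
Whether other queries still need the nonclustered index can be estimated before dropping it. The following is a rough sketch, assuming the table lives in the dbo schema of the current database. It reads the sys.dm_db_index_usage_stats view, whose counters are reset when the SQL Server instance restarts, so a low count is only meaningful after a representative period of uptime.

```sql
-- List each index on the table with its usage counters since the last restart.
-- A LEFT JOIN is used because an index that has never been touched has no row
-- in the usage view, so its counters show as NULL.
SELECT
    i.name AS index_name,
    i.type_desc,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.last_user_seek,
    s.last_user_scan
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'dbo.GeographyPoint');
```

If the index on LegendTypeID shows seeks or scans that come from queries other than this one, a hint may be the safer of the two options.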

The queries compared against each other:

DECLARE @point geography;
SET @point = geography::Point(1,1,4326); 
/*
Clustered index scan (PK)
 SQL Server Execution Times:
   CPU time = 641 ms,  elapsed time = 809 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
OPTION(MAXDOP 1)
/*
Index Seek NonClustered (GP_IDX1)
 SQL Server Execution Times:
   CPU time = 2250 ms,  elapsed time = 2806 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

/*
For the next 2 queries
Clustered Index Seek (Spatial)
 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 11 ms
*/
SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

DROP INDEX GP_IDX1 ON [GeographyPoint]

SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [ID], 
    [LegendTypeId], 
    [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
    18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
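
The timing comments above have the shape of the output SQL Server prints when SET STATISTICS TIME is ON. As a minimal sketch, assuming the test table and the GP_SIDX spatial index from the setup exist, timings for the hinted query could be captured as follows. SET STATISTICS IO is an addition that is not mentioned above; it reports page reads, which helps when comparing plans.

```sql
-- Print CPU and elapsed time, plus logical reads, to the Messages output.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @point geography = geography::Point(1, 1, 4326);

-- Nearest-neighbor form: STDistance appears in the WHERE clause
-- (filtering out NULLs) and is the first ORDER BY expression, ascending.
SELECT TOP (1)
    [GeoPoint].STDistance(@point) AS distance,
    [ID],
    [LegendTypeId],
    [GeoPoint]
FROM [GeographyPoint] WITH (INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL
    AND 18 = [LegendTypeId]
ORDER BY [GeoPoint].STDistance(@point) ASC
OPTION (MAXDOP 1);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Running each variant several times and discarding the first run reduces the effect of a cold buffer cache on the numbers.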