Daw*_*wan 7 performance index sql-server optimization spatial query-performance
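I'm still new to database administration, and I'm trying to optimize a search query.
I have a query that looks like this; in some cases it takes 5-15 seconds to execute and also drives CPU usage to 100%: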
DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326);
SELECT TOP (1)
[Result].[PointId] AS [PointId],
[Result].[PointName] AS [PointName],
[Result].[LegendTypeId] AS [LegendTypeId],
[Result].[GeoPoint] AS [GeoPoint]
FROM (
SELECT
[Extent1].[GeoPoint].STDistance(@point) AS distance,
[Extent1].[PointId] AS [PointId],
[Extent1].[PointName] AS [PointName],
[Extent1].[LegendTypeId] AS [LegendTypeId],
[Extent1].[GeoPoint] AS [GeoPoint]
FROM [dbo].[GeographyPoint] AS [Extent1]
WHERE 18 = [Extent1].[LegendTypeId]
) AS [Result]
ORDER By [Result].distance ASC
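The table has a clustered index on the PK and a spatial index on the geography column.
When I execute the query above, it performs a scan.
So I created a nonclustered index on the LegendTypeId column: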
CREATE NONCLUSTERED INDEX [GeographyPoint_LegendType_NonClustered] ON [dbo].[GeographyPoint]
(
[LegendTypeId] ASC
)
INCLUDE ( [PointId],
[PointName],
[GeoPoint])
WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
And changed the query to:
DECLARE @point geography;
SET @point = geography::STPointFromText('POINT({0} {1})', 4326);
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[PointId],
[PointName],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
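Now SQL Server performs a seek instead of a scan:
It seemed to me that this made the query more efficient, but when I deployed it to production I still got the same result (high CPU usage, and an average execution time of 10 seconds).
Note: no data is inserted into, updated in, or deleted from this table; it is only searched/read.
Am I doing something wrong?
How can I fix this?
Edit
Index Seek details
Edit 2:
I changed the query to use the "nearest neighbor" approach from this link: https://msdn.microsoft.com/en-us/library/ff929109.aspx. Now this is the result; this query also takes 3-5 seconds to search, similar to the second query (but it has not been tested in production).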
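For reference, the linked documentation lists conditions a query has to satisfy before the optimizer can treat it as a nearest-neighbor query against a spatial index. The sketch below annotates those conditions against the table from this question. It is an illustration rather than a guaranteed fix, and it assumes the database compatibility level is 110 or higher.

```sql
DECLARE @point geography = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326);

SELECT TOP (1)                                   -- TOP is required, without PERCENT
    [PointId],
    [PointName],
    [LegendTypeId],
    [GeoPoint]
FROM [dbo].[GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL  -- NULL distances must be filtered out
  AND [LegendTypeId] = 18                        -- extra predicates must be joined with AND
ORDER BY [GeoPoint].STDistance(@point) ASC;      -- first ORDER BY expression is STDistance(), ascending
```

Meeting these conditions only makes the spatial index eligible; the optimizer may still choose a different index unless it is hinted.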
Spatial index settings:
CREATE SPATIAL INDEX [SPATIAL_Point] ON [dbo].[GeographyPoint]
(
[GeoPoint]
)USING GEOGRAPHY_GRID
WITH (GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM),
CELLS_PER_OBJECT = 16,
PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
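One way to judge whether these grid settings suit the data is the system procedure sp_help_spatial_geography_index, which reports how the index behaves for a sample query object. A minimal sketch, assuming the table and index names shown above and using the search point as the sample; the output is not reproduced here because it depends on the data.

```sql
DECLARE @sample geography = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326);

EXEC sp_help_spatial_geography_index
    @tabname       = 'dbo.GeographyPoint',
    @indexname     = 'SPATIAL_Point',
    @verboseoutput = 0,        -- 0 = core properties only, greater than 0 = all properties
    @query_sample  = @sample;
```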
Edit 3
Following @MickyT's instructions, I dropped the index on [LegendTypeId] and executed the following query:
DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326);
SELECT TOP (1)
[PointId],
[PointName],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint] WITH(INDEX(SPATIAL_Point))
WHERE
[GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
The statistics for this query are
Then I executed this query again:
DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326);
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[PointId],
[PointName],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint] --WITH(INDEX(SPATIAL_Point))
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
The statistics for this query are
I used the following setup to run some tests.
CREATE TABLE GeographyPoint (
ID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
GeoPoint GEOGRAPHY NOT NULL,
LegendTypeID INTEGER NOT NULL
);
INSERT INTO GeographyPoint (GeoPoint, LegendTypeID)
SELECT TOP 1000000
Geography::Point(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,4326),
CAST(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 25 AS INTEGER)
FROM Tally;
CREATE INDEX GP_IDX1 ON GeographyPoint(LegendTypeID) INCLUDE (ID, GeoPoint);
CREATE SPATIAL INDEX GP_SIDX ON GeographyPoint(GeoPoint) USING GEOGRAPHY_AUTO_GRID;
This gives a table of 1,000,000 random points spread over a 2 x 2 degree area.
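The INSERT above reads from a table named Tally that is not defined in this post. Any table with at least 1,000,000 rows would serve, and it has to exist before the setup is run. The sketch below builds one with a single hypothetical column N by cross joining a system view; this is one common way to generate rows, not necessarily what was used for the tests here.

```sql
-- Hypothetical helper table: only its row count matters to the INSERT above.
CREATE TABLE Tally (N INTEGER NOT NULL PRIMARY KEY);

INSERT INTO Tally (N)
SELECT TOP (1000000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;   -- the cross join supplies enough rows for TOP to cut from
```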
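After trying a few different options, the best performance I could get came from forcing it to use the spatial index. There are a couple of ways to do that: drop the index on LegendTypeID, or use a hint.
You need to decide which one best suits your situation. Personally, I don't like using index hints, and I would drop the other index if no other queries need it.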
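The timings quoted in the comments of the queries that follow are in the format SQL Server prints when time statistics are switched on for the session. A sketch of how such figures can be collected against the test table, with the caveat that absolute numbers depend on hardware and cache state:

```sql
SET STATISTICS TIME ON;   -- prints CPU and elapsed time per statement
SET STATISTICS IO ON;     -- prints logical and physical reads per table

DECLARE @point geography = geography::Point(1, 1, 4326);

SELECT TOP (1)
    [GeoPoint].STDistance(@point) AS distance,
    [ID],
    [LegendTypeId]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL
  AND [LegendTypeId] = 18
ORDER BY [GeoPoint].STDistance(@point) ASC
OPTION (MAXDOP 1);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The figures appear on the Messages tab in Management Studio, and the actual execution plan shows which index was chosen.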
The queries compared against each other:
DECLARE @point geography;
SET @point = geography::Point(1,1,4326);
/*
Clustered index scan (PK)
SQL Server Execution Times:
CPU time = 641 ms, elapsed time = 809 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
OPTION(MAXDOP 1)
/*
Index Seek NonClustered (GP_IDX1)
SQL Server Execution Times:
CPU time = 2250 ms, elapsed time = 2806 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
/*
For the next 2 queries
Clustered Index Seek (Spatial)
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 11 ms
*/
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint] WITH(INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)
-- With GP_IDX1 dropped, the same query is run again without the index hint
DROP INDEX GP_IDX1 ON [GeographyPoint];
SELECT TOP (1)
[GeoPoint].STDistance(@point) AS distance,
[ID],
[LegendTypeId],
[GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)