I have two MySQL queries, run one after the other, that are both very fast:
Query 1
SELECT Ads.AdId FROM Ads, AdsGeometry WHERE
AdsGeometry.AdId = Ads.AdId AND
(ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint))
GROUP BY Ads.AdId
This query runs in 0.0013 seconds and returns 4 rows.
Query 2
SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId
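This query runs in 0.0094 seconds and returns 67 rows (3 of which are the same as in the query above).
I tried to merge the two queries into one because the two result sets later have to be sorted together, and I would like MySQL to do the sorting. Here is what I tried, followed by its EXPLAIN output: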
SELECT Ads.AdId FROM Ads, AdsHierarchy, AdsGeometry WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsGeometry.AdId = Ads.AdId AND (
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint) OR
AdsHierarchy.locations_LocationId = 148022797
)
GROUP BY Ads.AdId
id  select_type  table         type    possible_keys                               key               key_len  ref                      rows    Extra
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1   SIMPLE       AdsGeometry   ALL     PRIMARY,GeomPoint,sx_adsgeometry_geompoint  NULL              NULL     NULL                     682848  Using temporary; Using filesort
1   SIMPLE       Ads           eq_ref  PRIMARY                                     PRIMARY           4        dbname.AdsGeometry.AdId  1       Using where; Using index
1   SIMPLE       AdsHierarchy  ref     Ads_AdsHierarchy,locations_LocationId       Ads_AdsHierarchy  4        dbname.Ads.AdId          1       Using where
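Although this query returns the correct result set (68 rows), it takes 6.5937 seconds to run. If I read the EXPLAIN correctly, AdsHierarchy does not use its locations_LocationId index, and AdsGeometry does not use any index at all (full scan, key is NULL).
Is there a way to merge the two queries (possibly with more locations, or more of these polygon-based conditions) and still keep a reasonable run time?
Thanks!
EDIT: some information about the indexes of the 3 tables
The AdsGeometry table is MyISAM; its primary key is AdId.
The result of SHOW INDEXES FROM AdsGeometry is: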
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AdsGeometry 0 PRIMARY 1 AdId A 682848 NULL NULL BTREE
AdsGeometry 1 Latitude 1 Latitude A NULL NULL NULL BTREE
AdsGeometry 1 Longitude 1 Longitude A NULL NULL NULL BTREE
AdsGeometry 1 GeomPoint 1 GeomPoint A NULL 32 NULL SPATIAL
AdsGeometry 1 sx_adsgeometry_geompoint 1 GeomPoint A NULL 32 NULL SPATIAL
AdsGeometry 1 Latitude_2 1 Latitude A NULL NULL NULL BTREE
AdsGeometry 1 Latitude_2 2 Longitude A NULL NULL NULL BTREE
The AdsHierarchy table is InnoDB; its primary key is AdsHierarchyId.
The result of SHOW INDEXES FROM AdsHierarchy is:
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AdsHierarchy 0 PRIMARY 1 AdsHierarchyId A 2479044 NULL NULL BTREE
AdsHierarchy 1 Ads_AdsHierarchy 1 ads_AdId A 2479044 NULL NULL BTREE
AdsHierarchy 1 locations_LocationId 1 locations_LocationId A 123952 NULL NULL BTREE
The Ads table is InnoDB; its primary key is AdId.
The result of SHOW INDEXES FROM Ads is:
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Ads 0 PRIMARY 1 AdId A 705411 NULL NULL BTREE
Ads 1 Accounts_Ads 1 accounts_AccountId A 2 NULL NULL BTREE
Ads 1 Ads_Locations 1 locations_LocationId A 88176 NULL NULL BTREE
Ads 1 Categories_Ads 1 categories_CategoryId A 16 NULL NULL BTREE
Ads 1 Currencies_Ads 1 currencies_Currency A 2 NULL NULL BTREE
Ads 1 countries_CountryId 1 countries_CountryId A 204 NULL NULL BTREE
Ads 1 ExternalId 1 ExternalId A 705411 NULL NULL BTREE
Ads 1 ExternalId 2 accounts_AccountId A 705411 NULL NULL BTREE
Ads 1 xml_XMLId 1 xml_XMLId A 4 NULL NULL BTREE
Ads 1 streets_StreetId 1 streets_StreetId A 2 NULL NULL YES BTREE
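EDIT 2: the query rewritten with explicit joins, and its EXPLAIN
Here is the query rewritten to use explicit JOIN syntax; it still runs very slowly (5.503 seconds):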
SELECT a.AdId FROM Ads AS a
JOIN AdsHierarchy AS ah ON a.AdId = ah.ads_AdId
JOIN AdsGeometry AS ag ON a.AdId = ag.AdId
WHERE
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), ag.GeomPoint)
OR ah.locations_LocationId = 148022797
GROUP BY a.AdId
id  select_type  table  type    possible_keys                               key               key_len  ref                rows    Extra
-----------------------------------------------------------------------------------------------------------------------------------------------------
1   SIMPLE       a      index   PRIMARY                                     PRIMARY           4        NULL               627853  Using index
1   SIMPLE       ag     eq_ref  PRIMARY,GeomPoint,sx_adsgeometry_geompoint  PRIMARY           8        micasa_dev.a.AdId  1       Using index condition
1   SIMPLE       ah     ref     Ads_AdsHierarchy,locations_LocationId       Ads_AdsHierarchy  4        micasa_dev.a.AdId  1       Using where
EDIT 3: trying a UNION of the two queries
I also tried the UNION approach suggested by @RobertKoch.
The following UNION query runs very fast (0.06 seconds):
SELECT Ads.AdId FROM Ads, AdsGeometry
WHERE
AdsGeometry.AdId = Ads.AdId AND
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION
SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId
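I still cannot use this approach as it stands, because later I need to sort the combined result set of the two queries by a column of the Ads table.
If I try the following, the query becomes very slow again (3.7 seconds):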
SELECT Ads.AdId FROM Ads WHERE Ads.AdId IN (
SELECT Ads.AdId FROM Ads, AdsGeometry
WHERE
AdsGeometry.AdId = Ads.AdId AND
ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION
SELECT Ads.AdId FROM Ads, AdsHierarchy WHERE
Ads.AdId = AdsHierarchy.ads_AdId AND
AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId
) AND Ads.AdId > 100000
ORDER BY Ads.ModifiedDate ASC
EDIT 4: changing where the UNION sits seems to solve the problem
If I modify the UNION query above to:
SELECT Ads.AdId
FROM Ads,
(SELECT Ads.AdId
FROM Ads,
AdsGeometry
WHERE AdsGeometry.AdId = Ads.AdId
AND ST_CONTAINS(GeomFromText('Polygon((
-4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018,
-4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882,
-4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563
))'), AdsGeometry.GeomPoint)
GROUP BY Ads.AdId
UNION SELECT Ads.AdId
FROM Ads,
AdsHierarchy
WHERE Ads.AdId = AdsHierarchy.ads_AdId
AND AdsHierarchy.locations_LocationId = 148022797
GROUP BY Ads.AdId) AS nt
WHERE Ads.AdId = nt.AdId
AND Ads.AdId > 1000000
ORDER BY Ads.ModifiedDate ASC
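then the query runs fast again (~0.0007 seconds).
If there is no solution without UNION, I am willing to award the bounty to whoever can explain the difference between the two UNION versions (this one and the one in EDIT 3), and why the query is fast when written as in EDIT 4 but slow when written as in EDIT 3.
If any other information is needed, please ask in the comments and I will try to provide it. Thanks!
Note: I added an ORDER BY to both UNION queries to make it clearer that, although I select only AdId here, I still need other fields from the Ads table.
EDIT 5: at @bovko's request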
1 SIMPLE Ads index NULL countries_CountryId 2 NULL 627853 Using index; Using temporary
1 SIMPLE ag eq_ref PRIMARY PRIMARY 8 micasa_dev.Ads.AdId 1 Using where; Distinct
1 SIMPLE ah ref Ads_AdsHierarchy Ads_AdsHierarchy 4 micasa_dev.Ads.AdId 1 Using where; Distinct
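IN ( SELECT ... ) is usually inefficient. Avoid it.
All the answers so far are working harder than they need to. The JOINs before the UNION appear to be unnecessary. See more notes below.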
SELECT Ads.AdId
FROM Ads
JOIN (
( SELECT AdId
FROM AdsGeometry
WHERE ST_CONTAINS(GeomFromText('Polygon(( -4.9783515930176 36.627100703563,
-5.0075340270996 36.61222072018, -4.9896812438965 36.57638676015,
-4.965991973877 36.579419508882, -4.955005645752 36.617732160006,
-4.9783515930176 36.627100703563 ))'),
AdsGeometry.GeomPoint)
AND AdId > 1000000 )
UNION DISTINCT
( SELECT ads_AdId AS AdId
FROM AdsHierarchy
WHERE locations_LocationId = 148022797
AND ads_AdId > 1000000 )
) AS nt ON Ads.AdId = nt.AdId
ORDER BY Ads.ModifiedDate ASC
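Notes:
- AdsGeometry and AdsHierarchy both contain the ad id (under different names, AdId and ads_AdId), so there is no need to JOIN to Ads in the inner queries, except possibly to verify that the id exists in Ads. Is that an issue? Either way, the JOIN in the outer SELECT takes care of it.
- UNION DISTINCT is needed because the two SELECTs may fetch the same ids.
- The > 1000000 filter is moved inside to reduce the number of values gathered by the UNION.
- A UNION always (in older versions of MySQL) or sometimes (in newer versions) creates a temporary table. You are stuck with that.
- IN ( SELECT ... ) is usually optimized poorly; avoid it.
- The parentheses around the UNION make it clear what the ORDER BY belongs to.
- Ads is needed in the outer query for the ModifiedDate ordering. You could speed things up by dropping that requirement. (The UNION probably creates one temporary table; the ORDER BY probably creates another.)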
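As a rough illustration of the last note: if only the ad ids were needed and their order did not matter, the outer join to Ads and the ORDER BY could be dropped, leaving just the UNION. This is a sketch that reuses the tables and values from the question and assumes unsorted ids are acceptable; it has not been timed against the real data.

```sql
-- Sketch only: returns ids without joining Ads and without ORDER BY.
-- Assumption: the caller does not need ModifiedDate ordering.
-- Caveat: this no longer verifies that each id exists in Ads.
( SELECT AdId
    FROM AdsGeometry
    WHERE ST_CONTAINS(GeomFromText('Polygon((
              -4.9783515930176 36.627100703563,
              -5.0075340270996 36.61222072018,
              -4.9896812438965 36.57638676015,
              -4.965991973877 36.579419508882,
              -4.955005645752 36.617732160006,
              -4.9783515930176 36.627100703563 ))'),
            GeomPoint)
      AND AdId > 1000000 )
UNION DISTINCT
( SELECT ads_AdId AS AdId
    FROM AdsHierarchy
    WHERE locations_LocationId = 148022797
      AND ads_AdId > 1000000 );
```

Prefixing the statement with EXPLAIN shows whether a temporary table is still used for the UNION on a given MySQL version.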