Kzq*_*qai 14 mysql sql distinct query-optimization
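I have this query, written by someone else, that I'm trying to refactor. It pulls some features/materials for items (generally shoes).
There are a lot of products, and therefore a lot of join-table entries, but only a few features available. I figure there has to be a way to cut down on touching the "large" items list to get these features. I've heard that DISTINCT is to be avoided, but I don't have a statement that can replace the DISTINCT here.
According to my log, I'm getting slow result times: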
Query_time:7 Lock_time:0 Rows_sent:32 Rows_examined:5362862
Query_time:8 Lock_time:0 Rows_sent:22 Rows_examined:6581994
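As the log entries show, it sometimes takes 7 or 8 seconds, and it examines more than 5 million rows each time.
This may be due to other load happening at the same time, because here is the same select run against the database directly from the mysql command line: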
mysql> SELECT DISTINCT features.FeatureId, features.Name
FROM features, itemsfeatures, items
WHERE items.FlagStatus != 'U'
AND items.TypeId = '13'
AND features.Type = 'Material'
AND features.FeatureId = itemsfeatures.FeatureId
ORDER BY features.Name;
+-----------+--------------------+
| FeatureId | Name               |
+-----------+--------------------+
|        40 | Alligator          |
|        41 | Burnished Calfskin |
|        42 | Calfskin           |
|        59 | Canvas             |
|        43 | Chromexcel         |
|        44 | Cordovan           |
|        57 | Cotton             |
|        45 | Crocodile          |
|        58 | Deerskin           |
|        61 | Eel                |
|        46 | Italian Leather    |
|        47 | Lizard             |
|        48 | Nappa              |
|        49 | NuBuck             |
|        50 | Ostrich            |
|        51 | Patent Leather     |
|        60 | Rubber             |
|        52 | Sharkskin          |
|        53 | Silk               |
|        54 | Suede              |
|        56 | Veal               |
|        55 | Woven              |
+-----------+--------------------+
22 rows in set (0.00 sec)
mysql> select count(*) from features;
+----------+
| count(*) |
+----------+
|      122 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from itemsfeatures;
+----------+
| count(*) |
+----------+
|    38569 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from items;
+----------+
| count(*) |
+----------+
|     8656 |
+----------+
1 row in set (0.00 sec)
explain SELECT DISTINCT features.FeatureId, features.Name FROM features, itemsfeatures, items WHERE items.FlagStatus != 'U' AND items.TypeId = '13' AND features.Type = 'Material' AND features.FeatureId = itemsfeatures.FeatureId ORDER BY features.Name;
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
| id | select_type | table         | type | possible_keys     | key       | key_len | ref                             | rows | Extra                                        |
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | features      | ref  | PRIMARY,Type      | Type      | 33      | const                           |   21 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | itemsfeatures | ref  | FeatureId         | FeatureId | 4       | sherman_live.features.FeatureId |  324 | Using index; Distinct                        |
|  1 | SIMPLE      | items         | ALL  | TypeId,FlagStatus | NULL      | NULL    | NULL                            | 8656 | Using where; Distinct; Using join buffer     |
+----+-------------+---------------+------+-------------------+-----------+---------+---------------------------------+------+----------------------------------------------+
3 rows in set (0.04 sec)
Edit:
Here are sample results without DISTINCT (but with a LIMIT, because otherwise it just hangs), for comparison:
SELECT features.FeatureId, features.Name FROM features, itemsfeatures, items WHERE items.FlagStatus != 'U' AND items.TypeId = '13' AND features.Type = 'Material' AND features.FeatureId = itemsfeatures.FeatureId ORDER BY features.Name limit 10;
+-----------+-----------+
| FeatureId | Name      |
+-----------+-----------+
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
|        40 | Alligator |
+-----------+-----------+
10 rows in set (23.30 sec)
And here it is using GROUP BY instead of SELECT DISTINCT:
SELECT features.FeatureId, features.Name FROM features, itemsfeatures, items WHERE items.FlagStatus != 'U' AND items.TypeId = '13' AND features.Type = 'Material' AND features.FeatureId = itemsfeatures.FeatureId group by features.name ORDER BY features.Name;
+-----------+--------------------+
| FeatureId | Name               |
+-----------+--------------------+
|        40 | Alligator          |
|        41 | Burnished Calfskin |
|        42 | Calfskin           |
|        59 | Canvas             |
|        43 | Chromexcel         |
|        44 | Cordovan           |
|        57 | Cotton             |
|        45 | Crocodile          |
|        58 | Deerskin           |
|        61 | Eel                |
|        46 | Italian Leather    |
|        47 | Lizard             |
|        48 | Nappa              |
|        49 | NuBuck             |
|        50 | Ostrich            |
|        51 | Patent Leather     |
|        60 | Rubber             |
|        52 | Sharkskin          |
|        53 | Silk               |
|        54 | Suede              |
|        56 | Veal               |
|        55 | Woven              |
+-----------+--------------------+
22 rows in set (13.28 sec)
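...because I'm trying to understand this as a general problem: beyond the slowness this particular query is prone to, how should bad SELECT DISTINCT queries be replaced in general?
I'm wondering whether the usual replacement for SELECT DISTINCT is a GROUP BY (although in this case that isn't a complete solution, since it's still slow)?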
It looks like you're missing the JOIN condition that links itemsfeatures to items. This is more obvious if you write the query using explicit JOIN operations.
SELECT DISTINCT f.FeatureId, f.Name
FROM features f
INNER JOIN itemsfeatures ifx
ON f.FeatureID = ifx.FeatureID
INNER JOIN items i
ON ifx.ItemID = i.ItemID /* This is the part you're missing */
WHERE i.FlagStatus != 'U'
AND i.TypeId = '13'
AND f.Type = 'Material'
ORDER BY f.Name;
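To see how large the unintended multiplier is, the two halves of the original query can be counted separately. This is only a diagnostic sketch using the table and column names from the question: without a predicate tying items to itemsfeatures, every feature/itemsfeatures match is combined with every item that passes the filter.

```sql
-- Number of items that pass the filter. With no join condition on items,
-- each (features, itemsfeatures) match is repeated once per row counted here.
SELECT COUNT(*) AS matching_items
FROM items
WHERE FlagStatus != 'U'
  AND TypeId = '13';

-- Number of (features, itemsfeatures) matches for materials.
SELECT COUNT(*) AS material_links
FROM features f
INNER JOIN itemsfeatures ifx
    ON f.FeatureId = ifx.FeatureId
WHERE f.Type = 'Material';
```

The product of the two counts is the number of rows the original query generates before DISTINCT collapses them, which is why the version without DISTINCT appears to hang.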
小智 6
As Joe said, there does appear to be a missing join condition.
Here is your current query:
SELECT DISTINCT
features.FeatureId,
features.Name
FROM features,
itemsfeatures,
items
WHERE items.FlagStatus != 'U'
AND items.TypeId = '13'
AND features.Type = 'Material'
AND features.FeatureId = itemsfeatures.FeatureId
ORDER BY features.Name
Here is the same query written with explicit joins:
SELECT DISTINCT
features.FeatureId,
features.Name
FROM features INNER JOIN
itemsfeatures on features.FeatureId = itemsfeatures.FeatureId CROSS JOIN
items
WHERE items.FlagStatus != 'U'
AND items.TypeId = '13'
AND features.Type = 'Material'
ORDER BY features.Name
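I'm not 100% sure, but it looks like removing every reference to the items table should give you exactly the same results (as long as at least one item has TypeId 13 and a FlagStatus other than 'U'):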
SELECT DISTINCT
features.FeatureId,
features.Name
FROM features,
itemsfeatures
WHERE features.Type = 'Material'
AND features.FeatureId = itemsfeatures.FeatureId
ORDER BY features.Name
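From the way the query is written, it seems you want the list of materials for items with a TypeId of 13 and a FlagStatus <> 'U'. If that is the case, the results returned by the original query are wrong: it simply returns all materials for all items.
So, as Joe said, add an inner join to items, and use explicit joins, since they make the meaning clearer. I prefer GROUP BY, but DISTINCT will do the same thing.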
SELECT features.FeatureId,
features.Name
FROM features INNER JOIN
itemsfeatures on features.FeatureId = itemsfeatures.FeatureId INNER JOIN
items on itemsfeatures.ItemID = items.ItemID
WHERE items.FlagStatus != 'U'
AND items.TypeId = '13'
AND features.Type = 'Material'
GROUP BY features.FeatureId,
features.Name
ORDER BY features.Name
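On the more general question of what to use in place of DISTINCT: when the joined tables act only as a filter and none of their columns are selected, a semi-join written with EXISTS is a common alternative, because each features row is returned at most once and there is nothing left to de-duplicate. The following is a sketch only. It assumes, as the joins above do, that itemsfeatures and items both have an ItemID column, and that FeatureId is unique in features (the EXPLAIN output suggests it is the primary key).

```sql
SELECT f.FeatureId, f.Name
FROM features f
WHERE f.Type = 'Material'
  -- Keep the feature only if at least one qualifying item uses it.
  AND EXISTS (
        SELECT 1
        FROM itemsfeatures ifx
        INNER JOIN items i
            ON ifx.ItemID = i.ItemID   -- assumed column name, as above
        WHERE ifx.FeatureId = f.FeatureId
          AND i.TypeId = '13'
          AND i.FlagStatus != 'U'
      )
ORDER BY f.Name;
```

Whether this is faster than the GROUP BY version depends on the MySQL version and the indexes available, so it is worth comparing the two with EXPLAIN.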
With correctness sorted out, now for speed. Create the following three indexes.
FeaturesIndex(Type,FeatureID,Name)
ItemsFeaturesIndex(FeatureId)
ItemsIndex(TypeId,FlagStatus,ItemID)
This should speed up both your current query and the one I've listed.
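In MySQL, the three indexes listed above could be created roughly as follows. This is a sketch: the index names are the ones given above, ItemID is the column name assumed in the joins, and the column types are not shown in the question, so a long Name column might need a prefix length.

```sql
-- Covers the filter on Type and the two selected columns of features.
CREATE INDEX FeaturesIndex ON features (Type, FeatureId, Name);

-- Supports the join from features to itemsfeatures.
-- The EXPLAIN output in the question already shows an index named FeatureId
-- on this table, so this one may be redundant.
CREATE INDEX ItemsFeaturesIndex ON itemsfeatures (FeatureId);

-- Covers the filter on TypeId and FlagStatus plus the join column.
CREATE INDEX ItemsIndex ON items (TypeId, FlagStatus, ItemID);
```

After creating them, re-running EXPLAIN on the query shows whether the optimizer actually picks them up.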