MySQL query with an IN() clause on an indexed column is slow

zmb*_*ush 19 php mysql performance

I have a MySQL query that is generated by a PHP script. The query looks like this:

SELECT * FROM Recipe_Data WHERE 404_Without_200 = 0 AND Failures_Without_Success = 0 AND RHD_No IN (10, 24, 34, 41, 43, 51, 57, 59, 61, 67, 84, 90, 272, 324, 402, 405, 414, 498, 500, 501, 510, 559, 562, 595, 632, 634, 640, 643, 647, 651, 703, 714, 719, 762, 765, 776, 796, 812, 814, 815, 822, 848, 853, 855, 858, 866, 891, 920, 947, 956, 962, 968, 1049, 1054, 1064, 1065, 1070, 1100, 1113, 1119, 1130, 1262, 1287, 1292, 1313, 1320, 1327, 1332, 1333, 1335, 1340, 1343, 1344, 1346, 1349, 1352, 1358, 1362, 1365, 1482, 1495, 1532, 1533, 1537, 1549, 1550, 1569, 1571, 1573, 1574, 1596, 1628, 1691, 1714, 1720, 1735, 1755, 1759, 1829, 1837, 1844, 1881, 1919, 2005, 2022, 2034, 2035, 2039, 2054, 2076, 2079, 2087, 2088, 2089, 2090, 2091, 2092, 2154, 2155, 2156, 2157, 2160, 2162, 2164, 2166, 2169, 2171, 2174, 2176, 2178, 2179, 2183, 2185, 2186, 2187, 2201, 2234, 2236, 2244, 2245, 2250, 2255, 2260, 2272, 2280, 2281, 2282, 2291, 2329, 2357, 2375, 2444, 2451, 2452, 2453, 2454, 2456, 2457, 2460, 2462, 2464, 2465, 2467, 2468, 2469, 2470, 2473, 2474, 2481, 2485, 2487, 2510, 2516, 2519, 2525, 2540, 2545, 2547, 2553, 2571, 2579, 2580, 2587, 2589, 2597, 2602, 2611, 2629, 2660, 2662, 2700, 2756, 2825, 2833, 2835, 2858, 2958, 2963, 2964, 3009, 3090, 3117, 3118, 3120, 3121, 3122, 3123, 3126, 3127, 3129, 3130, 3133, 3135, 3137, 3138, 3139, 3141, 3142, 3145, 3146, 3147, 3151, 3152, 3155, 3193, 3201, 3204, 3219, 3221, 3222, 3223, 3224, 3225, 3226, 3227, 3228, 3229, 3231, 3232, 3233, 3234, 3235, 3237, 3239, 3246, 3250, 3253, 3259, 3261, 3291, 3315, 3328, 3377, 3381, 3383, 3384, 3385, 3387, 3388, 3389, 3390, 3396, 3436, 3463, 3465, 3467, 3470, 3471, 3484, 3507, 3515, 3554, 3572, 3641, 3672, 3683, 3689, 3690, 3692, 3693, 3694, 3697, 3698, 3705, 3711, 3713, 3715, 3716, 3717, 3719, 3720, 3722, 3726, 3727, 3732, 3737, 3763, 3767, 3770, 3771, 3772, 3773, 3803, 3810, 3812, 3816, 3846, 3847, 3848, 3851, 3874, 3882, 3902, 3903, 3906, 3908, 3916, 3924, 3967, 3987, 4006, 4030, 4043, 4045, 4047, 4058, 4067, 
4107, 4108, 4114, 4115, 4131, 4132, 4133, 4137, 4138, 4139, 4140, 4141, 4142, 4146, 4150, 4151, 4152, 4153, 4157, 4158, 4160, 4163, 4166, 4167, 4171, 4179, 4183, 4221, 4225, 4242, 4257, 4435, 4437, 4438, 4443, 4446, 4449, 4450, 4451, 4452, 4454, 4460, 4550, 4557, 4618, 4731, 4775, 4804, 4972, 5025, 5026, 5039, 5042, 5294, 5578, 5580, 5599, 5602, 5649, 5726, 5779, 5783, 5931, 5934, 5936, 5939, 5940, 5941, 5978, 6044, 6056, 6113, 6116, 6118, 6122, 6123, 6125, 6127, 6128, 6129, 6130, 6131, 6135, 6141, 6145, 6147, 6150, 6152, 6153, 6154, 6160, 6166, 6169);

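The RHD_No column is the primary key of this table, which has about 400,000 rows in total. The problem is that the query is very slow: usually about 2 seconds, but I have seen it take as long as 10 seconds.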

When I run EXPLAIN on the query, everything looks like it should be fine:

+----+-------------+-------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table       | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | Recipe_Data | range | PRIMARY       | PRIMARY | 4       | NULL |  420 | Using where |
+----+-------------+-------------+-------+---------------+---------+---------+------+------+-------------+

When I profile the query, I get:

mysql> show profile;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000015 |
| checking query cache for query | 0.000266 |
| Opening tables                 | 0.000009 |
| System lock                    | 0.000004 |
| Table lock                     | 0.000006 |
| init                           | 0.000115 |
| optimizing                     | 0.000038 |
| statistics                     | 0.000797 |
| preparing                      | 0.000047 |
| executing                      | 0.000002 |
| Sending data                   | 2.675270 |
| end                            | 0.000007 |
| query end                      | 0.000003 |
| freeing items                  | 0.000071 |
| logging slow query             | 0.000002 |
| logging slow query             | 0.000058 |
| cleaning up                    | 0.000005 |
+--------------------------------+----------+

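I have been researching this problem for a long time, but I can't find a solution. Is there anything obviously wrong with this query? I don't see why looking at 420 rows should take 2+ seconds.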

Pet*_* G. 24

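You are accessing 420 rows by primary key, which will probably lead to an index access path. That could touch 2 index pages and one data page per key. If those pages are in cache, the query should run fast. If not, every page access that goes to disk incurs the usual disk latency. If we assume 5 ms of disk latency and an 80% cache hit rate, we get 420 * 3 * 0.2 * 5 ms ≈ 1.26 seconds, which is on the order of what you are seeing.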
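The back-of-the-envelope estimate above can be written as a small function, which makes it easy to see how the result moves with the cache hit rate. This is only a sketch of the arithmetic: the function name and parameters are made up for illustration, and the 3 pages per row, 80% hit rate and 5 ms latency are the assumptions stated in the answer, not measurements.

```python
def estimated_io_seconds(rows, pages_per_row, cache_hit_rate, disk_latency_ms):
    """Expected time spent waiting on disk for a batch of primary-key lookups."""
    page_accesses = rows * pages_per_row                    # 2 index pages + 1 data page per key
    cache_misses = page_accesses * (1.0 - cache_hit_rate)   # only misses go to disk
    return cache_misses * disk_latency_ms / 1000.0          # milliseconds -> seconds


# Assumed figures from the answer: 420 rows, 3 pages each, 80% hits, 5 ms per miss.
estimate = estimated_io_seconds(rows=420, pages_per_row=3,
                                cache_hit_rate=0.8, disk_latency_ms=5.0)
print(f"{estimate:.2f} s")  # prints "1.26 s"
```

Under the same model, a fully warm cache (hit rate 1.0) gives an estimate of zero disk wait, which is consistent with the query being fast when the pages are already in memory.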

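  • A database performance explanation based on metrics and concrete numbers rather than generalizations? Be still my heart. (15 upvotes)
  • So how would I improve my query? (2 upvotes)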

DVK*_*DVK 11

The problem is that IN is basically treated as a series of ORs. For example,

col IN (1,2,3)

is treated as

col = 1 OR col = 2 OR col = 3

which is a lot slower than a join.

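What you should do is generate SQL that creates a temporary table, fills it with the values from the IN clause, and then joins against that temporary table: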

CREATE TEMPORARY TABLE numbers (n INT)

Then, in a loop, add:

INSERT numbers  VALUES ($next_number)

And then, at the end:

SELECT * FROM numbers, Recipe_Data 
WHERE numbers.n = RHD_No
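The three steps above (create the temporary table, fill it, join) can be sketched end to end as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in so that it runs without a server; it is not MySQL, and it says nothing about MySQL timings. The `fetch_by_ids` helper and the `title` column are hypothetical. It also inserts all values in one batched call rather than one statement per loop iteration.

```python
import sqlite3


def fetch_by_ids(conn, ids):
    """Return Recipe_Data rows whose RHD_No is in ids, using a temp-table join."""
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE numbers (n INTEGER PRIMARY KEY)")
    # One batched insert instead of a separate INSERT per value;
    # set() removes duplicates so the primary key is not violated.
    cur.executemany("INSERT INTO numbers (n) VALUES (?)", [(i,) for i in set(ids)])
    rows = cur.execute(
        "SELECT r.* FROM numbers AS t "
        "JOIN Recipe_Data AS r ON r.RHD_No = t.n "
        "ORDER BY r.RHD_No"
    ).fetchall()
    cur.execute("DROP TABLE numbers")  # clean up so the helper can be called again
    return rows


# Hypothetical miniature of the table: 100 rows with a made-up title column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Recipe_Data (RHD_No INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO Recipe_Data VALUES (?, ?)",
                 [(i, f"recipe {i}") for i in range(1, 101)])

result = fetch_by_ids(conn, [10, 24, 34, 500])  # 500 does not exist in the table
```

Any extra filters from the original query, such as `Failures_Without_Success = 0`, would go in a WHERE clause after the join.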


Jon*_*han 9

You should convert the IN clause into an INNER JOIN.

You can convert a query like this:

SELECT  foo   
FROM    bar   
WHERE bar.stuff IN  
       (SELECT  stuff FROM asdf)

into a query like this:

SELECT  b.foo 
FROM    ( 
        SELECT  DISTINCT stuff 
        FROM    asdf ) a 
JOIN    bar b 
ON      b.stuff = a.stuff

You will gain a lot of performance.

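Since PHP generates the query, try some kind of trick, such as a temporary table holding the items from the IN clause. Avoid IN clauses whenever you can, because they are very time-consuming.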
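To check that the rewrite returns the same rows, and to see why the DISTINCT in the derived table matters, here is a small sketch. It reuses the placeholder names from the answer (`bar`, `asdf`, `foo`, `stuff`) with invented sample data, and it runs on Python's sqlite3 module as a stand-in, so it demonstrates equivalence of results only, not MySQL performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar (foo TEXT, stuff INTEGER);
    CREATE TABLE asdf (stuff INTEGER);
    INSERT INTO bar VALUES ('a', 1), ('b', 2), ('c', 3);
    INSERT INTO asdf VALUES (1), (1), (3);  -- note the duplicate 1
""")

# Original form: IN with a subquery.
in_rows = conn.execute(
    "SELECT foo FROM bar WHERE bar.stuff IN (SELECT stuff FROM asdf) ORDER BY foo"
).fetchall()

# Rewritten form: join against a de-duplicated derived table.
join_rows = conn.execute(
    "SELECT b.foo FROM (SELECT DISTINCT stuff FROM asdf) AS a "
    "JOIN bar AS b ON b.stuff = a.stuff ORDER BY b.foo"
).fetchall()

# Without DISTINCT, duplicates in asdf multiply the matching rows of bar.
no_distinct_rows = conn.execute(
    "SELECT b.foo FROM asdf AS a JOIN bar AS b ON b.stuff = a.stuff ORDER BY b.foo"
).fetchall()
```

With this data, the first two queries both return the rows for `a` and `c`, while the join without DISTINCT returns `a` twice.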

  • Can you provide a link that explains why IN clauses are slower than joins, and/or measures by how much? (2 upvotes)