Optimizing a slow query to make better use of indexes

Question

ima*_*ive 2 mysql performance query-performance

I have a query that looks like this:

SELECT time_stop, some_count
FROM foo
WHERE user_id = 1
  AND time_start >= '2016-07-27 00:00:00'
  AND time_stop <= '2016-07-27 23:59:59'
  AND some_count = (
      SELECT MAX(some_count)
      FROM foo
      WHERE user_id = 1
        AND time_start >= '2016-07-27 00:00:00'
        AND time_stop <= '2016-07-27 23:59:59'
  );

The table schema looks like this:

Create Table: CREATE TABLE `foo` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) NOT NULL,
  `time_start` datetime(6) DEFAULT NULL,
  `time_stop` datetime(6) DEFAULT NULL,
  `some_count` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `time_start` (`time_start`)
) ENGINE=InnoDB AUTO_INCREMENT=418005923 DEFAULT CHARSET=latin1

The EXPLAIN output looks like this:

+----+-------------+----------------+------------+------+--------------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table          | partitions | type | possible_keys      | key     | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+----------------+------------+------+--------------------+---------+---------+-------+--------+----------+-------------+
|  1 | PRIMARY     | foo            | NULL       | ref  | user_id,time_start | user_id | 4       | const | 931364 |     1.67 | Using where |
|  2 | SUBQUERY    | foo            | NULL       | ref  | user_id,time_start | user_id | 4       | const | 931364 |    16.66 | Using where |
+----+-------------+----------------+------------+------+--------------------+---------+---------+-------+--------+----------+-------------+

I am using MySQL 5.7.11.

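I think the main problem I am running into is that the index keys here have low cardinality. The query tries to get the maximum value of some_count within the given date range. It also needs the exact datetime at which that maximum occurs. This can return multiple rows, which brings it back to the low-cardinality problem.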

I am not sure whether the query can be rewritten to work better with the existing indexes, but I am guessing it can.

Answer 1

Mic*_*bot 5

Your query appears to be as simple as this:

SELECT time_stop, some_count
FROM foo
WHERE user_id = 1
  AND time_start >= '2016-07-27 00:00:00'
  AND time_stop <= '2016-07-27 23:59:59'
ORDER BY some_count DESC LIMIT 1;

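You need an index on (user_id, time_start), in that order. Also, assuming start is always earlier than stop, adding the seemingly unnecessary AND time_start <= '2016-07-27 23:59:59' will allow the database to avoid even considering rows with a later time_start value.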
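
As a rough sketch of that suggestion (the index name idx_user_time_start is made up for illustration, and the extra predicate is only valid if time_start never exceeds time_stop), the index and the adjusted query might look like this:

```sql
-- Hypothetical index name; what matters is the column order:
-- the equality column (user_id) first, then the range column (time_start).
ALTER TABLE foo ADD INDEX idx_user_time_start (user_id, time_start);

SELECT time_stop, some_count
FROM foo
WHERE user_id = 1
  AND time_start >= '2016-07-27 00:00:00'
  AND time_start <= '2016-07-27 23:59:59'  -- extra upper bound; assumes time_start <= time_stop
  AND time_stop <= '2016-07-27 23:59:59'
ORDER BY some_count DESC
LIMIT 1;
```

Once such a composite index exists, the single-column user_id index is a prefix of it and is probably redundant. Note also that LIMIT 1 returns a single row even if several rows tie for the maximum some_count, whereas the subquery form can return all of them.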

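Adding time_stop to the same index won't help, for the same reason a phone book can't help you find everyone whose first name begins with the letter L. Adding some_count to the index won't help either, for the same reason. In a multi-column index, the values in column B are sorted within each group of identical values in column A (hence the phone directory reference). The values in column C are sorted within each group of identical values in B, and so on. So if A is an equality comparison and B is a range, as in your query, those rows are easy to pinpoint, but the values of C appear in effectively random order. For C to be usable in the index, B would also need to be an equality comparison, which is not the case here.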
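
To make the ordering argument concrete, here is a small illustration on a throwaway table. The table foo_demo, the index idx_demo, and the three rows are all invented for this sketch and are not taken from the question:

```sql
-- Hypothetical demo table with A = user_id, B = time_start, C = some_count.
CREATE TEMPORARY TABLE foo_demo (
  user_id int NOT NULL,
  time_start datetime(6) DEFAULT NULL,
  some_count int NOT NULL DEFAULT 0,
  KEY idx_demo (user_id, time_start, some_count)
);

-- Made-up sample rows for one user on one day.
INSERT INTO foo_demo (user_id, time_start, some_count) VALUES
  (1, '2016-07-27 01:00:00', 5),
  (1, '2016-07-27 02:00:00', 9),
  (1, '2016-07-27 03:00:00', 2);

-- Listing the rows in the same order as the index entries:
-- some_count comes back as 5, 9, 2, so it is not sorted across the
-- time_start range and the index cannot jump straight to the maximum.
SELECT user_id, time_start, some_count
FROM foo_demo
ORDER BY user_id, time_start, some_count;
```

Within a single time_start value some_count would be sorted, but a range over time_start spans many such groups, which is why the third column cannot be used to locate the maximum.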

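However, if these are the only columns you are reading, then adding them to the index (to the right of user_id and time_start) may be useful for a different reason: when a query uses an index that contains all of the selected columns, it is a "covering index", and the whole result can be read directly from the index rather than from the main table. That may be beneficial, but you will have to weigh it against the cost of maintaining a larger index.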
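
A covering variant could be sketched as follows. The index name is again hypothetical, and whether the optimizer actually picks this index depends on your data and statistics:

```sql
-- Hypothetical index name. time_stop and some_count are appended only so the
-- index contains every column the query reads; they do not narrow the range scan.
ALTER TABLE foo ADD INDEX idx_user_start_stop_count
  (user_id, time_start, time_stop, some_count);

-- If the index is chosen and covers the query, the Extra column of EXPLAIN
-- should include "Using index".
EXPLAIN
SELECT time_stop, some_count
FROM foo
WHERE user_id = 1
  AND time_start >= '2016-07-27 00:00:00'
  AND time_start <= '2016-07-27 23:59:59'  -- assumes time_start <= time_stop
  AND time_stop <= '2016-07-27 23:59:59'
ORDER BY some_count DESC
LIMIT 1;
```

On a table of this size, building and maintaining a four-column index is not free, so it is worth comparing the EXPLAIN output and actual timings before and after.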
