Selecting by datetime range is very slow in MySQL

yiv*_*ivo 1 mysql sql database performance myisam

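My `posts` table has more than 6.5 million records. Each post is identified by a fixed-length `name`. I'm using MySQL Community 5.7 on an SSD with roughly 10K-20K IOPS and 1 GB of RAM, with key-buffer-size set to 512M (by the way, I'm otherwise mostly running the default MySQL configuration). My resources are limited, so I chose MyISAM as the storage engine. My benchmarks showed that MyISAM is faster in my case. I'm also not too concerned about the data, because it can be refreshed.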

So, here is my schema info:

+------------+--------+------------+
| TABLE_NAME | ENGINE | row_format |
+------------+--------+------------+
| posts      | MyISAM | Fixed      |
+------------+--------+------------+

+---------------------+---------------------+------+-----+---------+----------------+
| Field               | Type                | Null | Key | Default | Extra          |
+---------------------+---------------------+------+-----+---------+----------------+
| id                  | int(11) unsigned    | NO   | PRI | NULL    | auto_increment |
| name                | char(30)            | NO   | UNI | NULL    |                |
| worker_id           | tinyint(4) unsigned | NO   | MUL | NULL    |                |
| processing_priority | tinyint(4) unsigned | NO   | MUL | 0       |                |
| last_processed_at   | datetime            | YES  | MUL | NULL    |                |
| scraped_at          | datetime            | NO   | MUL | NULL    |                |
+---------------------+---------------------+------+-----+---------+----------------+

+-------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name            | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| posts |          0 | PRIMARY             |            1 | id                  | A         |     6579588 |     NULL | NULL   |      | BTREE      |         |               |
| posts |          0 | name                |            1 | name                | A         |     6579588 |     NULL | NULL   |      | BTREE      |         |               |
| posts |          1 | last_processed_at   |            1 | last_processed_at   | A         |     6579588 |     NULL | NULL   | YES  | BTREE      |         |               |
| posts |          1 | processing_priority |            1 | processing_priority | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
| posts |          1 | worker_id           |            1 | worker_id           | A         |          50 |     NULL | NULL   |      | BTREE      |         |               |
| posts |          1 | scraped_at          |            1 | scraped_at          | A         |      234985 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

The query I run:

SELECT COUNT(*) FROM `posts` WHERE `posts`.`worker_id` = 1 AND (last_processed_at >= '2017-11-04 22:20:27.203761')

MySQL takes 3676.4 ms to execute this query.

EXPLAIN output for the query:

+----+-------------+-------+------------+------+-----------------------------+-----------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys               | key       | key_len | ref   | rows   | filtered | Extra       |
+----+-------------+-------+------------+------+-----------------------------+-----------+---------+-------+--------+----------+-------------+
|  1 | SIMPLE      | posts | NULL       | ref  | last_processed_at,worker_id | worker_id | 1       | const | 232621 |    37.45 | Using where |
+----+-------------+-------+------------+------+-----------------------------+-----------+---------+-------+--------+----------+-------------+

Do you have any ideas on how to optimize this?

Tur*_*uro 5

You can create a composite key on `worker_id` and `last_processed_at`, replacing the existing `worker_id` key.
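A minimal sketch of that change, assuming the index names shown in the question's index listing. The name `worker_id_last_processed_at` is a hypothetical choice for illustration, not something from the original post:

```sql
-- Replace the single-column worker_id index with a composite index.
-- worker_id_last_processed_at is an illustrative, made-up index name.
ALTER TABLE posts
  DROP INDEX worker_id,
  ADD INDEX worker_id_last_processed_at (worker_id, last_processed_at);
```

Because `worker_id` is the leftmost column, queries that filter only on `worker_id` can still use the composite index, which is why the old single-column index can be dropped.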

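  • Some details: at the moment, MySQL uses only one of the `last_processed_at` and `worker_id` indexes on the `posts` table. It uses the index to fetch all rows matching `worker_id`, then walks through those rows and compares `last_processed_at` one by one. That takes time. If you create a composite index on `worker_id` + `last_processed_at`, MySQL can use the first part of the index for the `worker_id` equality and the second part for the `last_processed_at` range, so both conditions are filtered inside the index and the query should be much faster. See the documentation: [MySQL 5.7 Multiple-Column Indexes](https://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html)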
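To check whether the optimizer actually picks up a composite index, the query from the question can be re-run under `EXPLAIN`. This is only a sketch: the resulting plan depends on the data and the server, so no expected output is shown here.

```sql
-- Re-run after creating the composite index on (worker_id, last_processed_at)
-- to see which index the optimizer chooses.
EXPLAIN
SELECT COUNT(*)
FROM `posts`
WHERE `posts`.`worker_id` = 1
  AND last_processed_at >= '2017-11-04 22:20:27.203761';
```

If the composite index is used for both conditions, the `key` column should name it and `type` would typically change from `ref` to `range`.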