We have a MyISAM table with about 75 million rows and 5 columns:
id (int),
user_id(int),
page_id (int),
type (enum with 6 strings)
date_created(datetime).
We have a primary key on the id column, a unique index on (user_id, page_id, date_created), and a composite index on (page_id, date_created).
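For reference, the schema described above might look roughly like the sketch below. This is a reconstruction, not the actual DDL: the enum values and the unique index name `user_page_date` are placeholders, and the NOT NULL and AUTO_INCREMENT attributes are assumptions. The composite index is named `page_id` because that is the key name the EXPLAIN output reports.

```sql
-- Hypothetical reconstruction of the table from the description.
CREATE TABLE `table` (
  id           INT NOT NULL AUTO_INCREMENT,            -- assumed auto-generated
  user_id      INT NOT NULL,
  page_id      INT NOT NULL,
  type         ENUM('a','b','c','d','e','f') NOT NULL, -- placeholder values; the real 6 strings are not given
  date_created DATETIME NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY user_page_date (user_id, page_id, date_created), -- index name is an assumption
  KEY page_id (page_id, date_created)                  -- name taken from the EXPLAIN output
) ENGINE=MyISAM;
```

The NOT NULL assumption is at least consistent with the reported key_len of 12: 4 bytes for the INT plus 8 bytes for a DATETIME on MySQL versions of that era, with no extra byte for nullability.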
The problem is that the following query takes up to 90 seconds to complete:
SELECT SQL_NO_CACHE user_id, count(id) nr
FROM `table`
WHERE `page_id`=301
and `date_created` BETWEEN '2012-01-03' AND '2012-02-03 23:59:59'
AND page_id<>user_id
group by `user_id`
Here is the EXPLAIN output for this query:
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | table | range | page_id | page_id | 12 | NULL | 520024 | Using where; Using temporary; Using filesort |
+----+-------------+----------------------------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
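Edit: At ypercube's suggestion, I tried adding a new index on (page_id, user_id, date_created). MySQL does not use it by default, so I had to give the query optimizer a hint. Here are the new query and its EXPLAIN output: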
SELECT SQL_NO_CACHE user_id, count(*) nr
FROM `table` USE INDEX (usridexp)
WHERE `page_id`=301
and `date_created` BETWEEN '2012-01-03' AND '2012-02-03 23:59:59'
AND page_id<>user_id
group by `user_id`
ORDER BY NULL
+----+-------------+----------------------------+------+---------------+----------+---------+-------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------------+------+---------------+----------+---------+-------+---------+--------------------------+
| 1 | SIMPLE | table | ref | usridexp | usridexp | 4 | const | 3943444 | Using where; Using index |
+----+-------------+----------------------------+------+---------------+----------+---------+-------+---------+--------------------------+
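Some changes that might improve the query:
Change COUNT(id) to COUNT(*). Since id is (I guess) the PRIMARY KEY and therefore NOT NULL, the result will be the same.
Add ORDER BY NULL after the GROUP BY clause. In MySQL, a GROUP BY operation also sorts the results unless you specify otherwise (this implicit sort was removed in MySQL 8.0).
The (page_id, date_created) index is probably the best one MySQL can use for this query, but you could also try (page_id, user_id, date_created). (If you add this index, could you post the EXPLAIN as well?)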
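A minimal sketch of trying the suggested index, assuming the table is literally named `table` as in the question. The index name `usridexp` is the one the asker used in the edit; any other name would work. Building an index on a MyISAM table of this size can take a long time and blocks writes while it runs.

```sql
-- Add the three-column index suggested above.
ALTER TABLE `table`
  ADD INDEX usridexp (page_id, user_id, date_created);

-- Check what the optimizer picks on its own, with the other two suggestions applied.
EXPLAIN
SELECT SQL_NO_CACHE user_id, COUNT(*) AS nr
FROM `table`
WHERE page_id = 301
  AND date_created BETWEEN '2012-01-03' AND '2012-02-03 23:59:59'
  AND page_id <> user_id
GROUP BY user_id
ORDER BY NULL;
```

Because this index contains every column the query touches, it can act as a covering index ("Using index" in the Extra column), and rows for a fixed page_id come out already ordered by user_id, which lets the GROUP BY avoid a temporary table. The trade-off is that the date range can no longer narrow the index scan, so all entries for page_id = 301 are read.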
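One more thing, unrelated to the performance of this query:
If (user_id, page_id, date_created) is UNIQUE and id is auto-generated (and not used for anything other than being the primary key), you can make that combination the PRIMARY KEY and drop the id column. That is one index fewer and 4 bytes less per row.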
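That change could be sketched as follows. It assumes the unique index is named `user_page_date` (a hypothetical name; check SHOW CREATE TABLE for the real one), that all three columns are NOT NULL, and that nothing else references id. Try it on a copy of the table first, since the ALTER rebuilds the whole table.

```sql
-- Look up the actual index names before running the ALTER.
SHOW CREATE TABLE `table`;

-- Replace the surrogate key with the natural composite key in a single rebuild.
ALTER TABLE `table`
  DROP COLUMN id,                 -- dropping the only column of the old primary key drops that key too
  DROP INDEX user_page_date,      -- hypothetical name of the existing unique index
  ADD PRIMARY KEY (user_id, page_id, date_created);
```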