ela*_*xsj 5 mysql indexing query-optimization
I have a table of log entries, and a description table for the roughly 100 possible log codes:
CREATE TABLE `log_entries` (
`logentry_id` int(11) NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`partner_id` smallint(4) NOT NULL,
`log_code` smallint(4) NOT NULL,
PRIMARY KEY (`logentry_id`),
KEY `IX_code` (`log_code`),
KEY `IX_partner_code` (`partner_id`,`log_code`)
) ENGINE=MyISAM ;
CREATE TABLE IF NOT EXISTS `log_codes` (
`log_code` smallint(4) NOT NULL DEFAULT '0',
`log_desc` varchar(255) DEFAULT NULL,
`category_overview` tinyint(1) NOT NULL DEFAULT '0',
`category_error` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`log_code`),
KEY `IX_overview_code` (`category_overview`,`log_code`),
KEY `IX_error_code` (`category_error`,`log_code`)
) ENGINE=MyISAM ;
The following query (matching 10k of the 20k rows) executes in 0.0034 seconds (with LIMIT 0,20):
SELECT log_entries.date, log_codes.log_desc FROM log_entries
INNER JOIN log_codes ON log_codes.log_code = log_entries.log_code
WHERE log_entries.partner_id = 1 AND log_codes.category_overview = 1;
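But when I add ORDER BY log_entries.logentry_id DESC, which is of course necessary, it slows down to 0.6 seconds. Presumably because of the "Using temporary" on the log_codes table? Dropping the indexes actually makes the query run faster, but it is still slow (0.3 seconds).
EXPLAIN output without ORDER BY: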
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+-------------+
| id | select_type | table       | type | possible_keys              | key              | key_len | ref                      | rows | Extra       |
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+-------------+
|  1 | SIMPLE      | log_codes   | ref  | PRIMARY,IX_overview_code   | IX_overview_code | 1       | const                    |   56 |             |
|  1 | SIMPLE      | log_entries | ref  | IX_code,IX_partner_code    | IX_partner_code  | 7       | const,log_codes.log_code |   25 | Using where |
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+-------------+
With ORDER BY:
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+---------------------------------+
| id | select_type | table       | type | possible_keys              | key              | key_len | ref                      | rows | Extra                           |
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+---------------------------------+
|  1 | SIMPLE      | log_codes   | ref  | PRIMARY,IX_overview_code   | IX_overview_code | 1       | const                    |   56 | Using temporary; Using filesort |
|  1 | SIMPLE      | log_entries | ref  | IX_code,IX_partner_code    | IX_partner_code  | 7       | const,log_codes.log_code |   25 | Using where                     |
+----+-------------+-------------+------+----------------------------+------------------+---------+--------------------------+------+---------------------------------+
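Any hints on how to make this query run faster? I don't understand why "Using temporary" is needed, since the log codes should be selected before the matching log entries are fetched and sorted?
Update for @Eugen Rieck: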
SELECT log_entries.date, lc.log_desc FROM log_entries
INNER JOIN (SELECT log_desc, log_code FROM log_codes WHERE category_overview = 1) AS lc
ON lc.log_code = log_entries.log_code
WHERE log_entries.partner_id = 1
ORDER BY log_entries.logentry_id;
+----+-------------+-------------+------+-------------------------+------------------+---------+-------------------+------+---------------------------------+
| id | select_type | table       | type | possible_keys           | key              | key_len | ref               | rows | Extra                           |
+----+-------------+-------------+------+-------------------------+------------------+---------+-------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>  | ALL  | NULL                    | NULL             | NULL    | NULL              |   57 | Using temporary; Using filesort |
|  1 | PRIMARY     | log_entries | ref  | IX_code,IX_partner_code | IX_partner_code  | 7       | const,lc.log_code |   25 | Using where                     |
|  2 | DERIVED     | log_codes   | ref  | IX_overview_code        | IX_overview_code | 1       |                   |   56 |                                 |
+----+-------------+-------------+------+-------------------------+------------------+---------+-------------------+------+---------------------------------+
Update for @RolandoMySQLDBA:
With my original indexes, ORDER BY date DESC:
SELECT log_entries.date, log_codes.log_desc FROM
(SELECT log_code,date FROM log_entries WHERE partner_id = 1) log_entries
INNER JOIN
(SELECT log_code,log_desc FROM log_codes WHERE category_overview = 1) log_codes
USING (log_code)
ORDER BY log_entries.date DESC;
+----+-------------+-------------+------+------------------+------------------+---------+------+-------+---------------------------------+
| id | select_type | table       | type | possible_keys    | key              | key_len | ref  | rows  | Extra                           |
+----+-------------+-------------+------+------------------+------------------+---------+------+-------+---------------------------------+
|  1 | PRIMARY     | <derived3>  | ALL  | NULL             | NULL             | NULL    | NULL |    57 | Using temporary; Using filesort |
|  1 | PRIMARY     | <derived2>  | ALL  | NULL             | NULL             | NULL    | NULL | 21937 | Using where; Using join buffer  |
|  3 | DERIVED     | log_codes   | ref  | IX_overview_code | IX_overview_code | 1       |      |    56 |                                 |
|  2 | DERIVED     | log_entries | ALL  | IX_partner_code  | NULL             | NULL    | NULL | 22787 | Using where                     |
+----+-------------+-------------+------+------------------+------------------+---------+------+-------+---------------------------------+
With the new indexes, no sorting:
SELECT log_entries.date, log_codes.log_desc FROM
(SELECT log_code,date FROM log_entries WHERE partner_id = 1) log_entries
INNER JOIN
(SELECT log_code,log_desc FROM log_codes WHERE category_overview = 1) log_codes
USING (log_code);
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+--------------------------------+
| id | select_type | table       | type  | possible_keys         | key                   | key_len | ref  | rows  | Extra                          |
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+--------------------------------+
|  1 | PRIMARY     | <derived3>  | ALL   | NULL                  | NULL                  | NULL    | NULL |    57 |                                |
|  1 | PRIMARY     | <derived2>  | ALL   | NULL                  | NULL                  | NULL    | NULL | 21937 | Using where; Using join buffer |
|  3 | DERIVED     | log_codes   | index | IX_overview_code_desc | IX_overview_code_desc | 771     | NULL |    80 | Using where; Using index       |
|  2 | DERIVED     | log_entries | index | IX_partner_code_date  | IX_partner_code_date  | 15      | NULL | 22787 | Using where; Using index       |
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+--------------------------------+
With the new indexes, ORDER BY date DESC:
SELECT log_entries.date, log_codes.log_desc FROM
(SELECT log_code,date FROM log_entries WHERE partner_id = 1) log_entries
INNER JOIN
(SELECT log_code,log_desc FROM log_codes WHERE category_overview = 1) log_codes
USING (log_code)
ORDER BY log_entries.date DESC;
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+---------------------------------+
| id | select_type | table       | type  | possible_keys         | key                   | key_len | ref  | rows  | Extra                           |
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+---------------------------------+
|  1 | PRIMARY     | <derived3>  | ALL   | NULL                  | NULL                  | NULL    | NULL |    57 | Using temporary; Using filesort |
|  1 | PRIMARY     | <derived2>  | ALL   | NULL                  | NULL                  | NULL    | NULL | 21937 | Using where; Using join buffer  |
|  3 | DERIVED     | log_codes   | index | IX_overview_code_desc | IX_overview_code_desc | 771     | NULL |    80 | Using where; Using index        |
|  2 | DERIVED     | log_entries | index | IX_partner_code_date  | IX_partner_code_date  | 15      | NULL | 22787 | Using where; Using index        |
+----+-------------+-------------+-------+-----------------------+-----------------------+---------+------+-------+---------------------------------+
Update for @Joe Stefanelli:
SELECT log_entries.date, log_codes.log_desc FROM log_entries
INNER JOIN log_codes ON log_codes.log_code = log_entries.log_code
WHERE log_entries.partner_id = 1 AND log_codes.category_overview = 1
ORDER BY date DESC;
+----+-------------+-------------+------+--------------------------+-----------------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table       | type | possible_keys            | key             | key_len | ref                      | rows | Extra                                        |
+----+-------------+-------------+------+--------------------------+-----------------+---------+--------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | log_codes   | ALL  | PRIMARY,IX_code_overview | NULL            | NULL    | NULL                     |   80 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | log_entries | ref  | IX_code,IX_code_partner  | IX_code_partner | 7       | log_codes.log_code,const |   25 | Using where                                  |
+----+-------------+-------------+------+--------------------------+-----------------+---------+--------------------------+------+----------------------------------------------+
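I think most of the problems here, and in similar questions, come from a misunderstanding of how MySQL (and other databases) use indexes for sorting. The answer is: MySQL does not use an index to sort; it can only read data in the order of an index or in the reverse direction. If you happen to want the data ordered the way the currently used index is ordered, you are in luck; otherwise the result has to be sorted (hence the filesort in EXPLAIN).
That said, the order of the whole result depends mostly on which table comes first in the join. If you look at your EXPLAIN, you will see that the join starts with the "log_codes" table (because it is much smaller).
Basically, what you need is a composite index on "log_entries" (partner_id, date), a covering composite index on "log_codes" (log_code, category_overview, log_desc), changing the INNER JOIN to a STRAIGHT_JOIN to force the join order, and ordering by "date" DESC (luckily, that index will be covering too).
UPD1: Sorry, I mistyped the index for the first table: it should be (partner_id, log_code, date).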
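As a rough sketch of the index changes described above, using the corrected column list from UPD1, the statements might look like this. The name IX_code_overview_desc is made up for illustration; IX_partner_code_date is the name used with FORCE INDEX in UPD2. Try it on a copy of the tables first.

```sql
-- Sketch only: composite index on log_entries with the columns from UPD1.
-- The name matches the one referenced by FORCE INDEX in UPD2.
ALTER TABLE log_entries
  ADD INDEX IX_partner_code_date (partner_id, log_code, `date`);

-- Covering index on log_codes: the join column, the filter column and the
-- selected column are all part of the index. The index name is hypothetical.
ALTER TABLE log_codes
  ADD INDEX IX_code_overview_desc (log_code, category_overview, log_desc);
```

Once these exist, the older IX_partner_code index is a left prefix of IX_partner_code_date and could probably be dropped, but check that against your other queries before doing so.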
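But I still have a hard time understanding why MySQL chooses "Using temporary" on the log_codes table (and a 100x query time) when I try to sort on a column from the other table?
MySQL can either output the data directly, provided you accept the order in which it fetches the data, or put the data into a temporary table, sort it, and then output it. When you order by a field from any table other than the first one in the join, MySQL has to sort the data (rather than just output it in index order), and to sort the data it needs a temporary table.
But it gets slower the further I go into the data set (6 seconds for LIMIT 50000,25). Do you know why?
To output rows 50000,25, MySQL still has to fetch the first 50000 rows and skip them. Because I left a column out of the index, MySQL not only scanned the index but also made an extra disk seek for each entry to fetch the log_code value. With a covering index it should be faster, since all the data can be read from the index.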
UPD2: Try forcing the index:
SELECT log_entries.date, log_codes.log_desc
FROM log_entries FORCE INDEX (IX_partner_code_date)
STRAIGHT_JOIN log_codes
ON log_codes.log_code = log_entries.log_code
WHERE log_entries.partner_id = 1
AND log_codes.category_overview = 1
ORDER BY log_entries.date DESC;
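To see what the forced plan actually does, one option is to prefix the same query with EXPLAIN, here with the deep offset from the comment above. This is only a sketch, and it assumes the IX_partner_code_date index has already been created.

```sql
-- Assumes IX_partner_code_date (partner_id, log_code, date) already exists.
-- EXPLAIN shows the plan without running the query itself.
EXPLAIN
SELECT log_entries.date, log_codes.log_desc
FROM log_entries FORCE INDEX (IX_partner_code_date)
STRAIGHT_JOIN log_codes
ON log_codes.log_code = log_entries.log_code
WHERE log_entries.partner_id = 1
AND log_codes.category_overview = 1
ORDER BY log_entries.date DESC
LIMIT 50000, 25;
```

In the output, log_entries should now be listed first. "Using index" in the Extra column means the rows are read from the index alone. If "Using temporary" or "Using filesort" still appears, MySQL is still sorting, which suggests the column order of the index does not match the ORDER BY.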