Den*_* G. 37 mysql performance select
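I have a table with about 100,000 blog postings, linked via a 1:n relationship to a table with 50 feeds. When I query both tables with a SELECT statement ordered by a datetime field of the postings table, MySQL always uses filesort, which makes the query very slow (> 1 second). Here is the schema of the postings table (simplified):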
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| feed_id | int(11) | NO | MUL | NULL | |
| crawl_date | datetime | NO | | NULL | |
| is_active | tinyint(1) | NO | MUL | 0 | |
| link | varchar(255) | NO | MUL | NULL | |
| author | varchar(255) | NO | | NULL | |
| title | varchar(255) | NO | | NULL | |
| excerpt | text | NO | | NULL | |
| long_excerpt | text | NO | | NULL | |
| user_offtopic_count | int(11) | NO | MUL | 0 | |
+---------------------+--------------+------+-----+---------+----------------+
Here is the feeds table:
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| type | int(11) | NO | MUL | 0 | |
| title | varchar(255) | NO | | NULL | |
| website | varchar(255) | NO | | NULL | |
| url | varchar(255) | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
This is the query that takes > 1 second to execute. Note that the post_date field has an index, but MySQL does not use it to sort the postings table:
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
(`postings`)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15
The result of running explain extended on this query shows that MySQL is using filesort:
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
| 1 | SIMPLE | postings | ref | feed_id,is_active,user_offtopic_count | is_active | 1 | const | 30996 | Using where; Using filesort |
| 1 | SIMPLE | feeds | eq_ref | PRIMARY,type | PRIMARY | 4 | feedian.postings.feed_id | 1 | Using where |
+----+-------------+----------+--------+---------------------------------------+-----------+---------+--------------------------+-------+-----------------------------+
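When I remove the order by part, MySQL stops using filesort. Please let me know if you have any idea how to optimize this query so that MySQL sorts and selects the data using indexes. I have already tried a few things, such as creating a combined index on all where/order by fields, as suggested by some blog posts, but that did not work either.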
Qua*_*noi 39
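Create a composite index on postings (is_active, post_date) (in that order).
It will be used both for filtering on is_active and for ordering by post_date.
MySQL should show the REF access method over this index in EXPLAIN EXTENDED.
Note that you have a RANGE filtering condition on user_offtopic_count, which is why you cannot use an index on this field for both filtering and ordering by another field.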
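The question mentions trying a combined index over all the where/order by fields without success. The sketch below illustrates why that is expected, assuming MySQL and the postings table from the question; the index name ix_active_offtopic_date is hypothetical and the query is trimmed to the postings side only.

```sql
-- Hypothetical index: equality column, then the range column, then the sort column.
CREATE INDEX ix_active_offtopic_date
    ON postings (is_active, user_offtopic_count, post_date);

EXPLAIN
SELECT id
FROM postings FORCE INDEX (ix_active_offtopic_date)
WHERE is_active = 1
  AND user_offtopic_count < 10
ORDER BY post_date DESC
LIMIT 15;

-- Within is_active = 1 the index entries are ordered by user_offtopic_count
-- first, so post_date is only ordered inside each single count value.
-- A range spanning several count values is therefore not in post_date order,
-- and the Extra column is expected to still report a filesort.

-- Clean up the illustration index afterwards.
DROP INDEX ix_active_offtopic_date ON postings;
```

If the condition were an equality (for example user_offtopic_count = 0), the same index could serve both the filter and the ordering.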
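Depending on how selective your condition on user_offtopic_count is (i.e. how many rows satisfy user_offtopic_count < 10), it may be more useful to create an index on user_offtopic_count and let the post_dates be sorted.
To do this, create a composite index on postings (is_active, user_offtopic_count) and make sure the RANGE access method over this index is used.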
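One rough way to gauge that selectivity before creating anything is to count how many active rows pass the range condition. This is only a sketch against the postings table from the question; the column aliases are made up for readability.

```sql
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT
    COUNT(*)                                 AS active_rows,
    SUM(user_offtopic_count < 10)            AS active_rows_below_10,
    SUM(user_offtopic_count < 10) / COUNT(*) AS fraction_kept
FROM postings
WHERE is_active = 1;
```

If fraction_kept is close to 1, the range condition filters out very little, so reading rows in post_date order and stopping after 15 matches is likely cheap. If it is small, narrowing by user_offtopic_count first and sorting the few remaining rows may win. The join condition on feeds.type also affects how many rows must be read, so treat the number as a hint only.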
Which index will be faster depends on your data distribution. Create both indexes, FORCE them and see which one is faster:
CREATE INDEX ix_active_offtopic ON postings (is_active, user_offtopic_count);
CREATE INDEX ix_active_date ON postings (is_active, post_date);
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
`postings` FORCE INDEX (ix_active_offtopic)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15;
/* This should show RANGE access with few rows and keep the FILESORT */
SELECT
`postings`.`id`,
UNIX_TIMESTAMP(postings.post_date) as post_date,
`postings`.`link`,
`postings`.`title`,
`postings`.`author`,
`postings`.`excerpt`,
`postings`.`long_excerpt`,
`feeds`.`title` AS feed_title,
`feeds`.`website` AS feed_website
FROM
`postings` FORCE INDEX (ix_active_date)
JOIN
`feeds`
ON
`feeds`.`id` = `postings`.`feed_id`
WHERE
`feeds`.`type` = 1 AND
`postings`.`user_offtopic_count` < 10 AND
`postings`.`is_active` = 1
ORDER BY
`postings`.`post_date` desc
LIMIT
15;
/* This should show REF access with lots of rows and no FILESORT */
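To check the access method without running the full query, either statement can be prefixed with EXPLAIN. The sketch below uses a shortened select list for brevity and assumes the two index names created above; which index to drop at the end depends entirely on your own measurements.

```sql
-- Inspect the plan for the date-ordered index; look at the type, key and Extra columns.
EXPLAIN
SELECT postings.id
FROM postings FORCE INDEX (ix_active_date)
JOIN feeds
    ON feeds.id = postings.feed_id
WHERE feeds.type = 1
  AND postings.user_offtopic_count < 10
  AND postings.is_active = 1
ORDER BY postings.post_date DESC
LIMIT 15;

-- Once one index has clearly won, the other can be removed so that it does
-- not slow down writes. Example only: swap the name if ix_active_date loses.
DROP INDEX ix_active_offtopic ON postings;
```

After dropping the losing index, it may also be worth removing the FORCE INDEX hint and checking that the optimizer picks the remaining index on its own.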