MySQL subquery and temporary table are slow

amq*_*amq 5 mysql subquery

I want to optimize the following query:

SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics t, bb_posters ps
WHERE t.topic_id = ps.topic_id
AND forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 20

Query took 0.1475 sec

So, at first I replaced the WHERE IN with an INNER JOIN on a subquery:

SELECT SQL_NO_CACHE t.topic_id
FROM ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) ) t
INNER JOIN bb_posters ps ON t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20

Query took 0.1541 sec

Then I tried creating a temporary table:

CREATE TEMPORARY TABLE IF NOT EXISTS bb_topics_tmp ( INDEX(topic_id) )
ENGINE=MEMORY
AS ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) );

SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics_tmp t, bb_posters ps
WHERE t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20

Query took 0.1467 sec

I don't understand why selecting from the full table with 38,522 rows is so much faster than selecting from the temporary table with 9,943 rows:

SELECT SQL_NO_CACHE t.topic_id
FROM bb_topics t, bb_posters ps
WHERE t.topic_id = ps.topic_id
ORDER BY ps.timestamp desc
LIMIT 20

Query took 0.0006 sec

Both topic_id and timestamp are indexed.

Interestingly, even adding a condition like this is much faster than filtering by the forum list:

AND pt.post_text LIKE '%searchterm%'

UPD:

Here is the EXPLAIN output:

SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
a.attach_id, pt.bbcode_uid, pt.post_text
FROM bb_topics t
JOIN bb_posters ps ON ps.topic_id = t.topic_id
LEFT JOIN bb_users u ON u.user_id = t.topic_poster
LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 1, 20

id  select_type  table  type    possible_keys     key       key_len  ref                rows  Extra
1   SIMPLE       t      range   PRIMARY,forum_id  forum_id  2        NULL               8379  Using where; Using temporary; Using filesort
1   SIMPLE       ps     eq_ref  topic_id          topic_id  3        DB.t.topic_id      1
1   SIMPLE       u      eq_ref  PRIMARY           PRIMARY   3        DB.t.topic_poster  1     Using index
1   SIMPLE       pt     eq_ref  PRIMARY           PRIMARY   3        DB.bt.post_id      1     Using index
1   SIMPLE       a      eq_ref  PRIMARY           PRIMARY   3        DB.bt.attach_id    1     Using index

Query took 0.8527 sec

The same query without the WHERE t.forum_id IN clause:

id  select_type  table  type    possible_keys  key        key_len  ref                rows  Extra
1   SIMPLE       ps     index   topic_id       timestamp  4        NULL               21
1   SIMPLE       t      eq_ref  PRIMARY        PRIMARY    3        DB.bt.topic_id     1
1   SIMPLE       u      eq_ref  PRIMARY        PRIMARY    3        DB.t.topic_poster  1
1   SIMPLE       pt     eq_ref  PRIMARY        PRIMARY    3        DB.bt.post_id      1
1   SIMPLE       a      eq_ref  PRIMARY        PRIMARY    3        DB.bt.attach_id    1

Query took 0.0022 sec

UPD 2:

Adding USE INDEX (timestamp) solved the problem:

SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
a.attach_id, pt.bbcode_uid, pt.post_text
FROM bb_topics t
JOIN bb_posters ps USE INDEX (timestamp) ON ps.topic_id = t.topic_id
LEFT JOIN bb_users u ON u.user_id = t.topic_poster
LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
ORDER BY ps.timestamp desc
LIMIT 1, 20

Query took 0.0023 sec
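
One way to confirm that the hint really changes the plan, rather than relying on timing alone, is to run EXPLAIN on the hinted query. The sketch below uses the simplified two-table query from the top of the question; that bb_posters would then be read first through the timestamp index is an assumption based on the fast plan shown earlier, not a measured result.

```sql
-- Sketch: inspect the plan with the index hint in place.
-- Assumption: the index on bb_posters.timestamp is named `timestamp`,
-- as the "key" column of the EXPLAIN output above suggests.
EXPLAIN
SELECT t.topic_id
  FROM bb_topics t
  JOIN bb_posters ps USE INDEX (timestamp) ON ps.topic_id = t.topic_id
 WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
 ORDER BY ps.timestamp DESC
 LIMIT 20;
```

If the Extra column no longer shows "Using temporary; Using filesort", the sort is being satisfied by the index order.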

O. *_*nes 3

These aren't very difficult queries. You're doing the right thing by using SQL_NO_CACHE and timing them. But you also need to look at the EXPLAIN results.
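
As a minimal sketch of that step, prefixing the query from the question with EXPLAIN returns the execution plan instead of the result rows:

```sql
-- Sketch: show the plan MySQL chooses for the original query.
-- SQL_NO_CACHE is left out because it does not affect the plan.
EXPLAIN
SELECT t.topic_id
  FROM bb_topics t, bb_posters ps
 WHERE t.topic_id = ps.topic_id
   AND t.forum_id IN (2, 6, 7, 10, 15, 20)
 ORDER BY ps.timestamp DESC
 LIMIT 20;
```

In the output, the columns worth reading first are usually key (which index was picked), rows (the estimated number of rows examined), and Extra, where "Using temporary; Using filesort" signals that the result is sorted after the join rather than read in index order.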

Use JOIN syntax rather than a comma-separated list of tables. The queries should be equivalent, but the old-style syntax is harder to read.

SELECT SQL_NO_CACHE 
       t.topic_id
  FROM bb_topics  AS t
  JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
 WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
 ORDER BY ps.timestamp desc
 LIMIT 20

Try some compound (multi-column) covering indexes to take your performance to the next level.

You need to sort the bb_posters table by timestamp, and you also need its topic_id. So try this index: (timestamp, topic_id). If you can use a clause like

    WHERE ps.timestamp >= DATE(NOW()) - INTERVAL 7 DAY

to limit the time range of the search, that will help performance even more.
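
A sketch of how this could look in practice. The index name idx_timestamp_topic is hypothetical, and the date arithmetic assumes ps.timestamp is a DATETIME or TIMESTAMP column, which the question does not confirm.

```sql
-- Hypothetical index name; column order (timestamp, topic_id) as suggested above.
ALTER TABLE bb_posters
  ADD INDEX idx_timestamp_topic (`timestamp`, topic_id);

-- Time-bounded variant of the query.
-- Assumption: ps.timestamp is DATETIME/TIMESTAMP. If it stores a Unix epoch
-- integer, compare against UNIX_TIMESTAMP(NOW() - INTERVAL 7 DAY) instead.
SELECT SQL_NO_CACHE
       t.topic_id
  FROM bb_topics  AS t
  JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
 WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
   AND ps.timestamp >= DATE(NOW()) - INTERVAL 7 DAY
 ORDER BY ps.timestamp DESC
 LIMIT 20;
```

The range condition only returns the same rows as the original query when at least 20 matching posts fall inside the window, so the interval has to be chosen with the forum's activity in mind.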

You need topic_id and forum_id from the bb_topics table. So try this index: (topic_id, forum_id)
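
A minimal sketch of that index, followed by a plan check. The index name idx_topic_forum is hypothetical, and whether the optimizer actually prefers it over the existing PRIMARY or forum_id keys is something only EXPLAIN on the real data can show.

```sql
-- Hypothetical index name; columns as suggested above.
ALTER TABLE bb_topics
  ADD INDEX idx_topic_forum (topic_id, forum_id);

-- Re-check the plan afterwards. "Using index" in the Extra column for t
-- would indicate that the index covers the columns the query reads.
EXPLAIN
SELECT t.topic_id
  FROM bb_topics  AS t
  JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
 WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
 ORDER BY ps.timestamp DESC
 LIMIT 20;
```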

You can use similar compound covering indexes on the other tables you are trying to join.

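If your tables are well indexed, queries against them should be just as efficient as queries against temporary tables. Creating temporary tables tends to put extra load on the server, for example by flushing cached table data out of RAM, which can have unexpected negative effects on performance.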