How to improve this query based on the EXPLAIN output

Bas*_*sel 7 php mysql sql performance

I have the following query:

SELECT DISTINCT f1.match_static_id,
                f2.comments_no,
                f2.maxtimestamp,
                users.username,
                users.id,
                matches_of_comments.localteam_name,
                matches_of_comments.visitorteam_name,
                matches_of_comments.localteam_goals,       
                matches_of_comments.visitorteam_goals,
                matches_of_comments.match_status,
                new_iddaa.iddaa_code
FROM comments AS f1
INNER JOIN (
             SELECT match_static_id,
                    MAX( TIMESTAMP ) maxtimestamp,
                    COUNT( match_static_id ) AS comments_no
             FROM comments
             GROUP BY match_static_id
          ) AS f2 ON f1.match_static_id = f2.match_static_id 
                  AND f1.timestamp = f2.maxtimestamp
INNER JOIN users ON users.id = f1.user_id
INNER JOIN matches_of_comments ON matches_of_comments.match_id = f2.match_static_id
LEFT JOIN new_iddaa ON new_iddaa.match_id = matches_of_comments.match_id
WHERE matches_of_comments.flag =1
ORDER BY f2.maxtimestamp DESC

Here is the EXPLAIN plan for this query:

+----+-------------+---------------------+--------+-----------------------------------+-----------+---------+------------------------------------------+-------+------------------------------------------------+
| id | select_type |        table        |  type  |           possible_keys           |    key    | key_len |                   ref                    | rows  |                     extra                      |
+----+-------------+---------------------+--------+-----------------------------------+-----------+---------+------------------------------------------+-------+------------------------------------------------+
|  1 | PRIMARY     | <derived2>          | ALL    | NULL                              | NULL      | NULL    | NULL                                     |   542 | Using temporary; Using filesort                |
|  1 | PRIMARY     | f1                  | ref    | timestamp,match_static_id,user_id | timestamp | 4       | f2.maxtimestamp                          |     1 | Using where                                    |
|  1 | PRIMARY     | users               | eq_ref | PRIMARY                           | PRIMARY   | 4       | skormix_db1.f1.user_id                   |     1 |                                                |
|  1 | PRIMARY     | matches_of_comments | ALL    | match_id                          | NULL      | NULL    | NULL                                     | 20873 | Range checked for each record (index map: 0x8) |
|  1 | PRIMARY     | new_iddaa           | ref    | match_id                          | match_id  | 4       | skormix_db1.matches_of_comments.match_id |     1 |                                                |
|  2 | DERIVED     | comments            | ALL    | NULL                              | NULL      | NULL    | NULL                                     |   933 | Using temporary; Using filesort                |
+----+-------------+---------------------+--------+-----------------------------------+-----------+---------+------------------------------------------+-------+------------------------------------------------+

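I use this query to fetch match information when the match has at least one comment.
I get the team names, the code (iddaa code), the number of comments, the timestamp of the last comment, and the author of the last comment.
I have a large database that is expected to grow over the next few months. I am very new to MySQL queries and want to make sure I use optimized queries from the start, so I would like to know how to read this EXPLAIN output in order to make the query better and faster.

I see several places where the tables do not use indexes even though I created them.
I also see <derived2> in the table column, and I don't know how to make this query faster or how to get rid of the filesort, since I can't create an index on the derived query.

Below are the structures of the tables used in the query, with their indexes (keys). I hope to get some hints or simple answers. Thanks in advance.

The comments (f1) table structure is: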

CREATE TABLE `comments` (
 `id` int(25) NOT NULL AUTO_INCREMENT,
 `comments` text COLLATE utf8_unicode_ci NOT NULL,
 `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 `date` date NOT NULL,
 `time` time NOT NULL,
 `match_static_id` int(25) NOT NULL,
 `ip` varchar(255) CHARACTER SET latin1 NOT NULL,
 `comments_yes_or_no` int(25) NOT NULL,
 `user_id` int(25) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `timestamp` (`timestamp`),
 KEY `match_static_id` (`match_static_id`),
 KEY `user_id` (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=935 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The users table structure is:

CREATE TABLE `users` (
 `id` int(25) NOT NULL AUTO_INCREMENT,
 `username` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `password` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `gender` int(25) NOT NULL,
 `first_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `last_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `avatar` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `alert` int(25) NOT NULL,
 `daily_tahmin` int(25) NOT NULL,
 `monthly_tahmin` int(25) NOT NULL,
 `admin` int(25) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=995 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The matches_of_comments table structure is:

CREATE TABLE `matches_of_comments` (
 `id` int(25) NOT NULL AUTO_INCREMENT,
 `en_tournament_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `tournament_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `country_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `match_status` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `match_time` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `match_date` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `static_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `fix_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `match_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `localteam_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `localteam_goals` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `localteam_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `visitorteam_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `visitorteam_goals` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `visitorteam_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `ht_score` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
 `flag` int(25) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `match_status` (`match_status`),
 KEY `match_date` (`match_date`),
 KEY `match_id` (`match_id`),
 KEY `localteam_id` (`localteam_id`),
 KEY `visitorteam_id` (`visitorteam_id`),
 KEY `flag` (`flag`)
) ENGINE=MyISAM AUTO_INCREMENT=237790 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The new_iddaa table structure is:

CREATE TABLE `new_iddaa` (
 `id` int(25) NOT NULL AUTO_INCREMENT,
 `match_id` int(25) NOT NULL,
 `iddaa_code` int(25) NOT NULL,
 `tv_channel` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
 `skormix_tahmin` varchar(255) CHARACTER SET utf8 NOT NULL,
 PRIMARY KEY (`id`),
 KEY `match_id` (`match_id`)
) ENGINE=MyISAM AUTO_INCREMENT=8191 DEFAULT CHARSET=latin1

Den*_*rdy 1

Let's start with the more pressing problems before discussing options.

The first immediate problem is:

SELECT DISTINCT …

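A select distinct is slow. Very, very slow: it basically compares every field of every row your set returns. There is naturally room for optimization when the set contains an ID that is guaranteed to be unique per row, but your query doesn't appear to offer any such possibility: the best candidate is a tuple drawn from matches_of_comments and new_iddaa.

To fix this, split the query into two or more parts and fetch only what you actually need for what you are doing. That seems to be ordering matches_of_comments by the date of their latest comment, then fetching the extra cosmetic data from users and new_iddaa.

The next one is the biggest problem, imho: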

INNER JOIN (
         SELECT match_static_id,
                MAX( TIMESTAMP ) maxtimestamp,
                COUNT( match_static_id ) AS comments_no
         FROM comments
         GROUP BY match_static_id
      ) AS f2 ON f1.match_static_id = f2.match_static_id
              AND f1.timestamp = f2.maxtimestamp

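You are joining an aggregate to a table on the tuple (match_static_id, timestamp), which has no composite index, and pulling a huge set out of it. The EXPLAIN row for f1 shows the consequence: the lookup goes through the single-column timestamp key and the remaining condition is applied afterwards (Using where), which is not what you want.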
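One way to give that join an index to work with is a composite key on comments. The following is a minimal sketch, not part of the original schema: the index name `match_static_id_timestamp` is made up for illustration, and whether the optimizer actually picks the new key should be confirmed by re-running EXPLAIN on your own data.

```sql
-- Hypothetical composite index covering both join columns.
-- The index name is an assumption; any unused name works.
ALTER TABLE comments
  ADD INDEX match_static_id_timestamp (match_static_id, `timestamp`);

-- Re-check the plan afterwards. Ideally the f1 row now reports the
-- composite key in the "key" column instead of the timestamp key.
EXPLAIN
SELECT f1.match_static_id, f1.user_id
FROM comments AS f1
INNER JOIN (
    SELECT match_static_id, MAX(`timestamp`) AS maxtimestamp
    FROM comments
    GROUP BY match_static_id
) AS f2 ON f1.match_static_id = f2.match_static_id
       AND f1.`timestamp` = f2.maxtimestamp;
```

With the composite key in place, the existing single-column `match_static_id` key is a prefix of it and is likely redundant.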


The last eye-popping problem is:

ORDER BY f2.maxtimestamp DESC

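First, you have no limit there. This means you build, sort, and return a huge set. Surely you are paginating this data, so do it in the query by adding a limit clause.

Once you have added the limit, consider what adds extra rows and how those rows should be ordered. Going by your schema, I would guess new_iddaa does. Are you paginating in such a way that this extra information needs to be part of that query and of the number of rows it returns? I would guess not, since you evidently don't care how those rows are sorted.

Scanning your schema, one more thing stands out: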

`match_id` varchar(255)

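The rows that reference it are integers, right? So it should be an integer too, both to avoid the overhead of casting varchar to int (or the other way around) and to allow indexes to be used in either direction.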
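A cautious way to make that change is to look for non-numeric values first and convert only if there are none. This is a sketch under the assumption that every match_id is a plain non-negative number that fits in a signed INT; it has not been run against the poster's data.

```sql
-- List any values that are not purely numeric (this also catches
-- empty strings) before touching the column type.
SELECT match_id
FROM matches_of_comments
WHERE match_id NOT REGEXP '^[0-9]+$';

-- If the check returns no rows, change the column type.
-- The existing KEY `match_id` stays in place after MODIFY.
ALTER TABLE matches_of_comments
  MODIFY `match_id` INT NOT NULL;
```

After the change, the join against the integer columns comments.match_static_id and new_iddaa.match_id compares like with like.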


Although unrelated to this particular query, the following fields also need attention and a proper type:

`tournament_id` varchar(255)
`match_time` varchar(255)
`match_date` varchar(255)
`static_id` varchar(255)
`fix_id` varchar(255)
`localteam_id` varchar(255)
`visitorteam_id` varchar(255)

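Improving the query…

As I read it, you are ordering matches_of_comments by latest comment. You also want the number of comments, so we start with that. Assuming you are paginating the first 10 of many, the query looks like this: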

SELECT match_static_id,
       MAX( TIMESTAMP ) maxtimestamp,
       COUNT( match_static_id ) AS comments_no
FROM comments
GROUP BY match_static_id
ORDER BY maxtimestamp DESC
LIMIT 10 OFFSET 0

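That's all.

It gives you 10 IDs, or more if you raise the limit. Loop through them in your application and build an in (…) clause that lets you fetch each separate piece of data from the other tables as needed; you can do this with one query or several, it doesn't matter much. The point is to avoid joining on that aggregate, so that indexes are available to the follow-up queries.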
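The follow-up queries might look like the sketch below. The IDs and timestamps are placeholder values standing in for whatever the aggregate query returned; the application would substitute the real ones, properly escaped or bound as parameters.

```sql
-- Match details for the IDs returned by the aggregate query.
-- '101', '102', '103' are placeholder IDs, quoted because match_id
-- is currently a varchar column.
SELECT match_id, localteam_name, visitorteam_name,
       localteam_goals, visitorteam_goals, match_status
FROM matches_of_comments
WHERE flag = 1
  AND match_id IN ('101', '102', '103');

-- Author of the latest comment for each of those matches, using the
-- (match_static_id, maxtimestamp) pairs from the aggregate query.
-- The timestamps are placeholders as well.
SELECT c.match_static_id, u.id, u.username
FROM comments AS c
INNER JOIN users AS u ON u.id = c.user_id
WHERE (c.match_static_id = 101 AND c.`timestamp` = '2013-05-01 20:15:00')
   OR (c.match_static_id = 102 AND c.`timestamp` = '2013-05-01 19:40:00')
   OR (c.match_static_id = 103 AND c.`timestamp` = '2013-05-01 18:05:00');
```

Because the flag = 1 filter is applied only in the follow-up query, a page can come back with fewer than 10 matches.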

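
You can improve things even more dramatically by removing the above query altogether.

To do so, add three fields to matches_of_comments, namely last_comment_timestamp, last_comment_user_id, and num_comments. Maintain them with triggers, and add an index on (flag, last_comment_timestamp). This will allow you to run the following efficient query: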

SELECT matches_of_comments.static_id,
       matches_of_comments.num_comments,
       matches_of_comments.last_comment_timestamp,
       matches_of_comments.last_comment_user_id,
       matches_of_comments.localteam_name,
       matches_of_comments.visitorteam_name,
       matches_of_comments.localteam_goals,
       matches_of_comments.visitorteam_goals,
       matches_of_comments.match_status
FROM matches_of_comments
WHERE matches_of_comments.flag = 1
ORDER BY matches_of_comments.last_comment_timestamp DESC
LIMIT 10 OFFSET 0
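The schema change behind that query could be sketched roughly as follows. The column types, the index name `flag_last_comment`, and the trigger name `comments_after_insert` are assumptions, and the trigger matches on match_id because that is the column the original query joins on. It handles inserts only; deleted or edited comments would need their own triggers, and rows that already exist would need a one-time backfill.

```sql
-- Three denormalized columns plus the index used for filtering and sorting.
-- Types and the index name are assumptions, not taken from the question.
ALTER TABLE matches_of_comments
  ADD COLUMN last_comment_timestamp DATETIME NULL DEFAULT NULL,
  ADD COLUMN last_comment_user_id INT NULL DEFAULT NULL,
  ADD COLUMN num_comments INT NOT NULL DEFAULT 0,
  ADD INDEX flag_last_comment (flag, last_comment_timestamp);

-- Keep the columns current whenever a comment is inserted.
-- A single-statement trigger body needs no BEGIN ... END or DELIMITER change.
CREATE TRIGGER comments_after_insert
AFTER INSERT ON comments
FOR EACH ROW
  UPDATE matches_of_comments
  SET last_comment_timestamp = NEW.`timestamp`,
      last_comment_user_id   = NEW.user_id,
      num_comments           = num_comments + 1
  WHERE match_id = NEW.match_static_id;
```

The trigger's WHERE clause compares the varchar match_id with an integer, so it benefits from the same type fix as the joins.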

Then you only need to select the required data from users and new_iddaa, using separate queries with an in (…) clause as discussed above.
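Those two lookups might be written as below. The numeric literals are placeholders for the last_comment_user_id values and match IDs found on the current page.

```sql
-- 12, 34, 56 stand in for the last_comment_user_id values of the page.
SELECT id, username
FROM users
WHERE id IN (12, 34, 56);

-- 101, 102, 103 stand in for the match IDs of the page.
SELECT match_id, iddaa_code
FROM new_iddaa
WHERE match_id IN (101, 102, 103);
```

Both queries can use existing keys (the primary key on users and the match_id key on new_iddaa). Since the original query used a LEFT JOIN, the application should expect some matches to have no new_iddaa row at all.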
