Mat*_*ong 9 mysql query-optimization
We have noticed some inconsistencies in our MySQL performance for query times that we feel cannot be explained by just server load. Some queries seem to be much efficient than others, despite having a similar setup.
Edit: Since opening this question, our database crashed (for unknown reasons at this moment, RDS teams are investigating), and after a reboot the issue is no longer reproducible and the queries are the same speed. I still would like to know what was wrong because the issue may come back, but perhaps we will never find out...
Continue original question: To test this, I made a near-copy of a table. The performance (query time) against the copied table is dramatically worse than the source, despite the same execution path, same data, and very similar table schematics.
In this example, we have the following 3 tables:
offers_clicks
offers
offers_new
"offers_new" is a test table that has a limited number of field (only "id" and 'status" fields), but is otherwise having the same structure as "offers" table (which is also having "id" and "status" field, but also many other fields). Here are the table schematics:
Offers Table:
| offers | CREATE TABLE `offers` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`status` tinyint(4) NOT NULL DEFAULT '0',
`user_id` int(11) NOT NULL DEFAULT '0',
.....many_other_fields.....
KEY `ix_public_view_key` (`public_view_key`),
FULLTEXT KEY `name` (`name`,`internal_name`)
) ENGINE=InnoDB AUTO_INCREMENT=18425582 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Run Code Online (Sandbox Code Playgroud)
Offers New table:
| offers_new | CREATE TABLE `offers_new` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`status` tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18423831 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Run Code Online (Sandbox Code Playgroud)
We then did an INSERT SELECT from "offers" table to "offers_new" table:
INSERT INTO offers_new (id, status) SELECT id, status FROM offers;
After this, we started to run some test queries. As you can see, the query times to "offers_new" table is about 10 times slower than "offers" table.
mysql> SELECT COUNT(*) FROM offers_clicks, offers_new WHERE offers_new.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----------+
| COUNT(*) |
+----------+
| 15472 |
+----------+
1 row in set (26.04 sec)
mysql> SELECT COUNT(*) FROM offers_clicks, offers WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----------+
| COUNT(*) |
+----------+
| 15472 |
+----------+
1 row in set (2.90 sec)
mysql> SELECT COUNT(*) FROM offers_clicks, offers_new WHERE offers_new.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----------+
| COUNT(*) |
+----------+
| 15472 |
+----------+
1 row in set (28.07 sec)
mysql> SELECT COUNT(*) FROM offers_clicks, offers WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----------+
| COUNT(*) |
+----------+
| 15472 |
+----------+
1 row in set (2.26 sec)
Run Code Online (Sandbox Code Playgroud)
Please note that the execution path is exactly the same for both queries:
mysql> explain SELECT COUNT(*) FROM offers_clicks, offers_new WHERE offers_new.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----+-------------+---------------+-----------------------------------------------------------------------+--------+--------------------+---------+---------+--------------------------------------------+-------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+
| 1 | SIMPLE | offers_clicks | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19 | range | pts_id,date,date_2 | date | 5 | NULL | 15472 | 100.00 | Using index condition |
| 1 | SIMPLE | offers_new | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dejong_pointstoshop.offers_clicks.offer_id | 1 | 100.00 | Using index |
+----+
2 rows in set, 1 warning (0.00 sec)
mysql> explain SELECT COUNT(*) FROM offers_clicks, offers WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00';
+----+-------------+---------------+-----------------------------------------------------------------------+--------+--------------------+---------+---------+--------------------------------------------+-------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------+-----------------------------------------------------------------------+--------+--------------------+---------+---------+--------------------------------------------+-------+----------+-----------------------+
| 1 | SIMPLE | offers_clicks | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19 | range | pts_id,date,date_2 | date | 5 | NULL | 15472 | 100.00 | Using index condition |
| 1 | SIMPLE | offers | NULL | eq_ref | PRIMARY | PRIMARY | 4 | dejong_pointstoshop.offers_clicks.offer_id | 1 | 100.00 | Using index |
+----+-------------+---------------+-----------------------------------------------------------------------+--------+--------------------+---------+---------+--------------------------------------------+-------+----------+-----------------------+
2 rows in set, 1 warning (0.00 sec)
Run Code Online (Sandbox Code Playgroud)
Steps we have taken after this:
We are using MySQL 8.0.15
It is worth noting that our production system is under high CPU load. However, this should not explain different query times for queries that are pretty much identical. We ran it dozens of times now with he same results.
It is also worth noting that "offers" table is a production table, so it's being updated / queried against frequently. "offers_new", which is the slow table, is not.
Edits based on comments:
mysql> SELECT COUNT(*) FROM offers_new;
+----------+
| COUNT(*) |
+----------+
| 5093127 |
+----------+
1 row in set (0.13 sec)
mysql> SELECT COUNT(*) FROM offers;
+----------+
| COUNT(*) |
+----------+
| 5107742 |
+----------+
1 row in set (2.54 sec)
Run Code Online (Sandbox Code Playgroud)
The counts are very close. Offers is a live table, so since my tests yesterday, new entries have been added to the table. The counts are still close though. Interestingly, the COUNT(*) on "offers_new" is significantly faster than "offers", so the opposite from the query results we made this question about! Count on "offers_new" is about 0.1/0.2 seconds, count on "offers" table is a few seconds (ranges between 2-6s). I ran it about 10 times each to make sure it wasn't just server load, and it was consistently slower. I assume that this has something to do with the number of columns which is much higher for "offers" table. Either way, it's interesting to see the opposite results of the other query...
Here is the profiling:
mysql> SHOW PROFILES;
+----------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+---+
| 1 | 26.03997750 | SELECT COUNT(*) FROM offers_clicks, offers_new WHERE offers_new.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00' |
| 2 | 2.89890600 | SELECT COUNT(*) FROM offers_clicks, offers WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00' |
| 3 | 28.07228225 | SELECT COUNT(*) FROM offers_clicks, offers_new WHERE offers_new.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00' |
| 4 | 2.25160675 | SELECT COUNT(*) FROM offers_clicks, offers WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date > '2019-05-30 00:00:00' and offers_clicks.date < '2019-05-30 01:00:00' |
+----------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set, 1 warning (0.00 sec)
mysql> SHOW PROFILE ALL FOR QUERY 1;
+---+
| Status | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function | Source_file | Source_line |
+---+
| starting | 0.000364 | 0.016000 | 0.000000 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | NULL | NULL | NULL |
| Executing hook on transaction | 0.000134 | 0.008000 | 0.000000 | 6 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1100 |
| starting | 0.000128 | 0.008000 | 0.000000 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1102 |
| checking permissions | 0.000173 | 0.008000 | 0.000000 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | check_access | sql_authorization.cc | 1899 |
| checking permissions | 0.000182 | 0.012000 | 0.000000 | 65 | 33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | check_access | sql_authorization.cc | 1899 |
| Opening tables | 0.003432 | 0.176000 | 0.012000 | 556 | 224 | 0 | 0 | 0 | 0 | 0 | 20 | 0 | open_tables | sql_base.cc | 5586 |
| init | 0.000235 | 0.012000 | 0.000000 | 25 | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | execute | sql_select.cc | 555 |
| System lock | 0.000151 | 0.008000 | 0.000000 | 62 | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_lock_tables | lock.cc | 332 |
| optimizing | 0.000171 | 0.008000 | 0.000000 | 32 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 212 |
| statistics | 0.001255 | 0.068000 | 0.000000 | 230 | 125 | 0 | 24 | 0 | 0 | 0 | 0 | 0 | optimize | sql_optimizer.cc | 425 |
| preparing | 0.000166 | 0.012000 | 0.000000 | 43 | 19 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | optimize | sql_optimizer.cc | 499 |
| executing | 0.000134 | 0.004000 | 0.000000 | 27 | 16 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | exec | sql_executor.cc | 197 |
| Sending data | 26.032492 | 999.999999 | 67.940000 | 5361975 | 1420548 | 227416 | 1584568 | 0 | 0 | 0 | 299459 | 0 | exec | sql_executor.cc | 273 |
| end | 0.000325 | 0.012000 | 0.000000 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | execute | sql_select.cc | 608 |
| query end | 0.000115 | 0.004000 | 0.000000 | 11 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 4581 |
| waiting for handler commit | 0.000118 | 0.008000 | 0.000000 | 34 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ha_commit_trans | handler.cc | 1533 |
| closing tables | 0.000118 | 0.004000 | 0.000000 | 23 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_execute_command | sql_parse.cc | 4627 |
| freeing items | 0.000163 | 0.008000 | 0.000000 | 4 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_parse | sql_parse.cc | 5256 |
| cleaning up | 0.000125 | 0.008000 | 0.000000 | 5 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | dispatch_command | sql_parse.cc | 2108 |
+---+
19 rows in set, 1 warning (0.00 sec)
mysql> SHOW PROFILE ALL FOR QUERY 2;
+---+
| Status | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function | Source_file | Source_line |
+---+
| starting | 0.000364 | 0.012000 | 0.000000 | 41 | 24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NULL | NULL | NULL |
| Executing hook on transaction | 0.000137 | 0.008000 | 0.000000 | 40 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1100 |
| starting | 0.000135 | 0.004000 | 0.000000 | 21 | 6 | 0 | 16 | 0 | 0 | 0 | 2 | 0 | launch_hook_trans_begin | rpl_handler.cc | 1102 |
| checking permissions | 0.000124 | 0.008000 | 0.000000 | 26 | 7 | 0 | 16 | 0 | 0 | 0 | 2 | 0 | check_access | sql_authorization.cc | 1899 |
| checking permissions | 0.000139 | 0.008000 | 0.000000 | 19 | 9 | 32 | 24 | 0 | 0 | 0 | 4 | 0 | check_access | sql_authorization.cc | 1899 |
| Opening tables | 0.000152 | 0.004000 | 0.000000 | 30 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | open_tables | sql_base.cc | 5586 |
| init | 0.000125 | 0.004000 | 0.008000 | 25 | 19 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | execute | sql_select.cc | 555 |
| System lock | 0.000237 | 0.004000 | 0.004000 | 26 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | mysql_lock_tables | lock.cc | 332 |
| optimizing | 0.000150 | 0.008000 | 0.000000 | 24 | 7 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | optimize | sql_optim
一个猜测(等待评论中要求的更多信息)...
innodb_buffer_pool_size小于Data_length(参见SHOW TABLE SIZE offers),并且15K 行分散在表周围。这可能会导致命中磁盘而不是id在缓存(buffer_pool)中找到所需的内容。部分解决方案可能是增加 buffer_pool。但是,由于您使用的是云服务,因此增加内存的唯一方法可能是购买更大的 RAM。
另一种方法可能是强制或欺骗它使用不同的索引。但首先我需要查看其余索引,以及索引列的数据类型。
同时,我很想批评一下offers_clicks;分区键是什么?
另一种方法...这是一个常见的查询吗?好像是“一小时内发生了多少次点击?”
为什么需要看offers呢?难道它不能得到正确的答案SELECT COUNT(*) FROM offers_clicks WHERE date ...吗?
如果您确实需要JOIN,那么我们来谈谈构建和维护一个保存每小时计数的汇总表。
附带问题
FROM offers_clicks, offers
WHERE offers.id = offers_clicks.offer_id AND offers_clicks.date
Run Code Online (Sandbox Code Playgroud)
应该写成
FROM offers_clicks
JOIN offers ON offers.id = offers_clicks.offer_id -- how the tables relate
WHERE offers_clicks.date ...
Run Code Online (Sandbox Code Playgroud)
前者是旧的“commalist”风格JOIN,后者是首选语法。这两种语法生成相同的代码,因此这不是性能问题。前者看起来像“交叉连接”,但WHERE实际上并非如此。
而且……这只是三千五百九十九秒,并不是整整一个小时。大会将包括第一个午夜。
date > '2019-05-30 00:00:00'
and date < '2019-05-30 01:00:00'
Run Code Online (Sandbox Code Playgroud)
我更喜欢这种风格,因为它避免了闰年问题等:
date >= '2019-05-30 00:00:00'
and date < '2019-05-30 00:00:00' + INTERVAL 1 HOUR
Run Code Online (Sandbox Code Playgroud)
页面错误...
当程序(MySQL 或任何其他程序)分配的内存 (RAM) 多于操作系统愿意提供的内存时,某些页面(通常大小为 4KB)将被“交换”到磁盘。当程序引用某个页面时,就会发生“页面错误”。这是一个硬件中断,表示“紧急!RAM 中没有您请求的页面”。然后操作系统介入将该页面复制回 RAM,以便程序可以继续运行。
InnoDB 更愿意分配大量 RAM,然后在需要时可以自由访问所有内存。也就是说,InnoDB 假装页面错误永远不会发生。因此,页面错误会严重损害性能。不要将 MySQL 配置为使用“太多”RAM,否则会出现页面错误。
InnoDB确实需要将数据块和索引(以及表定义等)交换到 RAM 中,并为其他块释放空间。但 InnoDB 经过精心设计和编码,可以最大限度地减少这种情况。“缓冲池”是一大块 RAM,InnoDB 将其用作 16KB 块的 LRU(最近最少使用)缓存。 这可能是导致处理速度变慢的原因,而不是CPU,也不是页面错误等。(我正在等待一些信息,以便我可以确认这一点。)
第一次通常较慢,此时显示主要页面错误。后续查询通常很快。
这意味着存在 I/O 来引入查询所需的页面(主要页面错误)。然后后续查询可以在没有 I/O 的情况下运行(更快)。
而且,由于offers更大,因此会有更多的 I/O。然而,这与您所看到的相反。