dre*_*010 6 mysql performance subquery
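I am trying to select a range of data from a history table for multiple devices (unique serial numbers), and I would like to know why the timings of the following queries differ so much:
Basically I am using an IN clause to specify the items I want data for. If I hard-code the items in the IN clause, the query is fast; if I use a subquery or a join to select the items, performance is poor.
This query completes in 0.15 seconds and returns 7382 rows.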
SELECT `readings`.* FROM `readings`
WHERE
(SerialNumber IN ('091146000121', *snip 25*, '091146000556'))
AND (readings.time >= 1325404800)
AND (readings.time < 1326317400)
ORDER BY `time` ASC
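The same query, rewritten to fetch the serial numbers with a subquery, takes more than 30 seconds and seems to spend most of that time in the Preparing state. It returns the same data as the first query.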
SELECT `readings`.* FROM `readings`
WHERE
(SerialNumber IN (SELECT `boards`.`id` AS `SerialNumber` FROM `boards` WHERE (siteId = '1')))
AND (readings.time >= 1325404800)
AND (readings.time < 1326317400)
ORDER BY `time` ASC
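The subquery returns the same values that are listed in the first query, but as mentioned it takes far longer to run. Aren't the two functionally equivalent?
Here is the EXPLAIN output for the two queries: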
+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra                       |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------+
|  1 | SIMPLE      | readings | range | PRIMARY,time  | PRIMARY | 22      | NULL | 7339 | Using where; Using filesort |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------------+

+----+--------------------+----------+-----------------+----------------+---------+---------+------+---------+-------------+
| id | select_type        | table    | type            | possible_keys  | key     | key_len | ref  | rows    | Extra       |
+----+--------------------+----------+-----------------+----------------+---------+---------+------+---------+-------------+
|  1 | PRIMARY            | readings | range           | time           | time    | 4       | NULL | 6353234 | Using where |
|  2 | DEPENDENT SUBQUERY | boards   | unique_subquery | PRIMARY,siteId | PRIMARY | 18      | func | 1       | Using where |
+----+--------------------+----------+-----------------+----------------+---------+---------+------+---------+-------------+
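For some reason, the query with the subselect does not use the primary key. I tried USE INDEX, but that actually made it take even longer.
The readings table has a PRIMARY KEY on (SerialNumber, time), plus an index on time.
The boards table has a primary key on id (the serial number) and an index on siteId.
The MySQL version I am using is 5.5.8-log MySQL Community Server (GPL).
I am just wondering why the two queries do not perform similarly. Thanks.
Update: here are the CREATE TABLE statements: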
mysql> SHOW CREATE TABLE readings\G
*************************** 1. row ***************************
Table: readings
Create Table: CREATE TABLE `readings` (
`time` int(11) NOT NULL,
`boxsn` varchar(16) NOT NULL,
`rev` varchar(16) NOT NULL,
`schema` tinyint(3) unsigned NOT NULL,
`interval` smallint(5) unsigned NOT NULL,
`relay` tinyint(4) NOT NULL,
`inputV` decimal(10,6) NOT NULL,
`inputA` decimal(10,6) NOT NULL,
`outputV` decimal(10,6) NOT NULL,
`outputA` decimal(10,6) NOT NULL,
`phase` tinyint(4) NOT NULL,
`outputVA` decimal(10,6) NOT NULL,
`watts` decimal(10,6) NOT NULL DEFAULT '0.000000',
`var` decimal(10,6) NOT NULL,
`kiloVAHours` decimal(9,9) DEFAULT '0.000000000',
`kilowattHours` decimal(9,9) NOT NULL,
`kilovarHours` decimal(9,9) NOT NULL,
PRIMARY KEY (`boxsn`,`time`),
KEY `time` (`time`),
KEY `boxsn_time_ndx` (`boxsn`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> SHOW CREATE TABLE boards\G
*************************** 1. row ***************************
Table: boards
Create Table: CREATE TABLE `boards` (
`id` varchar(16) NOT NULL,
`siteId` int(11) NOT NULL,
`groupId` int(11) DEFAULT '0',
`lastReport` int(11) DEFAULT NULL,
`lastIp` varchar(15) DEFAULT '0.0.0.0',
`label` varchar(24) DEFAULT '',
PRIMARY KEY (`id`),
KEY `siteId` (`siteId`),
KEY `siteId_id_ndx` (`siteId`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
Refactor the query as follows:
SELECT
readings.*
FROM
(
SELECT boxsn FROM readings
WHERE (time >= 1325404800)
AND (time < 1326317400)
ORDER BY `time` ASC
) readings_keys
LEFT JOIN
(
SELECT id AS boxsn FROM boards WHERE siteId = '1'
) boards
USING (boxsn)
LEFT JOIN readings
USING (boxsn)
;
Make sure you have the following indexes:
ALTER TABLE boards ADD INDEX siteId_id_ndx (siteId,id);
ALTER TABLE readings ADD INDEX time_boxsn_ndx (time,boxsn);
You can drop the other index, which duplicates the primary key (boxsn, time):
ALTER TABLE readings DROP INDEX boxsn_time_ndx;
As the tables grow, you should see a significant performance improvement.
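One way to check whether these indexes are actually picked up is to look at the plan for the key-only range scan. This is only a sketch: it reuses the time range from the question and assumes the `time_boxsn_ndx` index above has been created. The plan you see depends on your data and MySQL version.

```sql
-- List the indexes currently defined on readings
SHOW INDEX FROM readings;

-- Plan for the key-only range scan used as the innermost subquery
EXPLAIN
SELECT boxsn
FROM readings
WHERE time >= 1325404800
  AND time < 1326317400;
```

If the `key` column names an index that contains both `time` and `boxsn`, the lookup can be served from the index without reading full rows.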
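In your case, the first query looks up SerialNumber for each row of readings against an in-memory list of values, whereas the second performs the SerialNumber lookup against a table for every row of readings. I refactored the query once more to make sure the keys from readings and the keys from boards are combined properly before any data is retrieved from the readings table: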
SELECT
readings.*
FROM
(
SELECT A.* FROM
(
SELECT boxsn FROM readings
WHERE (time >= 1325404800)
AND (time < 1326317400)
ORDER BY `time` ASC
) A
LEFT JOIN
(
SELECT id AS boxsn
FROM boards
WHERE siteId = '1'
) B
USING (boxsn)
WHERE B.boxsn IS NOT NULL
) readings_keys
LEFT JOIN readings
USING (boxsn)
;
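For comparison, here is a sketch of what the per-row lookup amounts to, followed by a plain join that expresses the same filter. Both are illustrations rather than part of the refactoring above. They use the `boxsn` column from the `SHOW CREATE TABLE` output in place of `SerialNumber`, and the first statement only approximates how MySQL 5.5 treats `IN (subquery)`, which is what the `DEPENDENT SUBQUERY` row in the EXPLAIN output reflects.

```sql
-- Approximate correlated form of the slow query:
-- the boards lookup is evaluated once per candidate row of readings.
SELECT r.*
FROM readings AS r
WHERE r.time >= 1325404800
  AND r.time < 1326317400
  AND EXISTS (
        SELECT 1
        FROM boards AS b
        WHERE b.siteId = 1
          AND b.id = r.boxsn
      )
ORDER BY r.time ASC;

-- Join form: pick the boards for the site first,
-- then fetch their readings in the time range.
SELECT r.*
FROM boards AS b
INNER JOIN readings AS r
        ON r.boxsn = b.id
WHERE b.siteId = 1
  AND r.time >= 1325404800
  AND r.time < 1326317400
ORDER BY r.time ASC;
```

Because `boards.id` is the primary key, the join cannot duplicate rows from `readings`, so it should return the same rows as the `IN` version. Whether it is faster on your data is something to confirm with EXPLAIN.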