How do I tune a 7-table-join MySQL count query where the tables contain 30,000+ rows?

ege*_*ari 9 mysql sql performance count

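I have an SQL query that counts the number of results of a complex query. The actual select query is very fast when limited to 20 results, but the count version takes about 4.5 seconds on my current tables, even after a lot of optimization.

If I remove the two joins and WHERE conditions on site tags and gallery tags, the query executes in 1.5 seconds. If I create 3 separate queries (one to select the paid sites, one to select the names, and one to pull everything together) I can get it down to 0.6 seconds, which is still not good enough. It would also force me to use a stored procedure, since I would otherwise have to run 4 queries in total from Hibernate.

For the query "as is", here is some information: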

Handler_read_key is 1746669.
Handler_read_next is 1546324.

The gallery table has 40,000 rows.
The site table has 900 rows.
The name table has 800 rows.
The tag table has 3560 rows.

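I'm new to MySQL and to tuning. I have indexes on:

  • the "term" column of the tag table
  • the "published" column of the gallery table
  • the "value" column of the name table

I am looking to get this query down to 0.1 milliseconds.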

SELECT count(distinct gallery.id)
from gallery gallery 
    inner join
        site site 
            on gallery.site_id = site.id 
    inner join
        site_to_tag p2t 
            on site.id = p2t.site_id 
    inner join
        tag site_tag 
            on p2t.tag_id = site_tag.id 
    inner join
        gallery_to_name g2mn 
            on gallery.id = g2mn.gallery_id 
    inner join
        name name 
            on g2mn.name_id = name.id 
    inner join
        gallery_to_tag g2t 
            on gallery.id = g2t.gallery_id 
    inner join
        tag tag 
            on g2t.tag_id = tag.id
where
    gallery.published = true and (
        name.value LIKE 'sometext%' or
        tag.term = 'sometext' or 
        site.`name` like 'sometext%' or
        site_tag.term = 'sometext'
    )

EXPLAIN output:

| id | select_type | table        | type   | possible_keys                                                     | key                | key_len | ref                                       | rows | Extra                              |
+----+-------------+--------------+--------+-------------------------------------------------------------------+--------------------+---------+-------------------------------------------+------+------------------------------------+
|  1 | SIMPLE      | site         | index  | PRIMARY,nameIndex                                                 | nameIndex          | 258     | NULL                                      |  950 | Using index; Using temporary       |
|  1 | SIMPLE      | gallery      | ref    | PRIMARY,publishedIndex,FKF44C775296EECE37,publishedSiteIdIndex    | FKF44C775296EECE37 | 9       | production.site.id                        |   20 | Using where                        |
|  1 | SIMPLE      | g2mn         | ref    | PRIMARY,FK3EFFD7F8AFAD7A5E,FK3EFFD7F832C04188                     | FK3EFFD7F8AFAD7A5E | 8       | production.gallery.id                     |    1 | Using index; Distinct              |
|  1 | SIMPLE      | name         | eq_ref | PRIMARY,valueIndex                                                | PRIMARY            | 8       | production.g2mn.name_id                   |    1 | Distinct                           |
|  1 | SIMPLE      | g2t          | ref    | PRIMARY,FK3DDB4D63AFAD7A5E,FK3DDB4D63E210FBA6                     | FK3DDB4D63AFAD7A5E | 8       | production.g2mn.gallery_id                |    2 | Using where; Using index; Distinct |
|  1 | SIMPLE      | tag          | eq_ref | PRIMARY,termIndex                                                 | PRIMARY            | 8       | production.g2t.tag_id                     |    1 | Distinct                           |
|  1 | SIMPLE      | p2t          | ref    | PRIMARY,FK29424AB796EECE37,FK29424AB7E210FBA6                     | PRIMARY            | 8       | production.gallery.site_id                |    3 | Using where; Using index; Distinct |
|  1 | SIMPLE      | site_tag     | eq_ref | PRIMARY,termIndex                                                 | PRIMARY            | 8       | production.p2t.tag_id                     |    1 | Using where; Distinct              |
+----+-------------+--------------+--------+-------------------------------------------------------------------+--------------------+---------+-------------------------------------------+------+------------------------------------+

Individual count speeds:

[SQL] select count(*) from gallery;
Affected rows: 0
Time: 0.014ms
Results: 40385

[SQL] 
select count(*) from gallery_to_name;
Affected rows: 0
Time: 0.012ms
Results: 35615

[SQL] 
select count(*) from gallery_to_tag;
Affected rows: 0
Time: 0.055ms
Results: 165104

[SQL] 
select count(*) from tag;
Affected rows: 0
Time: 0.002ms
Results: 3560    

[SQL] 
select count(*) from site;
Affected rows: 0
Time: 0.001ms
Results: 901

[SQL] 
select count(*) from site_to_tag;
Affected rows: 0
Time: 0.003ms
Results: 7026

Mik*_*ike 9

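I've included my test schema and a script to generate test data at the end of this post. I've used the SQL_NO_CACHE option to stop MySQL from caching the query results. This is only for testing and should be removed in the end.

This is similar to the idea proposed by Donnie, but I have tidied it up a little. If I have understood the joins correctly, there is no need to repeat all of the joins in every select, because each one is effectively independent of the others. The original WHERE clause requires gallery.published to be true and then lists 4 conditions joined by OR, so each condition can be checked by its own query. These are the four joins: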

gallery <--> gallery_to_name <--> name
gallery <--> gallery_to_tag <--> tag
gallery <--> site
gallery <--> site <--> site_to_tag <--> tag

Because gallery already contains site_id, the intermediate join through the site table is not needed in this case. The last join can therefore be reduced to:

gallery <--> site_to_tag <--> tag
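
To make the shortcut concrete, here is a minimal sketch that runs both join paths against a tiny in-memory SQLite database and compares the gallery ids they return. SQLite is only a stand-in for MySQL so the example can run anywhere, and the sample rows ('alpha', 'red' and so on) and the helper `gallery_ids` are made up for illustration. It assumes, as the foreign key in the schema guarantees, that every gallery.site_id refers to an existing site.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, term TEXT NOT NULL);
    CREATE TABLE gallery (
        id INTEGER PRIMARY KEY,
        site_id INTEGER NOT NULL REFERENCES site (id),
        published INTEGER NOT NULL
    );
    CREATE TABLE site_to_tag (
        site_id INTEGER NOT NULL REFERENCES site (id),
        tag_id INTEGER NOT NULL REFERENCES tag (id),
        PRIMARY KEY (site_id, tag_id)
    );
    -- Hypothetical sample data, not taken from the original post.
    INSERT INTO site VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO tag VALUES (1, 'red'), (2, 'blue');
    INSERT INTO gallery VALUES (1, 1, 1), (2, 2, 1), (3, 2, 1);
    INSERT INTO site_to_tag VALUES (1, 1), (2, 1), (2, 2);
""")

# Path 1: gallery -> site -> site_to_tag -> tag
VIA_SITE = """
    SELECT g.id FROM gallery AS g
    JOIN site AS s ON s.id = g.site_id
    JOIN site_to_tag AS s2t ON s2t.site_id = s.id
    JOIN tag AS st ON st.id = s2t.tag_id
    WHERE g.published = 1 AND st.term = ?
"""

# Path 2: gallery -> site_to_tag -> tag, joining directly on gallery.site_id
DIRECT = """
    SELECT g.id FROM gallery AS g
    JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id
    JOIN tag AS st ON st.id = s2t.tag_id
    WHERE g.published = 1 AND st.term = ?
"""

def gallery_ids(sql, term):
    """Return the sorted gallery ids matched by one of the two queries."""
    return sorted(row[0] for row in db.execute(sql, (term,)))

for term in ("red", "blue"):
    print(term, gallery_ids(VIA_SITE, term), gallery_ids(DIRECT, term))
```

Both paths return the same ids because the site row contributes nothing to the tag condition except its id, which gallery already carries.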

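Running each SELECT separately and combining the results with UNION is very fast. Because UNION removes duplicate ids, a plain COUNT over the combined result counts each gallery once. The results here assume the table structure and indexes shown at the end of this post: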

SELECT SQL_NO_CACHE COUNT(id) AS matches FROM (
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN site AS s ON s.id = g.site_id
    WHERE g.published = TRUE AND s.name LIKE '3GRD%')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
    INNER JOIN name AS n ON n.id = g2n.name_id
    WHERE g.published = TRUE AND n.value LIKE '3GRD%')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN gallery_to_tag  AS g2t ON g2t.gallery_id = g.id
    INNER JOIN tag AS gt  ON gt.id = g2t.tag_id
    WHERE g.published = TRUE AND gt.term = '3GRD')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id
    INNER JOIN tag AS st  ON st.id = s2t.tag_id
    WHERE g.published = TRUE AND st.term = '3GRD')
) AS totals;

+---------+
| matches |
+---------+
|      99 |
+---------+
1 row in set (0.00 sec)

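The speed varies with the search criteria. In the following example a different search value is used for each table, and the LIKE operator has more work to do because each table now has many more potential matches: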

SELECT SQL_NO_CACHE COUNT(id) AS matches FROM (
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN site AS s ON s.id = g.site_id
    WHERE g.published = TRUE AND s.name LIKE '3H%')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
    INNER JOIN name AS n ON n.id = g2n.name_id
    WHERE g.published = TRUE AND n.value LIKE '3G%')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN gallery_to_tag  AS g2t ON g2t.gallery_id = g.id
    INNER JOIN tag AS gt  ON gt.id = g2t.tag_id
    WHERE g.published = TRUE AND gt.term = '3IDP')
UNION
   (SELECT g.id
    FROM gallery AS g
    INNER JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id
    INNER JOIN tag AS st  ON st.id = s2t.tag_id
    WHERE g.published = TRUE AND st.term = '3OJX')
) AS totals;

+---------+
| matches |
+---------+
|   12505 |
+---------+
1 row in set (0.24 sec)

These results compare favourably with the query that uses multiple joins:

SELECT SQL_NO_CACHE COUNT(DISTINCT g.id) AS matches
FROM gallery AS g
INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
INNER JOIN name            AS n   ON n.id = g2n.name_id
INNER JOIN gallery_to_tag  AS g2t ON g2t.gallery_id = g.id
INNER JOIN tag             AS gt  ON gt.id = g2t.tag_id
INNER JOIN site            AS s   ON s.id = g.site_id
INNER JOIN site_to_tag     AS s2t ON s2t.site_id = s.id
INNER JOIN tag             AS st  ON st.id = s2t.tag_id
WHERE g.published = TRUE AND (
    gt.term = '3GRD' OR
    st.term = '3GRD' OR
    n.value LIKE '3GRD%' OR
    s.name LIKE '3GRD%');

+---------+
| matches |
+---------+
|      99 |
+---------+
1 row in set (2.62 sec)

SELECT SQL_NO_CACHE COUNT(DISTINCT g.id) AS matches
FROM gallery AS g
INNER JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
INNER JOIN name            AS n   ON n.id = g2n.name_id
INNER JOIN gallery_to_tag  AS g2t ON g2t.gallery_id = g.id
INNER JOIN tag             AS gt  ON gt.id = g2t.tag_id
INNER JOIN site            AS s   ON s.id = g.site_id
INNER JOIN site_to_tag     AS s2t ON s2t.site_id = s.id
INNER JOIN tag             AS st  ON st.id = s2t.tag_id
WHERE g.published = TRUE AND (
    gt.term = '3IDP' OR
    st.term = '3OJX' OR
    n.value LIKE '3G%' OR
    s.name LIKE '3H%');

+---------+
| matches |
+---------+
|   12505 |
+---------+
1 row in set (3.17 sec)
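
One point worth checking before swapping the queries: the two forms return the same count on this test data because every gallery has at least one name and one tag, and every site has at least one tag. With inner joins, a gallery that lacks any of those link rows drops out of the multi-join query entirely, while the UNION form still counts it if another condition matches. The sketch below illustrates this on a tiny in-memory SQLite database. SQLite is only a stand-in for MySQL, and the sample rows and the helper `count` are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, term TEXT);
    CREATE TABLE name (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE gallery (id INTEGER PRIMARY KEY, site_id INTEGER, published INTEGER);
    CREATE TABLE site_to_tag (site_id INTEGER, tag_id INTEGER, PRIMARY KEY (site_id, tag_id));
    CREATE TABLE gallery_to_name (gallery_id INTEGER, name_id INTEGER, PRIMARY KEY (gallery_id, name_id));
    CREATE TABLE gallery_to_tag (gallery_id INTEGER, tag_id INTEGER, PRIMARY KEY (gallery_id, tag_id));
    INSERT INTO site VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO tag VALUES (1, 'red'), (2, 'blue');
    INSERT INTO name VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO gallery VALUES (1, 1, 1), (2, 2, 1), (3, 2, 1), (4, 1, 0);
    INSERT INTO site_to_tag VALUES (1, 1), (2, 2);
    INSERT INTO gallery_to_name VALUES (1, 1), (2, 2), (4, 1);  -- gallery 3 has no name
    INSERT INTO gallery_to_tag VALUES (1, 2), (2, 1), (3, 1), (4, 1);
""")

# One SELECT per OR condition, combined with UNION (which de-duplicates ids).
UNION_COUNT = """
    SELECT COUNT(*) FROM (
        SELECT g.id FROM gallery AS g
        JOIN site AS s ON s.id = g.site_id
        WHERE g.published = 1 AND s.name LIKE :q || '%'
      UNION
        SELECT g.id FROM gallery AS g
        JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
        JOIN name AS n ON n.id = g2n.name_id
        WHERE g.published = 1 AND n.value LIKE :q || '%'
      UNION
        SELECT g.id FROM gallery AS g
        JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id
        JOIN tag AS gt ON gt.id = g2t.tag_id
        WHERE g.published = 1 AND gt.term = :q
      UNION
        SELECT g.id FROM gallery AS g
        JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id
        JOIN tag AS st ON st.id = s2t.tag_id
        WHERE g.published = 1 AND st.term = :q
    ) AS totals
"""

# The single query with every join and the four conditions joined by OR.
JOIN_COUNT = """
    SELECT COUNT(DISTINCT g.id)
    FROM gallery AS g
    JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
    JOIN name AS n ON n.id = g2n.name_id
    JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id
    JOIN tag AS gt ON gt.id = g2t.tag_id
    JOIN site AS s ON s.id = g.site_id
    JOIN site_to_tag AS s2t ON s2t.site_id = s.id
    JOIN tag AS st ON st.id = s2t.tag_id
    WHERE g.published = 1 AND (
        gt.term = :q OR st.term = :q OR
        n.value LIKE :q || '%' OR s.name LIKE :q || '%')
"""

def count(sql, term):
    """Run one of the count queries for a search term."""
    return db.execute(sql, {"q": term}).fetchone()[0]

# Gallery 3 is tagged 'red' but has no name row, so only the UNION form counts it.
before = (count(UNION_COUNT, "red"), count(JOIN_COUNT, "red"))

# Once gallery 3 has a name, the two forms agree.
db.execute("INSERT INTO gallery_to_name VALUES (3, 1)")
after = (count(UNION_COUNT, "red"), count(JOIN_COUNT, "red"))

print("before:", before, "after:", after)
```

If galleries without names or tags can exist in the real data, the UNION form is probably the count you actually want, but it is worth confirming against the page of results the count is supposed to describe.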

SCHEMA
The indexes on the id columns, plus those on site.name, name.value and tag.term, are important:

DROP SCHEMA IF EXISTS `egervari`;
CREATE SCHEMA IF NOT EXISTS `egervari`;
USE `egervari`;

-- -----------------------------------------------------
-- Table `site`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `site` ;
CREATE  TABLE IF NOT EXISTS `site` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `name` VARCHAR(255) NOT NULL ,
  INDEX `name` (`name` ASC) ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `gallery`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `gallery` ;
CREATE  TABLE IF NOT EXISTS `gallery` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `site_id` INT UNSIGNED NOT NULL ,
  `published` TINYINT(1) NOT NULL DEFAULT 0 ,
  PRIMARY KEY (`id`) ,
  INDEX `fk_gallery_site` (`site_id` ASC) ,
  CONSTRAINT `fk_gallery_site`
    FOREIGN KEY (`site_id` )
    REFERENCES `site` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `name`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `name` ;
CREATE  TABLE IF NOT EXISTS `name` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `value` VARCHAR(255) NOT NULL ,
  INDEX `value` (`value` ASC) ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `tag`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `tag` ;
CREATE  TABLE IF NOT EXISTS `tag` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
  `term` VARCHAR(255) NOT NULL ,
  INDEX `term` (`term` ASC) ,
  PRIMARY KEY (`id`) )
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `gallery_to_name`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `gallery_to_name` ;
CREATE  TABLE IF NOT EXISTS `gallery_to_name` (
  `gallery_id` INT UNSIGNED NOT NULL ,
  `name_id` INT UNSIGNED NOT NULL ,
  PRIMARY KEY (`gallery_id`, `name_id`) ,
  INDEX `fk_gallery_to_name_gallery` (`gallery_id` ASC) ,
  INDEX `fk_gallery_to_name_name` (`name_id` ASC) ,
  CONSTRAINT `fk_gallery_to_name_gallery`
    FOREIGN KEY (`gallery_id` )
    REFERENCES `gallery` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT `fk_gallery_to_name_name`
    FOREIGN KEY (`name_id` )
    REFERENCES `name` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `gallery_to_tag`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `gallery_to_tag` ;
CREATE  TABLE IF NOT EXISTS `gallery_to_tag` (
  `gallery_id` INT UNSIGNED NOT NULL ,
  `tag_id` INT UNSIGNED NOT NULL ,
  PRIMARY KEY (`gallery_id`, `tag_id`) ,
  INDEX `fk_gallery_to_tag_gallery` (`gallery_id` ASC) ,
  INDEX `fk_gallery_to_tag_tag` (`tag_id` ASC) ,
  CONSTRAINT `fk_gallery_to_tag_gallery`
    FOREIGN KEY (`gallery_id` )
    REFERENCES `gallery` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT `fk_gallery_to_tag_tag`
    FOREIGN KEY (`tag_id` )
    REFERENCES `tag` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `site_to_tag`
-- -----------------------------------------------------

DROP TABLE IF EXISTS `site_to_tag` ;
CREATE  TABLE IF NOT EXISTS `site_to_tag` (
  `site_id` INT UNSIGNED NOT NULL ,
  `tag_id` INT UNSIGNED NOT NULL ,
  PRIMARY KEY (`site_id`, `tag_id`) ,
  INDEX `fk_site_to_tag_site` (`site_id` ASC) ,
  INDEX `fk_site_to_tag_tag` (`tag_id` ASC) ,
  CONSTRAINT `fk_site_to_tag_site`
    FOREIGN KEY (`site_id` )
    REFERENCES `site` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT `fk_site_to_tag_tag`
    FOREIGN KEY (`tag_id` )
    REFERENCES `tag` (`id` )
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;

TEST DATA
This populates site with 900 rows, tag with 3560 rows, name with 800 rows and gallery with 40,000 rows, and inserts entries into the link tables:

DELIMITER //
DROP PROCEDURE IF EXISTS populate//
CREATE PROCEDURE populate()
BEGIN
    DECLARE i INT DEFAULT 0;

    WHILE i < 900 DO
        INSERT INTO site (name) VALUES (CONV(i + 1 * 10000, 20, 36));
        SET i = i + 1;
    END WHILE;

    SET i = 0;
    WHILE i < 3560 DO
        INSERT INTO tag (term) VALUES (CONV(i + 1 * 10000, 20, 36));
        INSERT INTO site_to_tag (site_id, tag_id) VALUES ( (i MOD 900) + 1, i + 1 );
        SET i = i + 1;
    END WHILE;

    SET i = 0;
    WHILE i < 800 DO
        INSERT INTO name (value) VALUES (CONV(i + 1 * 10000, 20, 36));
        SET i = i + 1;
    END WHILE;

    SET i = 0;
    WHILE i < 40000 DO    
        INSERT INTO gallery (site_id, published) VALUES ( (i MOD 900) + 1, i MOD 2 );
        INSERT INTO gallery_to_name (gallery_id, name_id) VALUES ( i + 1, (i MOD 800) + 1 );
        INSERT INTO gallery_to_tag (gallery_id, tag_id) VALUES ( i + 1, (i MOD 3560) + 1 );
        SET i = i + 1;
    END WHILE;
END;
//
DELIMITER ;
CALL populate();
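
For readers without a MySQL instance to hand, the following is a rough Python sketch that mirrors populate() in an in-memory SQLite database and re-runs the UNION count for '3GRD'. It is an approximation under stated assumptions: SQLite stands in for MySQL, the index names and the helpers `conv_20_to_36`, `label`, `build` and `union_count` are invented here, and timings will not be comparable to the figures above. It also spells out what `CONV(i + 1 * 10000, 20, 36)` does: multiplication binds first, so the argument is i + 10000, whose decimal digits are then read as a base-20 number and written in base 36.

```python
import sqlite3

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def conv_20_to_36(n):
    """Mimic MySQL CONV(n, 20, 36): read n's decimal digits as base 20, write in base 36."""
    value = int(str(n), 20)
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = DIGITS[rem] + out
    return out or "0"

def label(i):
    # `i + 1 * 10000` parses as i + 10000, not (i + 1) * 10000.
    return conv_20_to_36(i + 1 * 10000)

def build():
    """Create the seven tables and fill them the way populate() does."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, term TEXT NOT NULL);
        CREATE TABLE name (id INTEGER PRIMARY KEY, value TEXT NOT NULL);
        CREATE TABLE gallery (id INTEGER PRIMARY KEY, site_id INTEGER NOT NULL, published INTEGER NOT NULL);
        CREATE TABLE site_to_tag (site_id INTEGER, tag_id INTEGER, PRIMARY KEY (site_id, tag_id));
        CREATE TABLE gallery_to_name (gallery_id INTEGER, name_id INTEGER, PRIMARY KEY (gallery_id, name_id));
        CREATE TABLE gallery_to_tag (gallery_id INTEGER, tag_id INTEGER, PRIMARY KEY (gallery_id, tag_id));
        CREATE INDEX idx_site_name ON site (name);
        CREATE INDEX idx_tag_term ON tag (term);
        CREATE INDEX idx_name_value ON name (value);
        CREATE INDEX idx_gallery_site ON gallery (site_id);
        CREATE INDEX idx_s2t_tag ON site_to_tag (tag_id);
        CREATE INDEX idx_g2n_name ON gallery_to_name (name_id);
        CREATE INDEX idx_g2t_tag ON gallery_to_tag (tag_id);
    """)
    # Ids are written explicitly as i + 1 to match AUTO_INCREMENT starting at 1.
    db.executemany("INSERT INTO site VALUES (?, ?)", [(i + 1, label(i)) for i in range(900)])
    db.executemany("INSERT INTO tag VALUES (?, ?)", [(i + 1, label(i)) for i in range(3560)])
    db.executemany("INSERT INTO site_to_tag VALUES (?, ?)", [(i % 900 + 1, i + 1) for i in range(3560)])
    db.executemany("INSERT INTO name VALUES (?, ?)", [(i + 1, label(i)) for i in range(800)])
    db.executemany("INSERT INTO gallery VALUES (?, ?, ?)", [(i + 1, i % 900 + 1, i % 2) for i in range(40000)])
    db.executemany("INSERT INTO gallery_to_name VALUES (?, ?)", [(i + 1, i % 800 + 1) for i in range(40000)])
    db.executemany("INSERT INTO gallery_to_tag VALUES (?, ?)", [(i + 1, i % 3560 + 1) for i in range(40000)])
    return db

# SQLite does not accept parentheses around the SELECTs of a compound query.
UNION_COUNT = """
    SELECT COUNT(*) FROM (
        SELECT g.id FROM gallery AS g
        JOIN site AS s ON s.id = g.site_id
        WHERE g.published = 1 AND s.name LIKE :q || '%'
      UNION
        SELECT g.id FROM gallery AS g
        JOIN gallery_to_name AS g2n ON g2n.gallery_id = g.id
        JOIN name AS n ON n.id = g2n.name_id
        WHERE g.published = 1 AND n.value LIKE :q || '%'
      UNION
        SELECT g.id FROM gallery AS g
        JOIN gallery_to_tag AS g2t ON g2t.gallery_id = g.id
        JOIN tag AS gt ON gt.id = g2t.tag_id
        WHERE g.published = 1 AND gt.term = :q
      UNION
        SELECT g.id FROM gallery AS g
        JOIN site_to_tag AS s2t ON s2t.site_id = g.site_id
        JOIN tag AS st ON st.id = s2t.tag_id
        WHERE g.published = 1 AND st.term = :q
    ) AS totals
"""

def union_count(db, term):
    """Count distinct published galleries matching the term on any of the four paths."""
    return db.execute(UNION_COUNT, {"q": term}).fetchone()[0]

db = build()
print(label(0), label(449), union_count(db, "3GRD"))
```

Here label(449) is '3GRD', which is why that search value hits exactly one site, one tag and one name in the generated data, and the count should come out at the same 99 matches reported for the MySQL run.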