MySQL subquery slows down dramatically, but the queries run fine independently

Ste*_*wie 9 mysql innodb performance optimization subquery query-performance

Query 1:

select distinct email from mybigtable where account_id=345

Takes 0.1 seconds.

Query 2:

Select count(*) as total from mybigtable where account_id=123 and email IN (<include all from above result>)

Takes 0.2 seconds.

Query 3:

Select count(*) as total from mybigtable where account_id=123 and email IN (select distinct email from mybigtable where account_id=345)

Takes 22 minutes, 90% of which is spent in the "preparing" state. Why does it take so much time?

The table is InnoDB with 3.2 million rows, on MySQL 5.0.

Rol*_*DBA 10

In Query 3, you are basically executing the subquery against mybigtable once for every row of mybigtable itself.
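One way to see this for yourself is to ask MySQL for the plan of Query 3. This is only a sketch: on a 5.0 server the inner SELECT would be expected to show up with a select_type of DEPENDENT SUBQUERY, meaning it is evaluated per outer row, but the exact output depends on your version and indexes.

```sql
-- Show the execution plan for Query 3 instead of running it.
EXPLAIN
SELECT COUNT(*) AS total
FROM mybigtable
WHERE account_id = 123
  AND email IN (SELECT DISTINCT email FROM mybigtable WHERE account_id = 345);
```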

To avoid this, you need to make two major changes:

MAJOR CHANGE #1: Refactor the query

Here is your original query:

Select count(*) as total from mybigtable
where account_id=123 and email IN
(select distinct email from mybigtable where account_id=345)

You could try:

select count(*) EmailCount from
(
    select tbl123.email from
    (select email from mybigtable where account_id=123) tbl123
    INNER JOIN
    (select distinct email from mybigtable where account_id=345) tbl345
    using (email)
) A;

Or, for a count per email:

select email,count(*) EmailCount from
(
    select tbl123.email from
    (select email from mybigtable where account_id=123) tbl123
    INNER JOIN
    (select distinct email from mybigtable where account_id=345) tbl345
    using (email)
) A group by email;
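The JOIN rewrite can be sanity-checked against the IN form on a tiny table before trusting it on 3.2 million rows. The sketch below uses a hypothetical table join_demo with made-up rows (neither comes from the question). It also shows why the DISTINCT on the account 345 side matters: without it, duplicate emails there would multiply the joined rows.

```sql
-- Hypothetical demo table and data, invented for illustration only.
CREATE TABLE join_demo (account_id INT NOT NULL, email VARCHAR(100) NOT NULL);
INSERT INTO join_demo (account_id, email) VALUES
    (123, 'a@example.com'), (123, 'a@example.com'),
    (123, 'b@example.com'), (123, 'c@example.com'),
    (345, 'a@example.com'), (345, 'a@example.com'),
    (345, 'c@example.com'), (345, 'd@example.com');

-- Original IN form: account 123 has rows a, a, c matching account 345, so this returns 3.
SELECT COUNT(*) AS total FROM join_demo
WHERE account_id = 123
  AND email IN (SELECT DISTINCT email FROM join_demo WHERE account_id = 345);

-- JOIN form: also returns 3, because tbl345 holds each email only once.
SELECT COUNT(*) AS EmailCount FROM
(
    SELECT tbl123.email FROM
    (SELECT email FROM join_demo WHERE account_id = 123) tbl123
    INNER JOIN
    (SELECT DISTINCT email FROM join_demo WHERE account_id = 345) tbl345
    USING (email)
) A;

-- Clean up the demo table.
DROP TABLE join_demo;
```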

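MAJOR CHANGE #2: Proper indexing

I think you already have this, since Query 1 and Query 2 run fast. Make sure you have a compound index on (account_id,email). Run SHOW CREATE TABLE mybigtable\G and check that one is there. If you do not have it, or you are not sure, create the index anyway: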

ALTER TABLE mybigtable ADD INDEX account_id_email_ndx (account_id,email);
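As a rough way to confirm the index is there and being used, something along these lines should work. SHOW INDEX lists one row per indexed column, so look for an index whose Seq_in_index 1 is account_id and whose Seq_in_index 2 is email. What EXPLAIN reports will vary with your data and version, so treat the comments as expectations, not guarantees.

```sql
-- List the indexes on the table; look for one on (account_id, email) in that order.
SHOW INDEX FROM mybigtable;

-- With such an index in place, the plan for Query 1 would be expected to name it
-- in the "key" column, and may show "Using index" under Extra.
EXPLAIN SELECT DISTINCT email FROM mybigtable WHERE account_id = 345;
```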

UPDATE 2012-03-07 13:26 EST

If you want to do a NOT IN(), change the INNER JOIN to a LEFT JOIN and check that the right side is NULL, like this:

select count(*) EmailCount from
(
    select tbl123.email from
    (select email from mybigtable where account_id=123) tbl123
    LEFT JOIN
    (select distinct email from mybigtable where account_id=345) tbl345
    using (email)
    WHERE tbl345.email IS NULL
) A;
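One caveat worth checking on your own data: the LEFT JOIN form and NOT IN() only agree when email is never NULL. The sketch below uses a hypothetical table notin_demo with invented rows to show both the matching case and the NULL case. If email is declared NOT NULL in mybigtable, the difference does not arise.

```sql
-- Hypothetical demo table and data, invented for illustration only.
CREATE TABLE notin_demo (account_id INT NOT NULL, email VARCHAR(100) NULL);
INSERT INTO notin_demo (account_id, email) VALUES
    (123, 'a@example.com'),
    (123, 'b@example.com'),
    (345, 'a@example.com');

-- NOT IN form: only b@example.com is missing from account 345, so this returns 1.
SELECT COUNT(*) AS EmailCount FROM notin_demo
WHERE account_id = 123
  AND email NOT IN (SELECT email FROM notin_demo WHERE account_id = 345);

-- LEFT JOIN form from the answer: also returns 1.
SELECT COUNT(*) AS EmailCount FROM
(
    SELECT tbl123.email FROM
    (SELECT email FROM notin_demo WHERE account_id = 123) tbl123
    LEFT JOIN
    (SELECT DISTINCT email FROM notin_demo WHERE account_id = 345) tbl345
    USING (email)
    WHERE tbl345.email IS NULL
) A;

-- Now give account 345 a NULL email and rerun the two SELECTs above.
-- The NOT IN form drops to 0 (comparing against NULL is never true),
-- while the LEFT JOIN form still returns 1.
INSERT INTO notin_demo (account_id, email) VALUES (345, NULL);
```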

UPDATE 2012-03-07 14:13 EST

Please read these two links on doing JOINs

Here is a great YouTube video where I learned to refactor queries, and the book it is based on


Aar*_*own 10

In MySQL, the subselect in an IN clause is re-executed for every row of the outer query, which makes the work O(n^2). Long story short: do not use IN (SELECT).