fgu*_*len 5 mysql sql indexing select
I have this table (simplified):
create table completions (
id int(11) not null auto_increment,
completed_at datetime default null,
is_mongo_synced tinyint(1) default '0',
primary key (id),
key index_completions_on_completed_at_and_is_mongo_synced_and_id (completed_at,is_mongo_synced,id)
) engine=innodb auto_increment=4785424 default charset=utf8 collate=utf8_unicode_ci;
Size:
select count(*) from completions; -- => 4817574
Now I am trying to run this query:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
order by completions.id asc limit 10;
It takes 9 minutes.
I see that my composite index is not used; EXPLAIN EXTENDED returns this:
id: 1
select_type: SIMPLE
table: completions
type: index
possible_keys: index_completions_on_completed_at_and_is_mongo_synced_and_id
key: PRIMARY
key_len: 4
ref: NULL
rows: 20
filtered: 11616415.00
Extra: Using where
If I force the index:
select completions.*
from completions
force index(index_completions_on_completed_at_and_is_mongo_synced_and_id)
where
(completed_at is not null)
and completions.is_mongo_synced = 0
order by completions.id asc limit 10;
It takes 1.22 s, which is much better. EXPLAIN EXTENDED returns:
id: 1
select_type: SIMPLE
table: completions
type: range
possible_keys: index_completions_on_completed_at_and_is_mongo_synced_and_id
key: index_completions_on_completed_at_and_is_mongo_synced_and_id
key_len: 6
ref: null
rows: 2323334
filtered: 100
Extra: Using index condition; Using filesort
Now, if I narrow the query by completions.id:
select completions.*
from completions
force index(index_completions_on_completed_at_and_is_mongo_synced_and_id)
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
It takes 1.31 s, still fine. EXPLAIN EXTENDED returns:
id: 1
select_type: SIMPLE
table: completions
type: range
possible_keys: index_completions_on_completed_at_and_is_mongo_synced_and_id
key: index_completions_on_completed_at_and_is_mongo_synced_and_id
key_len: 6
ref: null
rows: 2323407
filtered: 100
Extra: Using index condition; Using filesort
The point is that if I do not force the index for the last query:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
It takes 85 ms (note: ms, not s). EXPLAIN EXTENDED returns:
id: 1
select_type: SIMPLE
table: completions
type: range
possible_keys: PRIMARY,index_completions_on_completed_at_and_is_mongo_synced_and_id
key: PRIMARY
key_len: 4
ref: null
rows: 2323451
filtered: 100
Extra: Using where
Not only does this drive me crazy, but the performance of the last query also changes drastically with a small change in the filter value:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 1600000
order by completions.id asc limit 10;
It takes 13 s.
Things I don't understand:
Query A:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
85 ms
Query B:
select completions.*
from completions
force index(index_completions_on_completed_at_and_is_mongo_synced_and_id)
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
1.31 s
Query A:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
85 ms
Query B:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 1600000
order by completions.id asc limit 10;
13 s
Index:
key index_completions_on_completed_at_and_is_mongo_synced_and_id (completed_at,is_mongo_synced,id),
Query:
select completions.*
from completions
force index(index_completions_on_completed_at_and_is_mongo_synced_and_id)
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
order by completions.id asc limit 10;
More data, as requested in the comments
Row counts by is_mongo_synced value:
select
completions.is_mongo_synced,
count(*)
from completions
group by completions.is_mongo_synced;
Result:
[
{
"is_mongo_synced":0,
"count(*)":2731921
},
{
"is_mongo_synced":1,
"count(*)":2087869
}
]
Queries without order by:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
limit 10;
544 ms
select completions.*
from completions
force index(index_completions_on_completed_at_and_is_mongo_synced_and_id)
where
(completed_at is not null)
and completions.is_mongo_synced = 0
and completions.id > 2000000
limit 10;
314 ms
However, I need the ordering anyway, because I am scanning the table batch by batch.
Your question is fairly involved. However, for your first query:
select completions.*
from completions
where completed_at is not null and
completions.is_mongo_synced = 0
order by completions.id asc
limit 10;
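The best index for this query is on (is_mongo_synced, completed_at). There may be other ways to write the query, but in the index you are forcing, the columns are not in an optimal order.
The performance difference for the second query is probably because the data is actually being sorted. A few hundred thousand extra rows could affect the sort time. The dependence on the value of id is probably why the index is not being used. If you change the index to (is_mongo_synced, id, completed_at), it is more likely that the index will be used.
MySQL has good documentation on composite indexes. You may want to review it.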
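As a rough sketch of the suggestion above, the index could be added like this. The index name idx_completions_synced_id_completed_at is a placeholder made up for illustration, and building an index on a table of this size may take a while, so treat this as an illustration rather than a tested migration:

```sql
-- Hypothetical index name; the column order follows the suggestion above:
-- equality column first, then the ORDER BY column, then the remaining filter.
ALTER TABLE completions
  ADD INDEX idx_completions_synced_id_completed_at (is_mongo_synced, id, completed_at);

-- Check which index the optimizer picks for the original query.
EXPLAIN
SELECT completions.*
FROM completions
WHERE completed_at IS NOT NULL
  AND completions.is_mongo_synced = 0
ORDER BY completions.id ASC
LIMIT 10;
```

If the plan lists the new index under key and Extra no longer shows "Using filesort", that would suggest rows are being read in id order and the LIMIT can stop early.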
After adding the index:
KEY `index_completions_on_is_mongo_synced_and_id_and_completed_at` (`is_mongo_synced`,`id`,`completed_at`) USING BTREE,
and running the slow query again:
select completions.*
from completions
where
(completed_at is not null)
and completions.is_mongo_synced = 0
order by completions.id asc limit 10;
it takes 156 ms, which is very good.
Checking EXPLAIN EXTENDED, we see that MySQL is using the right index:
id: 1
select_type: SIMPLE
table: completions
type: ref
possible_keys: index_completions_on_completed_at_and_is_mongo_synced_and_id,index_completions_on_is_mongo_synced_and_id_and_completed_at
key: index_completions_on_is_mongo_synced_and_id_and_completed_at
key_len: 2
ref: const
rows: 1626322
filtered: 100
Extra: Using index condition; Using where
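Since the goal is to scan the table in batches, a keyset-style loop along these lines should be able to use the same index. This is only a sketch: @last_id is a session variable introduced here for illustration, and the application would need to update it after each batch.

```sql
-- Start before the first row; @last_id is an illustrative session variable.
SET @last_id = 0;

-- Fetch the next batch of unsynced, completed rows in id order.
SELECT completions.*
FROM completions
WHERE completed_at IS NOT NULL
  AND completions.is_mongo_synced = 0
  AND completions.id > @last_id
ORDER BY completions.id ASC
LIMIT 10;

-- After processing, set @last_id to the largest id returned and repeat
-- until the query returns no rows.
```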