Using indexes on inner-joined tables in MySQL

Yur*_*lov 10 mysql indexing inner-join large-data

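I have a table Foo with 200 million records and a table Bar with 1000 records; they are joined many-to-one. There are indexes on the columns Foo.someTime and Bar.someField. Also, in Bar, 900 records have someField equal to 1 and 100 have someField equal to 2.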

(1) This query executes immediately:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 1 limit 20;
...
20 rows in set (0.00 sec)

(2) This one takes forever (the only change is b.someField = 2):

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;

(3) But if I remove the where clause on someTime, it also executes immediately:

mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

(4) I can also speed it up by forcing index usage:

mysql> select * from Foo f inner join Bar b force index(someField) on f.table_id = b.table_id where f.someTime     between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)

Here is the EXPLAIN output for query (2) (the one that takes forever):

+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type   | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | g     | range  | bar_id,bar_id_2,someTime      | someTime  | 4       | NULL                     | 95022220 | Using where |
|  1 | SIMPLE      | t     | eq_ref | PRIMARY,someField,bar_id      | PRIMARY   | 4       | db.f.bar_id              |        1 | Using where |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

Here is the EXPLAIN output for query (4) (with force index):

+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type | possible_keys                 | key       | key_len | ref                      | rows     | Extra       |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
|  1 | SIMPLE      | t     | ref  | someField                     | someField | 1       |   const                  |       92 |             |
|  1 | SIMPLE      | g     | ref  | bar_id,bar_id_2,someTime      | bar_id    | 4       | db.f.foo_id              | 10558024 | Using where |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+

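So the question is: how do I teach MySQL to use the right index? The query is generated by an ORM and is not limited to just these two fields. It would also be nice to avoid changing the query (although I'm not sure an inner join is appropriate here).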

UPDATE:

mysql> create index index_name on Foo (bar_id, someTime);

After that, query (2) executes in 0.00 sec.

mvp*_*mvp 5

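If you create a composite index on foo(table_id, sometime), it should help a lot. This is because the server will be able to narrow down the result set by table_id first, and then by sometime.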
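As a rough sketch of what that looks like, assuming the join column really is called table_id as in the queries (the EXPLAIN output in the question calls it bar_id, so adjust to your schema). The index name idx_foo_table_id_sometime is made up for illustration. Running EXPLAIN afterwards lets you check whether the optimizer actually picks the new index:

```sql
-- Hypothetical index name; the column names follow the queries in the question.
-- Leading column: the join key. Second column: the range-filtered timestamp.
CREATE INDEX idx_foo_table_id_sometime ON Foo (table_id, someTime);

-- Re-check the plan for the slow query (2) without changing the query itself.
EXPLAIN
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
LIMIT 20;
```

The column order matters here: with the join key first, each matching Bar row can be looked up in Foo by equality on table_id, and the range on someTime is then applied within that slice of the index.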

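Note that when you use LIMIT, the server does not guarantee which rows will be fetched if many rows match the WHERE constraints. Technically, every execution may give you slightly different results. If you want to avoid that ambiguity, you should always use ORDER BY whenever you use LIMIT. However, this also means you should be more careful about creating appropriate indexes.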
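For example, a deterministic version of query (2) might look like the sketch below. The tie-breaker column f.id is an assumption (the question never shows Foo's primary key), so substitute whatever unique column your table has:

```sql
-- ORDER BY makes the 20 returned rows well-defined.
-- f.id is a hypothetical unique column used only to break ties on someTime.
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
ORDER BY f.someTime, f.id
LIMIT 20;
```

Keep in mind that adding ORDER BY can change which index is best: if the chosen index cannot deliver rows in the requested order, the server may have to sort the matching rows before applying LIMIT, so it is worth re-running EXPLAIN after the change.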