Yur*_*lov 10 mysql indexing inner-join large-data
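I have a table Foo with 200 million records and a table Bar with 1000 records; they are joined many-to-one. The columns Foo.someTime and Bar.someField are indexed. In Bar, 900 records have someField equal to 1 and 100 have someField equal to 2.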
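To make the setup concrete, the two tables might look roughly like this. It is only a sketch: the column types, the `id` primary key on Foo, and the single-column index on the join column are assumptions, since the post only names `table_id`, `someTime` and `someField`.

```sql
-- Hypothetical schema matching the description above; all types are guesses.
CREATE TABLE Bar (
    table_id  INT NOT NULL PRIMARY KEY,
    someField TINYINT NOT NULL,            -- 900 rows hold 1, 100 rows hold 2
    KEY someField (someField)
);

CREATE TABLE Foo (
    id        BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- assumed surrogate key
    table_id  INT NOT NULL,                -- many Foo rows point at one Bar row
    someTime  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    KEY table_id (table_id),               -- assumed index on the join column
    KEY someTime (someTime)
);
```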
(1) This query executes immediately:
mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 1 limit 20;
...
20 rows in set (0.00 sec)
(2) This one takes forever (the only change is b.someField = 2):
mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;
(3) But if I remove the where clause on someTime, it also executes immediately:
mysql> select * from Foo f inner join Bar b on f.table_id = b.table_id where b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)
(4) I can also speed it up by forcing the index:
mysql> select * from Foo f inner join Bar b force index(someField) on f.table_id = b.table_id where f.someTime between '2008-08-14' and '2018-08-14' and b.someField = 2 limit 20;
...
20 rows in set (0.00 sec)
Here is the EXPLAIN output for query (2) (the one that takes forever):
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| 1 | SIMPLE | g | range | bar_id,bar_id_2,someTime | someTime | 4 | NULL | 95022220 | Using where |
| 1 | SIMPLE | t | eq_ref | PRIMARY,someField,bar_id | PRIMARY | 4 | db.f.bar_id | 1 | Using where |
+----+-------------+-------+--------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
Here is the EXPLAIN output for query (4) (with force index):
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
| 1 | SIMPLE | t | ref | someField | someField | 1 | const | 92 | |
| 1 | SIMPLE | g | ref | bar_id,bar_id_2,someTime | bar_id | 4 | db.f.foo_id | 10558024 | Using where |
+----+-------------+-------+------+-------------------------------+-----------+---------+--------------------------+----------+-------------+
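So the question is: how do I teach MySQL to use the right index? The queries are generated by an ORM and are not limited to these two fields. It would also be nice to avoid changing the query (although I am not sure an inner join is appropriate here).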
Update:
mysql> create index index_name on Foo (bar_id, someTime);
After that, query (2) executes in 0.00 sec.
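If you create a composite index on foo(table_id, sometime), it should help a lot. The server can then narrow the result set by table_id first and by sometime second, all within one index.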
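A minimal sketch of that suggestion, assuming the column names used in the queries above. The index name `idx_foo_table_time` is hypothetical, and the plan MySQL actually picks depends on its version and table statistics, so it is worth confirming with EXPLAIN instead of assuming the new index is used.

```sql
-- Composite index: equality (join) column first, range column second.
-- idx_foo_table_time is a made-up name; use whatever fits your conventions.
CREATE INDEX idx_foo_table_time ON Foo (table_id, someTime);

-- Re-check the plan for query (2). The hope is that Bar is read first
-- through its someField index and Foo is then reached through the new
-- composite index instead of a range scan on someTime alone.
EXPLAIN
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
LIMIT 20;
```

The column order matters here: with `table_id` leading, each matching Bar row maps to one contiguous slice of the index that is already sorted by `someTime`, so the date range can be applied without visiting rows outside it.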
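Note that when you use LIMIT and many rows satisfy the WHERE conditions, the server does not guarantee which of those rows it returns. Technically, each execution may give you a slightly different result. To avoid the ambiguity, use ORDER BY whenever you use LIMIT. That, however, means you have to be even more careful about creating appropriate indexes.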
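As an illustration of the point about LIMIT, query (2) could be made deterministic along these lines. The tie-breaker column `f.id` is an assumption (any unique column on Foo would do); the post does not say what Foo's primary key is called.

```sql
-- Same query as (2), but with an explicit, unambiguous ordering.
SELECT *
FROM Foo f
INNER JOIN Bar b ON f.table_id = b.table_id
WHERE f.someTime BETWEEN '2008-08-14' AND '2018-08-14'
  AND b.someField = 2
ORDER BY f.someTime, f.id   -- f.id: assumed unique column, breaks ties on someTime
LIMIT 20;
```

The trade-off is the one mentioned above: if no index can deliver rows in this order, the server may have to collect and sort every matching row before it can return the first 20, so it is worth checking the EXPLAIN output for a filesort.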