I know that indexes can help speed up joins between two or more tables. The following example joins two tables, emps and depts, on their shared department_id column:
select last_name, department_name
from emps join depts
using(department_id);
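My question is: will indexing the department_id column on just one of the two tables speed up this query, or must I create indexes on the department_id columns of both tables to see a performance improvement?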
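Both tables will usually already have an index on department_id, as this should be the primary key of depts and a foreign key in emps. (Primary keys are always indexed; whether a foreign key column gets an index automatically depends on the DBMS. MySQL's InnoDB creates one, whereas Oracle and PostgreSQL do not.)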
In your query, though, it is rather unlikely that the indexes will be used. Why should the DBMS bother to traverse index trees when it has to read every record anyway? Simple sequential full table scans followed by, for instance, a hash join will usually be much faster.
Let's look at another example:
select e.last_name, d.department_name
from emps e
join depts d on d.department_id = e.department_id
where e.first_name = 'Laura';
Here, we are only interested in a few employees. This is where indexes come into play. We'll want an index on emps(first_name). With it we can find the employee record, read its department_id, and then access the associated depts record.
Having said that, we notice that we use the index to look up the table record, only to read the department_id from it. Wouldn't it be faster to get the department_id right from the index? Yes, it would. So the index should be on emps(first_name, department_id).
The depts primary key is department_id, so this column is indexed, and we can easily find the depts record with the department name.
But we can ask the same question again: Can't we get the name right from the index, too? This leads us to covering indexes that contain all columns used in a query.
So, while
create index idx_emps on emps(first_name, department_id);
create index idx_depts on depts(department_id);
suffice, we can make the query faster still with these covering indexes:
create index idx_emps on emps(first_name, department_id, last_name);
create index idx_depts on depts(department_id, department_name);
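
To see the difference between the two index sets for yourself, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in, since the question doesn't name a DBMS; the employee_id column and the sample rows are made up for illustration. Other DBMS have different optimizers and show their plans differently, so treat this as an illustration rather than a benchmark.

```python
import sqlite3

QUERY = """
select e.last_name, d.department_name
from emps e
join depts d on d.department_id = e.department_id
where e.first_name = 'Laura'
"""

def plan_for(index_ddl):
    """Build a tiny in-memory database, create the given indexes,
    and return the query plan details plus the query result."""
    con = sqlite3.connect(":memory:")
    # Hypothetical schema and sample data, not taken from the question.
    con.executescript("""
        create table depts (department_id integer primary key,
                            department_name text);
        create table emps (employee_id integer primary key,
                           first_name text,
                           last_name text,
                           department_id integer references depts(department_id));
        insert into depts values (10, 'Sales'), (20, 'IT');
        insert into emps values (1, 'Laura', 'Smith', 10), (2, 'Tom', 'Jones', 20);
    """)
    for ddl in index_ddl:
        con.execute(ddl)
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    details = [row[3] for row in con.execute("explain query plan " + QUERY)]
    rows = con.execute(QUERY).fetchall()
    con.close()
    return details, rows

plain_plan, plain_rows = plan_for([
    "create index idx_emps on emps(first_name, department_id)",
])
covering_plan, covering_rows = plan_for([
    "create index idx_emps on emps(first_name, department_id, last_name)",
    "create index idx_depts on depts(department_id, department_name)",
])

print("plain indexes:   ", plain_plan)
print("covering indexes:", covering_plan)
```

In SQLite's plan output, the wording "USING COVERING INDEX" means the table itself is not touched for that step. Whether the optimizer also picks idx_depts over the primary key lookup on depts is its own decision and may vary between versions.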