Iva*_*van 5 mysql sql query-optimization
Tables
stores (100,000 rows): id (pk), name, lat, lng, ...
store_items (9,000,000 rows): store_id (fk), item_id (fk)
items (200,000 rows): id (pk), name, ...
item_words (1,000,000 rows): item_id (fk), word_id (fk)
words (50,000 rows): id (pk), word VARCHAR(255)
Note: all ids are integers.
========
Indexes
CREATE UNIQUE INDEX storeitems_storeid_itemid_i ON store_items(store_id,item_id);
CREATE UNIQUE INDEX itemwords_wordid_itemid_i ON item_words(word_id,item_id);
CREATE UNIQUE INDEX words_word_i ON words(word);
Note: I prefer multi-column indexes (storeitems_storeid_itemid_i and itemwords_wordid_itemid_i) because of: http://www.mysqlperformanceblog.com/2008/08/22/multiple-column-index-vs-multiple-indexes/
select s.name, s.lat, s.lng, i.name
from words w, item_words iw, items i, store_items si, stores s
where iw.word_id=w.id
and i.id=iw.item_id
and si.item_id=i.id
and s.id=si.store_id
and w.word='MILK';
explain $QUERY$
+----+-------------+-------+--------+--------------------------------------------------+---------------------------+---------+-------------+------+-------------+
| id | select_type | table | type   | possible_keys                                    | key                       | key_len | ref         | rows | Extra       |
+----+-------------+-------+--------+--------------------------------------------------+---------------------------+---------+-------------+------+-------------+
|  1 | SIMPLE      | w     | const  | PRIMARY,words_word_i                             | words_word_i              | 257     | const       |    1 | Using index |
|  1 | SIMPLE      | iw    | ref    | itemwords_wordid_itemid_i,itemwords_itemid_fk    | itemwords_wordid_itemid_i | 4       | const       |    1 | Using index |
|  1 | SIMPLE      | i     | eq_ref | PRIMARY                                          | PRIMARY                   | 4       | iw.item_id  |    1 |             |
|  1 | SIMPLE      | si    | ref    | storeitems_storeid_itemid_i,storeitems_itemid_fk | storeitems_itemid_fk      | 4       | iw.item_id  |   16 | Using index |
|  1 | SIMPLE      | s     | eq_ref | PRIMARY                                          | PRIMARY                   | 4       | si.store_id |    1 |             |
+----+-------------+-------+--------+--------------------------------------------------+---------------------------+---------+-------------+------+-------------+
==============
I tried to see how the execution time grows as tables are added to the query.
select * from words where word='MILK';
Elapsed time: 0.4 sec
select count(*)
from words w, item_words iw
where iw.word_id=w.id
and w.word='MILK';
Elapsed time: 0.5-2 sec (depending on word)
select count(*)
from words w, item_words iw, items i
where iw.word_id=w.id
and i.id=iw.item_id
and w.word='MILK';
Elapsed time: 0.5-2 sec (depending on word)
select count(*)
from words w, item_words iw, items i, store_items si
where iw.word_id=w.id
and i.id=iw.item_id
and si.item_id=i.id
and w.word='MILK';
Elapsed time: 20-120 sec (depending on word)
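I suspect the problem lies in the indexes or in the design of the query/database. But there must be a way to make this fast. Google manages it somehow, and their tables are much bigger!
You have no index that can be used to look up store_id when item_id is given. If the cardinality of store_id were low enough, the query might get some benefit from storeitems_storeid_itemid_i, but since you have 100,000 stores that is probably not much help. You could try creating an index on store_items that lists item_id first: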
CREATE UNIQUE INDEX storeitems_item_store ON store_items(item_id, store_id);
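To check whether the new index is actually picked up, you could re-run EXPLAIN on just the store_items step. This is only a sketch: the item id 42 is a made-up placeholder, and the plan you get depends on your MySQL version and table statistics.

```sql
-- Look up the stores that carry a single item (42 is a placeholder id).
EXPLAIN
SELECT si.store_id
FROM store_items si
WHERE si.item_id = 42;
```

If the key column shows storeitems_item_store and Extra shows Using index, the lookup is being answered from the index alone, without touching the table rows.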
Also, I'm not sure whether putting the join conditions in the where clause, as you have done, hurts performance, but you could try changing the query to something like this:
select s.name, s.lat, s.lng, i.name
from words w LEFT JOIN item_words iw ON w.id=iw.word_id
LEFT JOIN items i ON i.id=iw.item_id
LEFT JOIN store_items si ON si.item_id=i.id
LEFT JOIN stores s ON s.id=si.store_id
where w.word='MILK';
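Note that LEFT JOIN is not quite equivalent to the original comma joins: it also returns a row (padded with NULLs) when a word has no matching items or stores. If you want to keep the original semantics, a variant with INNER JOIN might look like the sketch below. Running EXPLAIN on it and comparing the output with the plan in the question should show whether the join syntax makes any difference on your server.

```sql
-- Explicit-join form of the original query; INNER JOIN keeps only matching rows,
-- just like the comma joins with conditions in the WHERE clause.
EXPLAIN
SELECT s.name, s.lat, s.lng, i.name
FROM words w
INNER JOIN item_words iw ON iw.word_id = w.id
INNER JOIN items i ON i.id = iw.item_id
INNER JOIN store_items si ON si.item_id = i.id
INNER JOIN stores s ON s.id = si.store_id
WHERE w.word = 'MILK';
```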