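I have an SQLite database with two tables, each with 50,000 rows, containing the names of (fake) people. I've built a simple query to find out how many names (given name, middle initial, surname) are common to both tables: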
select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial;
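For anyone who wants to try the query without the original data files, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the query above; the `id` column, the three sample rows per table, and the helper names `build_demo_db` and `count_shared_names` are assumptions made up for illustration, not part of the real database.

```python
import sqlite3

# Same join as in the question, split over several lines for readability.
QUERY = """
SELECT count(*)
FROM fakenames_uk
INNER JOIN fakenames_usa
  ON fakenames_uk.givenname = fakenames_usa.givenname
 AND fakenames_uk.surname = fakenames_usa.surname
 AND fakenames_uk.middleinitial = fakenames_usa.middleinitial
"""

# Assumed schema: only the three name columns appear in the question.
COLUMNS = "id INTEGER PRIMARY KEY, givenname TEXT, middleinitial TEXT, surname TEXT"
INSERT = "INSERT INTO {} (givenname, middleinitial, surname) VALUES (?, ?, ?)"


def build_demo_db():
    """Create an in-memory database with two tiny, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE fakenames_uk ({COLUMNS})")
    conn.execute(f"CREATE TABLE fakenames_usa ({COLUMNS})")
    uk_rows = [("Alice", "B", "Smith"), ("John", "Q", "Jones"), ("Mary", "A", "Brown")]
    # "John Jones" differs in the middle initial, so it must not match.
    usa_rows = [("Alice", "B", "Smith"), ("John", "R", "Jones"), ("Mary", "A", "Brown")]
    conn.executemany(INSERT.format("fakenames_uk"), uk_rows)
    conn.executemany(INSERT.format("fakenames_usa"), usa_rows)
    return conn


def count_shared_names(conn):
    """Run the join and return the number of names present in both tables."""
    return conn.execute(QUERY).fetchone()[0]


if __name__ == "__main__":
    print(count_shared_names(build_demo_db()))  # prints 2
```

With data this small the timing problem will not show up; the sketch only pins down what the query counts.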
When there are no indexes other than those on the primary keys (which are irrelevant to this query), it runs quickly:
[james@marlon Downloads] $ time sqlite3 generic_data_no_indexes.sqlite "select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial;"
131
real 0m0.115s
user 0m0.111s
sys 0m0.004s
But if I add an index on each of the three columns in each table (six indexes in total):
CREATE INDEX `idx_uk_givenname` ON `fakenames_uk` (`givenname` )
//etc.
Then it runs very slowly:
[james@marlon Downloads] $ time sqlite3 generic_data.sqlite "select count(*) from fakenames_uk inner join fakenames_usa on fakenames_uk.givenname=fakenames_usa.givenname and fakenames_uk.surname=fakenames_usa.surname and fakenames_uk.middleinitial=fakenames_usa.middleinitial;"
131
real 1m43.102s
user 0m52.397s
sys …
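One way to see what changes between the two databases is to ask SQLite which access path it picks for the join. The sketch below, again using Python's built-in sqlite3 module, creates six single-column indexes and prints the output of `EXPLAIN QUERY PLAN`. Only the name `idx_uk_givenname` appears in the question; the other five index names, the `id` column, and the helper names are assumptions that follow the same pattern. The tables here are empty, so the plan chosen may differ from the one SQLite uses on the real 50,000-row data.

```python
import sqlite3

TABLES = ("fakenames_uk", "fakenames_usa")
NAME_COLUMNS = ("givenname", "middleinitial", "surname")

QUERY = """
SELECT count(*)
FROM fakenames_uk
INNER JOIN fakenames_usa
  ON fakenames_uk.givenname = fakenames_usa.givenname
 AND fakenames_uk.surname = fakenames_usa.surname
 AND fakenames_uk.middleinitial = fakenames_usa.middleinitial
"""


def build_indexed_db():
    """Create both (empty) tables plus one index per name column."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, "
            "givenname TEXT, middleinitial TEXT, surname TEXT)"
        )
        short = table.split("_")[1]  # "uk" or "usa"
        for column in NAME_COLUMNS:
            # Assumed naming scheme, modelled on idx_uk_givenname.
            conn.execute(f"CREATE INDEX idx_{short}_{column} ON {table} ({column})")
    return conn


def index_names(conn):
    """List the user-created indexes recorded in sqlite_master."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def query_plan(conn):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return [row[-1] for row in rows]


if __name__ == "__main__":
    conn = build_indexed_db()
    print(index_names(conn))
    for step in query_plan(conn):
        print(step)
```

Running the same `EXPLAIN QUERY PLAN` statement against generic_data.sqlite and generic_data_no_indexes.sqlite should show how the two plans differ.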