egg*_*yal 13 mysql sql database-design query-optimization data-structures
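I have a fairly stable directed graph of order ~100k vertices and size ~1k edges. It is two-dimensional insofar as its vertices can be identified by a pair of integers (x, y) (of cardinality ~100 × ~1000), and all edges are strictly increasing in x.
There is furthermore a dictionary of ~1k (key, val) pairs associated with each vertex.
I currently store the graph in a MySQL database across three (InnoDB) tables: a table of vertices (which I don't think is relevant to my question, so I have omitted both it and the foreign key constraints that refer to it from my extracts below); a table which holds the dictionaries; and a "closure table" of connected vertices, as described so eloquently by Bill Karwin.
The table of vertex dictionaries is defined as follows: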
CREATE TABLE `VertexDictionary` (
`x` smallint(6) unsigned NOT NULL,
`y` smallint(6) unsigned NOT NULL,
`key` varchar(50) NOT NULL DEFAULT '',
`val` smallint(1) DEFAULT NULL,
PRIMARY KEY (`x`, `y` , `key`),
KEY `dict` (`x`, `key`, `val`)
);
And the closure table of connected vertices:
CREATE TABLE `ConnectedVertices` (
`tail_x` smallint(6) unsigned NOT NULL,
`tail_y` smallint(6) unsigned NOT NULL,
`head_x` smallint(6) unsigned NOT NULL,
`head_y` smallint(6) unsigned NOT NULL,
PRIMARY KEY (`tail_x`, `tail_y`, `head_x`),
KEY `reverse` (`head_x`, `head_y`, `tail_x`),
KEY `fx` (`tail_x`, `head_x`),
KEY `rx` (`head_x`, `tail_x`)
);
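There is also a dictionary of (x, key) pairs such that, for each such pair, all vertices identified with that x have a value for that key within their dictionaries. This dictionary is stored in a fourth table: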
CREATE TABLE `SpecialKeys` (
`x` smallint(6) unsigned NOT NULL,
`key` varchar(50) NOT NULL DEFAULT '',
PRIMARY KEY (`x`),
KEY `xkey` (`x`, `key`)
);
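I often wish to extract the set of keys used in the dictionaries of all vertices having a particular x=X, together with the associated values of any SpecialKeys of the vertices connected to them on the left: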
SELECT DISTINCT
`v`.`key`,
`u`.`val`
FROM
`ConnectedVertices` AS `c`
JOIN `VertexDictionary` AS `u` ON (`u`.`x`, `u`.`y` ) = (`c`.`tail_x`, `c`.`tail_y`)
JOIN `VertexDictionary` AS `v` ON (`v`.`x`, `v`.`y` ) = (`c`.`head_x`, `c`.`head_y`)
JOIN `SpecialKeys` AS `k` ON (`k`.`x`, `k`.`key`) = (`u`.`x`, `u`.`key`)
WHERE
`v`.`x` = X
;
The EXPLAIN output for the query above is:
id  select_type  table  type    possible_keys          key      key_len  ref                               rows  Extra
1   SIMPLE       k      index   PRIMARY,xkey           xkey     154      NULL                              40    Using index; Using temporary
1   SIMPLE       c      ref     PRIMARY,reverse,fx,rx  PRIMARY  2        db.k.x                            1     Using where
1   SIMPLE       v      ref     PRIMARY,dict           PRIMARY  4        const,db.c.head_y                 136   Using index
1   SIMPLE       u      eq_ref  PRIMARY,dict           PRIMARY  156      db.c.tail_x,db.c.tail_y,db.k.key  1     Using where
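But this query takes ~10s to complete. I have been banging my head against a brick wall trying to improve matters, but to no avail.
Can the query be improved, or should I consider a different data structure? Extremely grateful for your thoughts!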
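To make the intent of the query concrete, here is a minimal sketch on toy data. The rows are hypothetical (they are not taken from the real data set), and the sketch assumes empty tables created from the definitions above, without the omitted foreign key constraints.

```sql
-- Hypothetical toy data: a single edge from vertex (1, 1) to vertex (2, 1).
INSERT INTO `ConnectedVertices` (`tail_x`, `tail_y`, `head_x`, `head_y`)
VALUES (1, 1, 2, 1);

-- Each vertex gets a two-entry dictionary.
INSERT INTO `VertexDictionary` (`x`, `y`, `key`, `val`) VALUES
  (1, 1, 'color',  7),
  (1, 1, 'size',   3),
  (2, 1, 'shape',  5),
  (2, 1, 'weight', 9);

-- 'color' is the special key for x = 1.
INSERT INTO `SpecialKeys` (`x`, `key`) VALUES (1, 'color');

-- The query from above with X = 2.
SELECT DISTINCT
  `v`.`key`,
  `u`.`val`
FROM
  `ConnectedVertices` AS `c`
  JOIN `VertexDictionary` AS `u` ON (`u`.`x`, `u`.`y`) = (`c`.`tail_x`, `c`.`tail_y`)
  JOIN `VertexDictionary` AS `v` ON (`v`.`x`, `v`.`y`) = (`c`.`head_x`, `c`.`head_y`)
  JOIN `SpecialKeys` AS `k` ON (`k`.`x`, `k`.`key`) = (`u`.`x`, `u`.`key`)
WHERE
  `v`.`x` = 2;
-- Rows returned (order not guaranteed): ('shape', 7), ('weight', 7)
```

Each key of the head vertex (2, 1) is paired with the value that the tail vertex (1, 1) holds for its special key 'color'; the non-special key 'size' is filtered out by the join on `SpecialKeys`.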
UPDATE
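I'm still getting nowhere with this, although I did rebuild the tables and found the EXPLAIN output to be slightly different (as now shown above, the number of rows fetched from v has increased from 1 to 136!); the query still takes ~10s to execute.
I really don't understand what's going on here. Queries to obtain all (x, y, SpecialValue) tuples and all (x, y, key) tuples are both very fast (~30ms and ~150ms respectively), yet essentially joining the two takes over fifty times longer than their combined time... How can I reduce the time taken to perform that join?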
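The two fast queries are not shown in the post. As a sketch only, assuming they look roughly like the statements below (with 42 as a hypothetical value of X), the join between them can be spelled out with derived tables so that each half can be timed separately:

```sql
-- Assumed form of the "(x, y, SpecialValue)" query:
-- the value each vertex holds for the special key of its x.
SELECT u.x, u.y, u.val
FROM VertexDictionary AS u
JOIN SpecialKeys AS k ON (k.x, k.`key`) = (u.x, u.`key`);

-- Assumed form of the "(x, y, key)" query:
-- every key used by the vertices with x = 42 (hypothetical X).
SELECT v.x, v.y, v.`key`
FROM VertexDictionary AS v
WHERE v.x = 42;

-- The two halves joined through the closure table.
-- This is logically equivalent to the original query.
SELECT DISTINCT h.`key`, t.val
FROM (
  SELECT u.x, u.y, u.val
  FROM VertexDictionary AS u
  JOIN SpecialKeys AS k ON (k.x, k.`key`) = (u.x, u.`key`)
) AS t
JOIN ConnectedVertices AS c ON (c.tail_x, c.tail_y) = (t.x, t.y)
JOIN (
  SELECT v.x, v.y, v.`key`
  FROM VertexDictionary AS v
  WHERE v.x = 42
) AS h ON (h.x, h.y) = (c.head_x, c.head_y);
```

Whether this form is any faster would need measuring. Older MySQL versions materialize derived tables into temporary tables without indexes, so it may well not be, but it does make it easier to see which step dominates the runtime.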
The output of SHOW VARIABLES LIKE '%innodb%'; is as follows:
Variable_name                    Value
------------------------------------------------------------
have_innodb                      YES
ignore_builtin_innodb            ON
innodb_adaptive_flushing         ON
innodb_adaptive_hash_index       ON
innodb_additional_mem_pool_size  2097152
innodb_autoextend_increment      8
innodb_autoinc_lock_mode         1
innodb_buffer_pool_size          1179648000
innodb_change_buffering          inserts
innodb_checksums                 ON
innodb_commit_concurrency        0
innodb_concurrency_tickets       500
innodb_data_file_path            ibdata1:10M:autoextend
innodb_data_home_dir             /rdsdbdata/db/innodb
innodb_doublewrite               ON
innodb_fast_shutdown             1
innodb_file_format               Antelope
innodb_file_format_check         Barracuda
innodb_file_per_table            ON
innodb_flush_log_at_trx_commit   1
innodb_flush_method              O_DIRECT
innodb_force_recovery            0
innodb_io_capacity               200
innodb_lock_wait_timeout         50
innodb_locks_unsafe_for_binlog   OFF
innodb_log_buffer_size           8388608
innodb_log_file_size             134217728
innodb_log_files_in_group        2
innodb_log_group_home_dir        /rdsdbdata/log/innodb
innodb_max_dirty_pages_pct       75
innodb_max_purge_lag             0
innodb_mirrored_log_groups       1
innodb_old_blocks_pct            37
innodb_old_blocks_time           0
innodb_open_files                300
innodb_read_ahead_threshold      56
innodb_read_io_threads           4
innodb_replication_delay         0
innodb_rollback_on_timeout       OFF
innodb_spin_wait_delay           6
innodb_stats_method              nulls_equal
innodb_stats_on_metadata         ON
innodb_stats_sample_pages        8
innodb_strict_mode               OFF
innodb_support_xa                ON
innodb_sync_spin_loops           30
innodb_table_locks               ON
innodb_thread_concurrency        0
innodb_thread_sleep_delay        10000
innodb_use_sys_malloc            ON
innodb_version                   1.0.16
innodb_write_io_threads          4
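Others may disagree, but I have used, and regularly suggest, STRAIGHT_JOIN for queries... once you know the data and the relationships. Since your WHERE clause is against the "v" table alias and its "x" value, you can make good use of the index. Move that table to the front position, then join from there.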
SELECT STRAIGHT_JOIN DISTINCT
v.`key`,
u.`val`
FROM
VertexDictionary AS v
JOIN ConnectedVertices AS c
ON v.x = c.head_x
AND v.y = c.head_y
JOIN VertexDictionary AS u
ON c.tail_x = u.x
AND c.tail_y = u.y
JOIN SpecialKeys AS k
ON u.x = k.x
AND u.key = k.key
WHERE
v.x = {some value}
I'd be curious to know how this adjustment works out for you.
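One way to check whether the forced join order is actually in effect is to prefix the statement with EXPLAIN and compare the result with the plan in the question. A minimal sketch, using 42 as a hypothetical stand-in for the x value:

```sql
-- 42 is a placeholder for the x value of interest (an assumption, not from the post).
EXPLAIN
SELECT STRAIGHT_JOIN DISTINCT
      v.`key`,
      u.`val`
   FROM
      VertexDictionary AS v
         JOIN ConnectedVertices AS c
            ON v.x = c.head_x
            AND v.y = c.head_y
         JOIN VertexDictionary AS u
            ON c.tail_x = u.x
            AND c.tail_y = u.y
         JOIN SpecialKeys AS k
            ON u.x = k.x
            AND u.`key` = k.`key`
   WHERE
      v.x = 42;
```

With STRAIGHT_JOIN, the tables should be read in the order they are listed (v, c, u, k) rather than starting from k as in the original plan. It is also worth checking the key column for c, since the join on head_x and head_y matches the leading columns of the `reverse` index.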