Cha*_*lie 5 mysql database-design
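As a preface, I'm not very experienced with database design. I have a table of hashes and ids. When a group of new hashes is added, every row in the group gets the same id. If any hash in the new group already exists in the database, all hashes in both the new group and the existing group(s) get a new, shared id (effectively merging ids whenever a hash repeats):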
INSERT INTO hashes
(id, hash)
VALUES
($new_id, ...), ($new_id, ...)
ON DUPLICATE KEY UPDATE
repeat_count = repeat_count + 1;
INSERT INTO hashes_lookup SELECT DISTINCT id FROM hashes WHERE hash IN (...);
UPDATE hashes JOIN hashes_lookup USING (id) SET id = '$new_id';
TRUNCATE TABLE hashes_lookup;
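To make the merge rule easier to follow, here is a minimal in-memory sketch of what the three statements above are meant to achieve. It is plain Python rather than MySQL, and `add_group` and `hash_to_id` are hypothetical names used only for illustration.

```python
def add_group(hash_to_id, new_hashes, new_id):
    """Add a group of hashes under new_id, merging every group it overlaps.

    hash_to_id maps each stored hash to its current id (a stand-in for the
    `hashes` table). Returns the set of old ids that were merged away.
    """
    # ids of existing groups that share at least one hash with the new group
    touched = {hash_to_id[h] for h in new_hashes if h in hash_to_id}

    # every hash belonging to a touched group moves to the new shared id
    for h, current_id in hash_to_id.items():
        if current_id in touched:
            hash_to_id[h] = new_id

    # hashes that were not stored yet simply get the new id
    for h in new_hashes:
        hash_to_id[h] = new_id

    return touched


table = {}
add_group(table, ["h1", "h2"], "a")
add_group(table, ["h3"], "b")
merged = add_group(table, ["h2", "h4"], "c")  # h2 repeats, so group a merges into c
print(table, merged)
```

Group `b` is untouched because none of its hashes repeat; only group `a` is folded into `c`.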
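Other tables reference these ids, so when an id changes, the foreign key constraints take care of updating the id in those tables. The problem, however, is that I can't enforce uniqueness in any of the child tables. If I do, my queries fail with:
Foreign key constraint for table '...', record '...' would lead to a duplicate entry in table '...'
The error makes sense given the following test case, where id and value form a composite unique key: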
id | value
---+-------
a | 1
b | 2
c | 1
Then a becomes c:
id | value
---+-------
c | 1
b | 2
c | 1
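But c,1 already exists.
It would be ideal if there were an ON UPDATE IGNORE CASCADE option, so that a cascaded update that would produce a duplicate row is simply ignored. However, I'm fairly sure the real problem here is my database design, so I'm open to any suggestions. My current workaround is to not enforce uniqueness across the child tables, which results in a lot of redundant rows.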
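The test case and the wished-for "ignore" behaviour can be sketched in a few lines of Python. This is only a model of the semantics, not MySQL behaviour: `cascade_rename` and its `ignore_duplicates` flag are hypothetical, and MySQL has no such cascade option.

```python
def cascade_rename(rows, old_id, new_id, ignore_duplicates=False):
    """Rename old_id to new_id in a list of (id, value) rows.

    The (id, value) pair is treated as a composite unique key. Without
    ignore_duplicates the rename fails on a collision, like the cascaded
    update in the question; with it, the colliding row is dropped.
    """
    seen = set()
    result = []
    for row_id, value in rows:
        row = (new_id if row_id == old_id else row_id, value)
        if row in seen:
            if ignore_duplicates:
                continue  # hypothetical IGNORE semantics: drop the duplicate
            raise ValueError(f"duplicate entry {row!r}")
        seen.add(row)
        result.append(row)
    return result


rows = [("a", 1), ("b", 2), ("c", 1)]

try:
    cascade_rename(rows, "a", "c")
except ValueError as exc:
    print("rejected:", exc)

print(cascade_rename(rows, "a", "c", ignore_duplicates=True))
```

The first call fails because the renamed row collides with the existing `("c", 1)`; the second call keeps a single copy of that row.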
Edit:
CREATE TABLE `hashes` (
`hash` char(64) NOT NULL,
`id` varchar(128) NOT NULL,
`repeat_count` int(11) NOT NULL DEFAULT '0',
`insert_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`update_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY `hash` (`hash`) USING BTREE,
KEY `id` (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `emails` (
`id` varchar(128) NOT NULL,
`group_id` char(5) NOT NULL,
`email` varchar(500) NOT NULL,
KEY `index` (`id`) USING BTREE,
UNIQUE KEY `id` (`id`,`group_id`,`email`(255)) USING BTREE,
CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`id`) REFERENCES `hashes` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
We found a solution in chat:
/* Tables (entities is created first so the foreign keys below can resolve) */
CREATE TABLE `entities` (
`group_id` bigint(20) NOT NULL,
`entity_id` bigint(20) NOT NULL,
PRIMARY KEY (`group_id`),
KEY `entity_id` (`entity_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `hashes` (
`group_id` bigint(20) NOT NULL,
`hash` varchar(128) NOT NULL,
`repeat_count` int(11) NOT NULL DEFAULT '0',
UNIQUE KEY `hash` (`hash`),
KEY `group_id` (`group_id`),
CONSTRAINT `hashes_ibfk_1` FOREIGN KEY (`group_id`) REFERENCES `entities` (`group_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `emails` (
`group_id` bigint(20) NOT NULL,
`email` varchar(500) NOT NULL,
UNIQUE KEY `group_id` (`group_id`,`email`) USING BTREE,
CONSTRAINT `emails_ibfk_1` FOREIGN KEY (`group_id`) REFERENCES `entities` (`group_id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `entity_lookup` (
`entity_id` bigint(20) NOT NULL,
PRIMARY KEY (`entity_id`) USING HASH
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
/* Inserting */
START TRANSACTION;
/* Determine next group ID (COALESCE covers the empty-table case, where MAX() is NULL) */
SET @next_group_id = (SELECT COALESCE(MAX(group_id), 0) + 1 FROM entities);
/* Determine next entity ID */
SET @next_entity_id = (SELECT COALESCE(MAX(entity_id), 0) + 1 FROM entities);
/* Merge any entity ids */
INSERT IGNORE INTO entity_lookup SELECT entity_id FROM entities JOIN hashes USING(group_id) WHERE HASH IN(...);
UPDATE entities JOIN entity_lookup USING(entity_id) SET entity_id = @next_entity_id;
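/* DELETE instead of TRUNCATE: TRUNCATE TABLE causes an implicit commit in MySQL, which would end the transaction here */
DELETE FROM entity_lookup;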
/* Add the new group ID to entity_id */
INSERT INTO entities(group_id, entity_id) VALUES(@next_group_id, @next_entity_id);
/* Add new values into hashes */
INSERT INTO hashes (group_id, HASH) VALUES
(@next_group_id, ...)
ON DUPLICATE KEY UPDATE
repeat_count = repeat_count + 1;
/* Add other new values */
INSERT IGNORE INTO emails (group_id, email) VALUES
(@next_group_id, "email1");
COMMIT;
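The idea behind this design is that child rows point at a `group_id` that never changes, while merging only rewrites `entity_id` in `entities`, so no cascaded update can collide with a unique key. Below is a rough, runnable sketch of the same flow using Python's built-in `sqlite3` module in place of MySQL. The column names follow the tables above; the types are simplified, the helpers `open_db`, `add_group` and `emails_for_group` are hypothetical, an `IN` subquery replaces the `entity_lookup` scratch table, and an UPDATE-then-INSERT pair stands in for `ON DUPLICATE KEY UPDATE`.

```python
import sqlite3

# Simplified SQLite version of the entities / hashes / emails tables.
SCHEMA = """
CREATE TABLE entities (
    group_id  INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL
);
CREATE TABLE hashes (
    group_id     INTEGER NOT NULL REFERENCES entities (group_id) ON DELETE CASCADE,
    hash         TEXT    NOT NULL UNIQUE,
    repeat_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE emails (
    group_id INTEGER NOT NULL REFERENCES entities (group_id) ON DELETE CASCADE,
    email    TEXT    NOT NULL,
    UNIQUE (group_id, email)
);
"""


def open_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_group(conn, hashes, emails):
    """Insert one group (non-empty hashes) and merge entities sharing a hash."""
    with conn:  # commits on success, rolls back if an exception escapes
        next_group = conn.execute(
            "SELECT COALESCE(MAX(group_id), 0) + 1 FROM entities").fetchone()[0]
        next_entity = conn.execute(
            "SELECT COALESCE(MAX(entity_id), 0) + 1 FROM entities").fetchone()[0]

        # Merge: every entity that owns one of the incoming hashes gets the new id.
        marks = ",".join("?" * len(hashes))
        conn.execute(
            "UPDATE entities SET entity_id = ? WHERE entity_id IN ("
            " SELECT e.entity_id FROM entities e"
            " JOIN hashes h ON h.group_id = e.group_id"
            f" WHERE h.hash IN ({marks}))",
            [next_entity, *hashes])

        conn.execute("INSERT INTO entities (group_id, entity_id) VALUES (?, ?)",
                     (next_group, next_entity))

        for h in hashes:
            # Existing hash: bump the counter and leave its group_id alone.
            cur = conn.execute(
                "UPDATE hashes SET repeat_count = repeat_count + 1 WHERE hash = ?",
                (h,))
            if cur.rowcount == 0:
                conn.execute("INSERT INTO hashes (group_id, hash) VALUES (?, ?)",
                             (next_group, h))

        # The unique (group_id, email) key can now be enforced safely.
        conn.executemany(
            "INSERT OR IGNORE INTO emails (group_id, email) VALUES (?, ?)",
            [(next_group, e) for e in emails])
    return next_group


def emails_for_group(conn, group_id):
    """Distinct emails of all groups merged into the same entity as group_id."""
    rows = conn.execute(
        "SELECT DISTINCT m.email FROM entities e"
        " JOIN entities peer ON peer.entity_id = e.entity_id"
        " JOIN emails m ON m.group_id = peer.group_id"
        " WHERE e.group_id = ? ORDER BY m.email", (group_id,))
    return [row[0] for row in rows]
```

For example, adding groups with hashes `["h1", "h2"]`, then `["h3"]`, then `["h2", "h3"]` leaves three rows in `entities` that all share one `entity_id`, and `emails_for_group` on any of them returns the combined, de-duplicated email list.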