Is a cross-table index possible?

Question

Consider a structure where you have a many-to-one (or one-to-many) relationship with conditions (where, order by, etc.) on both tables. For example:

CREATE TABLE tableTwo (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    eventTime DATETIME NOT NULL,
    INDEX (eventTime)
) ENGINE=InnoDB;

CREATE TABLE tableOne (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    tableTwoId INT UNSIGNED NOT NULL,
    objectId INT UNSIGNED NOT NULL,
    INDEX (objectId),
    FOREIGN KEY (tableTwoId) REFERENCES tableTwo (id)
) ENGINE=InnoDB;

And an example query:

select * from tableOne t1 
  inner join tableTwo t2 on t1.tableTwoId = t2.id
  where objectId = '..'
  order by eventTime;

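Let's say you index tableOne.objectId and tableTwo.eventTime. If you then EXPLAIN the query above, it shows "Using filesort". Essentially, it first applies the tableOne.objectId index, but it cannot apply the tableTwo.eventTime index, because that index covers the whole of tableTwo (not the limited result set), so it has to sort the rows manually.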
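
To reproduce that observation, the plan can be inspected with EXPLAIN. This is only a sketch: the literal objectId value 1 is an assumption (the query above elides it), and the exact plan depends on the MySQL version and the data.

```sql
-- Assumed filter value: objectId = 1 (the original query elides the value).
explain
select * from tableOne t1
  inner join tableTwo t2 on t1.tableTwoId = t2.id
  where t1.objectId = 1
  order by t2.eventTime;
-- Check the Extra column of the output for "Using filesort".
```

If "Using filesort" appears there, MySQL is sorting the joined rows itself rather than reading them in index order.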

So, is there a way to create a cross-table index, so that a filesort is not needed every time results are retrieved? Something like:

create index ind_t1oi_t2et on tableOne t1 
  inner join tableTwo t2 on t1.tableTwoId = t2.id 
  (t1.objectId, t2.eventTime);

Additionally, I've looked into creating a view and indexing that, but views don't support indexes.

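If cross-table indexes aren't possible, the solution I've been leaning towards is to replicate the conditional data in one table. In this case, that means eventTime would be replicated in tableOne, and a multi-column index would be set up on tableOne.objectId and tableOne.eventTime (essentially creating the index manually). However, I thought I'd seek out other people's experience first to see whether that is the best approach.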

Thanks very much!

Update:

Here are some procedures for loading test data and comparing results:

drop procedure if exists populate_table_two;
delimiter #
create procedure populate_table_two(IN numRows int)
begin
declare v_counter int unsigned default 0;
  while v_counter < numRows do
    insert into tableTwo (eventTime) 
    values (CURRENT_TIMESTAMP - interval 0 + floor(0 + rand()*1000) minute);
    set v_counter=v_counter+1;
  end while;
end #
delimiter ;

drop procedure if exists populate_table_one;
delimiter #
create procedure populate_table_one
   (IN numRows int, IN maxTableTwoId int, IN maxObjectId int)
begin
declare v_counter int unsigned default 0;
  while v_counter < numRows do
    insert into tableOne (tableTwoId, objectId) 
      values (floor(1 +(rand() * maxTableTwoId)), 
              floor(1 +(rand() * maxObjectId)));
    set v_counter=v_counter+1;
  end while;
end #
delimiter ;

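You can use these to populate 10,000 rows in tableTwo and 20,000 rows in tableOne (with random references to tableTwo and random objectIds between 1 and 5), which took 26.2 and 70.77 seconds respectively to run: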

call populate_table_two(10000);
call populate_table_one(20000, 10000, 5);

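Update 2 (tested trigger SQL):

Below is tried-and-tested SQL based on daniHp's trigger method. It keeps the dateTime in sync when a row is added to tableOne or a row in tableTwo is updated. This method should also work for many-to-many relationships if the condition columns are copied to the joining table. In my testing with 300,000 rows in tableOne and 200,000 rows in tableTwo, the old query with similar limits took 0.12 seconds, while the new query still shows as 0.00 seconds. So there is a clear improvement, and this method should perform well into the millions of rows and beyond.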

alter table tableOne add column tableTwo_eventTime datetime;

create index ind_t1_oid_t2et on tableOne (objectId, tableTwo_eventTime);

drop TRIGGER if exists t1_copy_t2_eventTime;
delimiter #
CREATE TRIGGER t1_copy_t2_eventTime
   BEFORE INSERT ON tableOne
for each row
begin
  set NEW.tableTwo_eventTime = (select eventTime 
       from tableTwo t2
       where t2.id = NEW.tableTwoId);
end #
delimiter ;

drop TRIGGER if exists upd_t1_copy_t2_eventTime;
delimiter #
CREATE TRIGGER upd_t1_copy_t2_eventTime
   BEFORE UPDATE ON tableTwo
for each row
begin
  update tableOne 
    set tableTwo_eventTime = NEW.eventTime 
    where tableTwoId = NEW.id;
end #
delimiter ;
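
Two gaps seem worth covering if this is applied to a table that already has data. Rows that existed before the ALTER TABLE have no copied value yet, and changing tableOne.tableTwoId on an existing row is not handled by the two triggers above. A minimal, untested sketch of one way to close both gaps follows; the trigger name t1_resync_t2_eventTime is made up for illustration.

```sql
-- One-time backfill: copy eventTime into rows that existed before
-- the tableTwo_eventTime column was added.
update tableOne t1
  inner join tableTwo t2 on t2.id = t1.tableTwoId
  set t1.tableTwo_eventTime = t2.eventTime;

-- Hypothetical extra trigger: re-copy eventTime when a tableOne row
-- is re-pointed at a different tableTwo row.
drop trigger if exists t1_resync_t2_eventTime;
delimiter #
create trigger t1_resync_t2_eventTime
   before update on tableOne
for each row
begin
  -- Only look up the parent row when the reference actually changed.
  if NEW.tableTwoId <> OLD.tableTwoId then
    set NEW.tableTwo_eventTime = (select t2.eventTime
         from tableTwo t2
         where t2.id = NEW.tableTwoId);
  end if;
end #
delimiter ;
```

The guard on tableTwoId keeps the lookup from running when the tableTwo update trigger above rewrites tableTwo_eventTime.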

And the updated query:

select * from tableOne t1 
  inner join tableTwo t2 on t1.tableTwoId = t2.id
  where t1.objectId = 1
  order by t1.tableTwo_eventTime desc limit 0,10;

Answer 1

dan*_*era 8

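As you know, SQLServer achieves this with indexed views:

Indexed views provide additional performance benefits that cannot be achieved using standard indexes. Indexed views can increase query performance in the following ways:

Aggregations can be precomputed and stored in the index to minimize expensive computations during query execution.

Tables can be prejoined and the resulting data set stored.

Combinations of joins or aggregations can be stored.

In SQLServer, to take advantage of this technique, you must query the view and not the tables. That means you should know about the view and the indexes.

MySQL does not have indexed views, but you can simulate the behavior with table + triggers + indexes.

Instead of creating a view, you must create an indexed table, a trigger to keep that table up to date, and then you must query the new table instead of your normalized tables.

You must evaluate whether the overhead on write operations offsets the improvement in read operations.

Edited:

Note that it is not always necessary to create a new table. For example, in a 1:N relationship (master-detail), you can use a trigger to keep a copy of a field from the "master" table in the "detail" table. In your case: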
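
A rough, untested sketch of the table + trigger + index approach for the tables in the question. The table name tableOneByEvent and the trigger name tableOne_fill_by_event are invented for illustration, and only inserts are handled; updates and deletes on either table would need similar triggers.

```sql
-- Hypothetical pre-joined table, indexed for "where objectId ... order by eventTime".
CREATE TABLE tableOneByEvent (
    tableOneId INT UNSIGNED PRIMARY KEY,
    objectId INT UNSIGNED NOT NULL,
    eventTime DATETIME NOT NULL,
    INDEX (objectId, eventTime)
) ENGINE=InnoDB;

delimiter #
CREATE TRIGGER tableOne_fill_by_event
   AFTER INSERT ON tableOne
for each row
begin
  -- Store the joined row once, at write time.
  insert into tableOneByEvent (tableOneId, objectId, eventTime)
    select NEW.id, NEW.objectId, t2.eventTime
    from tableTwo t2
    where t2.id = NEW.tableTwoId;
end #
delimiter ;

-- Reads then go to the pre-joined table (objectId = 1 is an assumed value).
select tableOneId, eventTime
  from tableOneByEvent
  where objectId = 1
  order by eventTime;
```

Every insert into tableOne now costs an extra insert, which is the write overhead mentioned above.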

CREATE TABLE tableOne (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    tableTwoId INT UNSIGNED NOT NULL,
    objectId INT UNSIGNED NOT NULL,
    desnormalized_eventTime DATETIME NOT NULL,
    INDEX (objectId, desnormalized_eventTime),
    FOREIGN KEY (tableTwoId) REFERENCES tableTwo (id)
) ENGINE=InnoDB;

delimiter #
CREATE TRIGGER tableOne_desnormalized_eventTime
   BEFORE INSERT ON tableOne
for each row
begin
  -- Copy eventTime from the referenced tableTwo row into the new tableOne row.
  SET NEW.desnormalized_eventTime =
      (select t2.eventTime
       from tableTwo t2
       where t2.id = NEW.tableTwoId);
end #
delimiter ;

Note that this is a before-insert trigger.

Now, the query is rewritten as follows:

select * from tableOne t1 
  inner join tableTwo t2 on t1.tableTwoId = t2.id
  where t1.objectId = '..'
  order by t1.desnormalized_eventTime;

Disclaimer: not tested.
