Is it possible to index a computed column used in ORDER BY?

Wig*_*tag 4 mysql index

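My mysql server is having problems because of a query that was not written properly. I am not using an index because I don't know how to index a query that has SELECT COUNT(*) AS b ... ORDER BY b.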

It looks like that is not possible, so if it really isn't, how can I restructure my query?

SELECT COUNT(downloaded.id) AS downloaded_count
    , downloaded.file_name
    ,uploaded.* 
FROM `downloaded` JOIN uploaded 
ON downloaded.file_name = uploaded.file_name 
WHERE downloaded.completed = 1
AND uploaded.active = 1 
AND uploaded.nsfw = 0 
AND downloaded.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY) 
GROUP BY downloaded.file_name 
ORDER BY downloaded_count DESC LIMIT 30;

EXPLAIN

+----+-------------+------------+------+---------------+-----------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table      | type | possible_keys | key       | key_len | ref                      | rows | Extra                                        |
+----+-------------+------------+------+---------------+-----------+---------+--------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | uploaded   | ALL  | file_name_up  | NULL      | NULL    | NULL                     | 3139 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | downloaded | ref  | file_name     | file_name | 767     | piqik.uploaded.file_name |    8 | Using where                                  |
+----+-------------+------------+------+---------------+-----------+---------+--------------------------+------+----------------------------------------------+

Update:

With ORDER BY:

Showing rows 0 - 29 ( 30 total, Query took 0.1639 sec)

Without ORDER BY:

Showing rows 0 - 29 ( 30 total, Query took 0.0064 sec)

Update #2:

Table: uploaded (720.5 KiB total)

CREATE TABLE IF NOT EXISTS `uploaded` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `sid` int(1) NOT NULL,
  `file_name` varchar(255) NOT NULL,
  `file_size` varchar(255) NOT NULL,
  `file_ext` varchar(255) NOT NULL,
  `file_name_keyword` varchar(255) NOT NULL,
  `access_key` varchar(40) NOT NULL,
  `upload_datetime` datetime NOT NULL,
  `last_download` datetime NOT NULL,
  `file_password` varchar(255) NOT NULL DEFAULT '',
  `nsfw` int(1) NOT NULL,
  `votes` int(11) NOT NULL,
  `downloads` int(11) NOT NULL,
  `video_thumbnail` int(1) NOT NULL DEFAULT '0',
  `video_duration` varchar(255) NOT NULL DEFAULT '',
  `video_resolution` varchar(11) NOT NULL,
  `video_additional` varchar(255) NOT NULL DEFAULT '',
  `active` int(1) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `file_name_keyword` (`file_name_keyword`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3328 ;

Table: downloaded (5,152.0 KiB total)

CREATE TABLE IF NOT EXISTS `downloaded` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `file_name` varchar(255) NOT NULL,
  `completed` int(1) NOT NULL,
  `client_ip_addr` varchar(40) NOT NULL,
  `client_access_key` varchar(40) NOT NULL,
  `datetime` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=31475 ;

ype*_*eᵀᴹ 5

You can make the query more efficient in several ways.

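  • First, index the tables. OK, that is not easy to do, and any index should be weighed against all the queries you run on the database. Let's assume this is the only query, or the most critical one.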

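    • We'll break the task down by the 2 tables involved. The first table, uploaded, appears (apart from the join) only in the select list (all of its columns) and in 2 WHERE conditions: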

      AND uploaded.active = 1 
      AND uploaded.nsfw = 0 
      

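      Two simple equality conditions, so the straightforward idea is an index on these two columns or, better, one that also includes the join column (a technicality, because the table uses the MyISAM engine): (active, nsfw, file_name)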

      ALTER TABLE uploaded
        ADD INDEX active_nsfw_fname_IX
          (active, nsfw, file_name) ;
      
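    • Then we go to the downloaded table. This one is more complicated. Apart from file_name (the grouping column), its columns do not appear in the select list (only an aggregate, the count), but they are used in the WHERE, GROUP BY and ORDER BY / LIMIT clauses. Complicating things further, one of the conditions is a range condition (>) rather than an equality: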

      WHERE downloaded.completed = 1
        AND downloaded.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY) 
      GROUP BY downloaded.file_name 
      ORDER BY downloaded_count DESC LIMIT 30 
      

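      If both were equality conditions, I would blindly add an index on (completed, datetime, file_name), but in this case I would first try an index on (completed, file_name, datetime): first the column from the equality check, then the grouping column, and lastly the remaining one.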

      ALTER TABLE downloaded
        ADD INDEX comp_fname_dt_IX
          (completed, file_name, datetime) ;
      
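  • Another improvement that is often possible in queries with ORDER BY and LIMIT is to do that part first in a derived table and only then join the other tables. We can't do exactly that here, but we can try putting this part into a derived table.
    Notice that only columns of the active_nsfw_fname_IX index are used from the uploaded table, and that for the downloaded table the columns of the comp_fname_dt_IX index line up with the order in which the query is (logically) executed (WHERE - GROUP BY - SELECT):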

    ( SELECT COUNT(CASE WHEN d.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY) THEN 1 END)
                 AS downloaded_count
           , d.file_name
      FROM downloaded AS d JOIN uploaded AS u 
        ON d.file_name = u.file_name 
      WHERE d.completed = 1
        AND u.active = 1 
        AND u.nsfw = 0 
      GROUP BY d.file_name 
      ORDER BY downloaded_count DESC LIMIT 30
    )
    

The query finally becomes:

SELECT dc.downloaded_count
     , dc.file_name                 -- you can remove this column from the results
     , up.*                         -- as uploaded has a file_name column
FROM 
    ( SELECT COUNT(CASE WHEN d.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY) THEN 1 END)
                 AS downloaded_count
           , d.file_name
      FROM downloaded AS d JOIN uploaded AS u 
        ON d.file_name = u.file_name 
      WHERE d.completed = 1
        AND u.active = 1 
        AND u.nsfw = 0 
      GROUP BY d.file_name 
      ORDER BY downloaded_count DESC LIMIT 30
    ) AS dc
    JOIN uploaded AS up
      ON dc.file_name = up.file_name 
ORDER BY downloaded_count DESC ;                 -- no need for LIMIT here

Test time! I suggest you add the 2 indexes and then time the queries, both the original and the one above.
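One possible way to run that check, sketched under the assumption that both ALTER TABLE statements from this answer have been applied with the index names used here: list the indexes, then ask for the plan of the derived-table part of the rewritten query. Which plan the optimizer actually picks depends on the MySQL version and on the data, so no expected output is shown.

```sql
-- Confirm that the two new indexes exist
-- (active_nsfw_fname_IX and comp_fname_dt_IX are the names chosen in this answer).
SHOW INDEX FROM uploaded;
SHOW INDEX FROM downloaded;

-- Plan for the inner (derived table) part of the rewritten query.
-- Worth looking at: the `key` column (was an index chosen?) and the `rows` estimates.
EXPLAIN
SELECT COUNT(CASE WHEN d.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY) THEN 1 END)
         AS downloaded_count
     , d.file_name
FROM downloaded AS d JOIN uploaded AS u
  ON d.file_name = u.file_name
WHERE d.completed = 1
  AND u.active = 1
  AND u.nsfw = 0
GROUP BY d.file_name
ORDER BY downloaded_count DESC LIMIT 30;
```

Running the same EXPLAIN on the original query gives a plan to compare against the one posted in the question.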


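And to answer your question: no, an aggregate column cannot be indexed. MySQL has no materialized views, and versions before 5.7 have no computed columns either, which is what would be needed in this case (MariaDB and newer MySQL versions do have persistent computed columns, but those would not help you in this query.)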

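So the only way is to reduce the number of rows that have to be sorted as much as possible, through indexing and rewriting the query - which is basically what I tried to do above.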

You will still see a "filesort" in the EXPLAIN, but sorting 100 rows versus sorting 10k rows makes quite a difference (performance-wise).
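To get a rough idea of how many rows the sort actually handles, the session's Sort% status counters can be read after running a query. This is only a sketch: FLUSH STATUS resets the session counters and needs the RELOAD privilege, and the query in the middle is simply the original one from the question, standing in for whichever version is being tested.

```sql
-- Reset the session status counters (requires the RELOAD privilege).
FLUSH STATUS;

-- Run the query under test; here, the original query from the question.
SELECT COUNT(downloaded.id) AS downloaded_count
     , downloaded.file_name
     , uploaded.*
FROM downloaded JOIN uploaded
  ON downloaded.file_name = uploaded.file_name
WHERE downloaded.completed = 1
  AND uploaded.active = 1
  AND uploaded.nsfw = 0
  AND downloaded.datetime > DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY downloaded.file_name
ORDER BY downloaded_count DESC LIMIT 30;

-- Sort_rows counts the rows that were sorted in this session since the reset;
-- Sort_merge_passes above zero suggests the sort did not fit in the sort buffer.
SHOW SESSION STATUS LIKE 'Sort%';
```

Repeating the same three steps with the rewritten query gives a second set of counters to compare.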