Guy*_*Guy 5 mysql performance database-design
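How can I optimize the following SELECT WHERE IN situation?
I have a table with more than 100 million rows and only 3 columns. The primary key (col1) is a varchar(127). I am running SELECT col1 WHERE col1 IN (...), where the IN clause contains 5,000 strings. I just want to see which of the 5,000 strings exist in the database as primary keys.
On a dedicated server with an InnoDB table, the query takes 3 to 10 seconds, which is unacceptable. I did not think 100+ million rows would be too hard for MySQL, even when selecting 5k rows, but maybe I am wrong?
What can be done to optimize this? I have read a bit about FULLTEXT keys; since the key is a varchar(127), would those be better? Or would some kind of JOIN or UNION run faster than the large IN clause?
Any help would be greatly appreciated. Thanks!
- - EDIT - -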
SHOW ENGINE INNODB STATUS;
| InnoDB | |
=====================================
2014-07-14 10:59:19 2bf5cf25700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 5 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 6664 srv_active, 0 srv_shutdown, 142740 srv_idle
srv_master_thread log flush and writes: 149372
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 417120
OS WAIT ARRAY INFO: signal count 449454
Mutex spin waits 323558, rounds 2089912, OS waits 48403
RW-shared spins 49101, rounds 462555, OS waits 12976
RW-excl spins 406820, rounds 11261153, OS waits 350839
Spin rounds per wait: 6.46 mutex, 9.42 RW-shared, 27.68 RW-excl
------------
TRANSACTIONS
------------
Trx id counter 21503
Purge done for trx's n:o < 21472 undo n:o < 0 state: running but idle
History list length 641
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 15895, OS thread handle 0x2bf5cf25700, query id 399305 localhost root init
SHOW ENGINE INNODB STATUS
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
1203330 OS file reads, 2141172 OS file writes, 78570 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 5, seg size 7, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 276671, node heap has 52 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 117811837521
Log flushed up to 117811837521
Pages flushed up to 117811837521
Last checkpoint at 117811837521
0 pending log writes, 0 pending chkp writes
21279 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137363456; in additional pool allocated 0
Dictionary memory allocated 720503
Buffer pool size 8191
Free buffers 1024
Database pages 7115
Old database pages 2606
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 4741, not young 1045985243
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1202370, created 1138775, written 2068616
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 7115, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Main thread process no. 23063, id 3020409116416, state: sleeping
Number of rows inserted 113995729, updated 144489445, deleted 0, read 171054938
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
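The following is a long shot, because we know nothing about your hardware, InnoDB configuration, or query details, but I would bet that you are using the wrong tool for the job (the InnoDB engine).
What you are trying to do is create a very heavy index (up to 127 characters, which may take, as a rough approximation, 127*3 bytes per entry), built with the only method available in InnoDB: a B+tree. In addition, because rows are clustered around the primary key, the whole row is effectively in the index, and accessing the primary key means accessing the page that holds the full row content.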
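To put a number on the rough approximation above, here is a small worked calculation. It is only an upper bound under stated assumptions: a 3-byte-per-character encoding (such as MySQL's utf8) and every key at the full 127 characters. Real varchar values store only their actual length, and B+tree page overhead and the other two columns are ignored.

```python
# Worst-case size of the primary key values alone.
# Assumptions (not confirmed by the question): 3 bytes per character
# and every key using all 127 characters.
ROWS = 100_000_000          # "more than 100 million rows"
MAX_CHARS = 127             # varchar(127) primary key
BYTES_PER_CHAR = 3          # assumed worst case for a 3-byte charset

worst_case_key_bytes = MAX_CHARS * BYTES_PER_CHAR
worst_case_total_bytes = ROWS * worst_case_key_bytes
worst_case_total_gib = worst_case_total_bytes / 2**30

print(f"worst-case bytes per key: {worst_case_key_bytes}")
print(f"worst-case key data:      {worst_case_total_gib:.1f} GiB")
```

Even if the real keys are several times shorter than the maximum, the key data is likely to be measured in gigabytes.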
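In short, you have a unique index that contains the whole table, and it should more or less fit in memory (not necessarily all of it, but in this case your working set seems to be most of the rows). How large is your InnoDB buffer pool? What is your buffer pool hit rate? You can check both with SHOW ENGINE INNODB STATUS. I would bet that your buffer pool is too small, or even that you do not have enough physical memory to hold your working set. In either case, this may be forcing InnoDB to go to disk for every query. You may think that you should not need to cache everything for this to work well, and you would be right. But for your particular workload (a large PK), InnoDB is not the best engine. A hash index, available in other RDBMSs and other MySQL engines, could be smaller and faster, but InnoDB does not support it. Also, IN + list of values with a large number of rows may not be the optimal way to query (at the MySQL level), but it will certainly be faster than running the queries individually.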
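As a sketch of how to read the buffer pool size out of that output, the snippet below parses a few lines copied from the SHOW ENGINE INNODB STATUS dump posted in the question. The page counts are converted to MiB assuming the default 16 KiB InnoDB page size; that page size is an assumption, since the question does not show innodb_page_size. The helper names are made up for this example.

```python
import re

# Lines copied from the "BUFFER POOL AND MEMORY" section in the question.
STATUS_EXCERPT = """\
Total memory allocated 137363456; in additional pool allocated 0
Buffer pool size 8191
Free buffers 1024
Database pages 7115
"""

PAGE_SIZE = 16 * 1024  # assumption: default innodb_page_size (16 KiB)


def status_value(status: str, label: str) -> int:
    """Return the integer that follows `label` at the start of a line."""
    match = re.search(rf"^{re.escape(label)}\s+(\d+)", status, re.MULTILINE)
    if match is None:
        raise ValueError(f"{label!r} not found in status output")
    return int(match.group(1))


def pages_to_mib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count to MiB for the given page size."""
    return pages * page_size / 2**20


pool_mib = pages_to_mib(status_value(STATUS_EXCERPT, "Buffer pool size"))
data_mib = pages_to_mib(status_value(STATUS_EXCERPT, "Database pages"))

print(f"buffer pool: {pool_mib:.1f} MiB, holding data: {data_mib:.1f} MiB")
```

Under that assumption the posted output corresponds to a buffer pool of roughly 128 MiB, which is small next to a table of more than 100 million rows.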
Check with EXPLAIN that the query is using the range join type and is not performing a full table scan.
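One way to run that check from application code is to build the IN query with placeholders and prefix it with EXPLAIN. This is a minimal sketch: the table name my_table, the sample keys, and the helper build_in_query are hypothetical, and the %s placeholder style assumes a MySQL DB-API driver that uses the format paramstyle.

```python
def build_in_query(table: str, column: str, n_values: int) -> str:
    """Build a parameterized SELECT ... WHERE column IN (...) statement.

    Only placeholders are put into the SQL text; the values themselves
    are meant to be passed separately to the driver.
    """
    if n_values < 1:
        raise ValueError("need at least one value for the IN list")
    placeholders = ", ".join(["%s"] * n_values)
    return f"SELECT {column} FROM {table} WHERE {column} IN ({placeholders})"


# Hypothetical stand-ins for the 5,000 strings from the question.
keys = [f"key-{i}" for i in range(5000)]

query = build_in_query("my_table", "col1", len(keys))
explain_query = "EXPLAIN " + query

print(explain_query[:60] + " ...")
```

With a connected cursor, something like cursor.execute(explain_query, keys) followed by a look at the type column of the result would show whether MySQL reports range or a full scan (ALL).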