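I will try to explain my misunderstanding with the following example.
I don't understand the fundamentals of the Bitmap Heap Scan node.
Consider the query SELECT customerid, username FROM customers WHERE customerid < 1000 AND username < 'user100';
whose plan is as follows: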
Bitmap Heap Scan on customers (cost=25.76..61.62 rows=10 width=13) (actual time=0.077..0.077 rows=2 loops=1)
Recheck Cond: (((username)::text < 'user100'::text) AND (customerid < 1000))
-> BitmapAnd (cost=25.76..25.76 rows=10 width=0) (actual time=0.073..0.073 rows=0 loops=1)
-> Bitmap Index Scan on ix_cust_username (cost=0.00..5.75 rows=200 width=0) (actual time=0.006..0.006 rows=2 loops=1)
Index Cond: ((username)::text < 'user100'::text)
-> Bitmap Index Scan on customers_pkey (cost=0.00..19.75 rows=1000 width=0) (actual …
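One way to read the plan above is as set operations on tuple identifiers. The sketch below is only a conceptual model, not PostgreSQL's implementation: the sample rows, the `heap` dictionary, and the helper names are all hypothetical, and Python's string comparison is assumed to stand in for the database collation.

```python
# Conceptual model only: real PostgreSQL bitmaps are per-page structures,
# not Python sets. All data and names here are hypothetical.

# tuple id -> (customerid, username)
heap = {
    0: (5, "user5"),
    1: (10, "user10"),
    2: (1, "user1"),
    3: (2000, "user1000"),
    4: (999, "user999"),
}


def bitmap_index_scan(predicate):
    """Stand-in for a Bitmap Index Scan: collect ids of matching tuples."""
    return {tid for tid, row in heap.items() if predicate(row)}


# One bitmap per index condition, built without fetching result rows.
by_username = bitmap_index_scan(lambda row: row[1] < "user100")
by_customerid = bitmap_index_scan(lambda row: row[0] < 1000)

# BitmapAnd: intersect the two bitmaps. It emits a bitmap, not rows.
candidates = by_username & by_customerid


def recheck(row):
    """The Recheck Cond from the plan, applied to a fetched row."""
    return row[1] < "user100" and row[0] < 1000


# Bitmap Heap Scan: visit candidate tuples in physical order and recheck.
# PostgreSQL only needs the recheck where the bitmap became lossy;
# this sketch always applies it for simplicity.
result = [heap[tid] for tid in sorted(candidates) if recheck(heap[tid])]
print(result)
```

In this model the heap is touched only once, after the bitmaps have been combined, which is the point of the BitmapAnd step.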
PostgreSQL 9.2
I'm trying to understand the difference between Hash Semi Join and a plain Hash Join.
Here are two queries:
I
EXPLAIN ANALYZE SELECT * FROM orders WHERE customerid IN (SELECT
customerid FROM customers WHERE state='MD');
Hash Semi Join (cost=740.34..994.61 rows=249 width=30) (actual time=2.684..4.520 rows=120 loops=1)
Hash Cond: (orders.customerid = customers.customerid)
-> Seq Scan on orders (cost=0.00..220.00 rows=12000 width=30) (actual time=0.004..0.743 rows=12000 loops=1)
-> Hash (cost=738.00..738.00 rows=187 width=4) (actual time=2.664..2.664 rows=187 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 7kB
-> Seq Scan on customers (cost=0.00..738.00 rows=187 width=4) (actual …
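The difference between the two node types can be sketched on toy data. This is a conceptual model under stated assumptions, not PostgreSQL's executor: the rows and function names are hypothetical, and the inner side deliberately contains a duplicate key (which could not happen if customerid is unique in customers) so that the two results differ.

```python
from collections import Counter

# Hypothetical outer rows: (orderid, customerid).
orders = [(1, 10), (2, 10), (3, 20), (4, 30)]
# Hypothetical inner keys; 30 is duplicated on purpose.
inner_customer_ids = [10, 30, 30]


def hash_join(outer, inner_keys):
    """Inner hash join: emit an outer row once per matching inner row."""
    table = Counter(inner_keys)  # build phase: key -> number of inner rows
    out = []
    for row in outer:            # probe phase
        out.extend([row] * table[row[1]])
    return out


def hash_semi_join(outer, inner_keys):
    """Semi join: emit an outer row at most once if any inner row matches."""
    table = Counter(inner_keys)
    return [row for row in outer if row[1] in table]


print(hash_join(orders, inner_customer_ids))
print(hash_semi_join(orders, inner_customer_ids))
```

With unique inner keys both functions return the same rows; the semi join simply stops at the first match, which is what an IN (subquery) needs.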
I have this table in PostgreSQL 9.4:
CREATE TABLE user_operation(
    id SERIAL PRIMARY KEY,
    operation_id integer,
    user_id integer
);
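The table consists of ~1000-2000 different operations, each of which corresponds to some subset (each consisting of approximately 80000-120000 elements) of the set S of all users: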
S = {1, 2, 3, ... , 122655}
Parameters:
work_mem = 128MB
table_size = 880MB
I also have an index on operation_id.
Question: What is the best plan for querying all distinct user_id values for a significant part of the operation_id set (20%-60%), for example:
SELECT DISTINCT user_id FROM user_operation WHERE operation_id < 500
It is possible to create more indexes on the table. Currently, the plan of the query is:
HashAggregate (cost=196173.56..196347.14 rows=17358 width=4) (actual time=1227.408..1359.947 rows=598336 loops=1)
-> Bitmap Heap Scan on user_operation (cost=46392.24..189978.17 rows=2478155 width=4) (actual time=233.163..611.182 rows=2518122 loops=1)
Recheck Cond: …
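As a rough mental model of the plan above, the HashAggregate node deduplicates the user_id values that the scan below it produces. The sketch is conceptual only; the sample rows and the function name are hypothetical and do not reflect the real table contents.

```python
def distinct_user_ids(rows, max_operation_id):
    """Model of HashAggregate over a filtered scan: keep each user_id once."""
    seen = set()  # plays the role of the in-memory hash table
    for operation_id, user_id in rows:
        if operation_id < max_operation_id:  # the scan's filter condition
            seen.add(user_id)
    return seen


# Hypothetical (operation_id, user_id) rows.
sample_rows = [(1, 7), (2, 7), (3, 8), (600, 9), (499, 10)]
print(sorted(distinct_user_ids(sample_rows, 500)))
```

Every qualifying row still has to be read once in this model, so the cost is driven by how many rows match the operation_id condition rather than by the number of distinct users.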
postgresql performance count distinct postgresql-performance