Posts by St.*_*rio

Understanding "Bitmap Heap Scan" and "Bitmap Index Scan"

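I'll try to explain my misunderstanding with the following example.

I don't understand the basics of the Bitmap Heap Scan node. Consider the query SELECT customerid, username FROM customers WHERE customerid < 1000 AND username < 'user100'; whose plan is as follows: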

Bitmap Heap Scan on customers  (cost=25.76..61.62 rows=10 width=13) (actual time=0.077..0.077 rows=2 loops=1)
  Recheck Cond: (((username)::text < 'user100'::text) AND (customerid < 1000))
  ->  BitmapAnd  (cost=25.76..25.76 rows=10 width=0) (actual time=0.073..0.073 rows=0 loops=1)
        ->  Bitmap Index Scan on ix_cust_username  (cost=0.00..5.75 rows=200 width=0) (actual time=0.006..0.006 rows=2 loops=1)
              Index Cond: ((username)::text < 'user100'::text)
        ->  Bitmap Index Scan on customers_pkey  (cost=0.00..19.75 rows=1000 width=0) (actual …
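The plan shape above (two Bitmap Index Scans feeding a BitmapAnd, then a Bitmap Heap Scan) can be sketched with a toy model. This is only an illustration under stated assumptions, not PostgreSQL's implementation: the heap layout, the sample rows, and the function names `bitmap_index_scan` and `bitmap_heap_scan` are all hypothetical, and a Python set of (page, slot) pairs stands in for the real bitmap.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[int, str]        # (customerid, username)
TID = Tuple[int, int]        # (page number, slot within the page)
Heap = Dict[int, List[Row]]  # toy heap: page number -> rows stored on that page


def bitmap_index_scan(heap: Heap, matches: Callable[[Row], bool]) -> Set[TID]:
    """Stand-in for a Bitmap Index Scan: collect the locations of matching tuples.

    A real index finds these locations without reading the heap; the toy
    version scans the heap only to keep the example short.
    """
    return {(page, slot)
            for page, rows in heap.items()
            for slot, row in enumerate(rows)
            if matches(row)}


def bitmap_heap_scan(heap: Heap, bitmap: Set[TID],
                     recheck: Callable[[Row], bool]) -> List[Row]:
    """Stand-in for a Bitmap Heap Scan: fetch tuples in physical (page, slot) order.

    The toy version rechecks every row. PostgreSQL only needs the recheck
    when the bitmap has become lossy, i.e. tracks whole pages instead of tuples.
    """
    out = []
    for page, slot in sorted(bitmap):
        row = heap[page][slot]
        if recheck(row):
            out.append(row)
    return out


# Hypothetical sample data: three heap pages with two rows each.
heap: Heap = {
    0: [(1, "user1"), (5, "user5")],
    1: [(1500, "user1500"), (2000, "user0")],
    2: [(10, "user10"), (999, "user999")],
}

# Python compares strings by code point; a real database uses its collation.
by_username = bitmap_index_scan(heap, lambda r: r[1] < "user100")
by_id = bitmap_index_scan(heap, lambda r: r[0] < 1000)

# BitmapAnd: keep only the locations present in both bitmaps.
result = bitmap_heap_scan(heap, by_username & by_id,
                          lambda r: r[1] < "user100" and r[0] < 1000)
print(result)
```

In this model the index scans return no rows themselves, only locations. The heap is touched once, after the bitmaps have been combined.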

postgresql index

Score: 61 | Answers: 2 | Views: 30k

Hash Join vs. Hash Semi Join

PostgreSQL 9.2

I'm trying to understand the difference between Hash Semi Join and just Hash Join.

Here are two queries:

I

EXPLAIN ANALYZE SELECT * FROM orders WHERE customerid IN (SELECT
customerid FROM customers WHERE state='MD');

Hash Semi Join  (cost=740.34..994.61 rows=249 width=30) (actual time=2.684..4.520 rows=120 loops=1)
  Hash Cond: (orders.customerid = customers.customerid)
  ->  Seq Scan on orders  (cost=0.00..220.00 rows=12000 width=30) (actual time=0.004..0.743 rows=12000 loops=1)
  ->  Hash  (cost=738.00..738.00 rows=187 width=4) (actual time=2.664..2.664 rows=187 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 7kB
        ->  Seq Scan on customers  (cost=0.00..738.00 rows=187 width=4) (actual …
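As a rough illustration of the distinction being asked about, the sketch below models both operators over in-memory rows. It is a toy under stated assumptions, not PostgreSQL's executor: the function names and sample rows are hypothetical, and the inner side deliberately contains a duplicate key (which a primary-key column would not) so that the two results differ.

```python
from collections import defaultdict


def build_hash(inner, key):
    """Build phase shared by both joins: hash the inner rows on the join key."""
    table = defaultdict(list)
    for row in inner:
        table[row[key]].append(row)
    return table


def hash_join(outer, inner, key):
    """Inner hash join: emit one output row per matching (outer, inner) pair."""
    table = build_hash(inner, key)
    return [(o, i) for o in outer for i in table.get(o[key], [])]


def hash_semi_join(outer, inner, key):
    """Hash semi join: emit each outer row at most once if any inner row matches."""
    table = build_hash(inner, key)
    return [o for o in outer if o[key] in table]


# Hypothetical sample rows; customerid 1 appears twice on the inner side.
orders = [
    {"orderid": 10, "customerid": 1},
    {"orderid": 11, "customerid": 3},
    {"orderid": 12, "customerid": 2},
]
inner_rows = [{"customerid": 1}, {"customerid": 1}, {"customerid": 2}]

joined = hash_join(orders, inner_rows, "customerid")
semi = hash_semi_join(orders, inner_rows, "customerid")
print(len(joined), len(semi))
```

The semi join matches the meaning of `IN (SELECT ...)`: an order either qualifies or it does not, however many inner rows match it, and no inner columns are returned.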

postgresql join hashing

Score: 9 | Answers: 1 | Views: 3355

Fastest query for distinct IDs in a many-to-many relationship

I have this table in PostgreSQL 9.4:

CREATE TABLE user_operation(
    id SERIAL PRIMARY KEY, 
    operation_id integer, 
    user_id integer )

The table holds ~1000-2000 different operations, each of which corresponds to some subset (of approximately 80000-120000 elements each) of the set S of all users:

S = {1, 2, 3, ... , 122655}

Parameters:

work_mem = 128MB
table_size = 880MB

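I also have an index on operation_id.

Question: what is the best plan for querying all distinct user_id values for a significant part (20%-60%) of the operation_id set, for example: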

SELECT DISTINCT user_id FROM user_operation WHERE operation_id < 500

More indexes can be created on the table. Currently, the plan for the query is:

HashAggregate  (cost=196173.56..196347.14 rows=17358 width=4) (actual time=1227.408..1359.947 rows=598336 loops=1)
  ->  Bitmap Heap Scan on user_operation  (cost=46392.24..189978.17 rows=2478155 width=4) (actual time=233.163..611.182 rows=2518122 loops=1)
        Recheck Cond: …
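To make the shape of this plan concrete, the sketch below models a scan feeding a HashAggregate that implements DISTINCT. It is a toy under stated assumptions rather than PostgreSQL's executor: the function names and the sample (operation_id, user_id) rows are hypothetical, and a Python set stands in for the hash table.

```python
from typing import Iterable, Iterator, Set, Tuple

OpRow = Tuple[int, int]  # (operation_id, user_id)


def scan(rows: Iterable[OpRow], max_operation_id: int) -> Iterator[int]:
    """Stand-in for the scan node: yield user_id for rows passing the filter."""
    for operation_id, user_id in rows:
        if operation_id < max_operation_id:
            yield user_id


def hash_aggregate_distinct(user_ids: Iterable[int]) -> Set[int]:
    """Stand-in for HashAggregate: one pass over the input, one entry per distinct key."""
    seen: Set[int] = set()
    for user_id in user_ids:
        seen.add(user_id)  # duplicates hit an existing entry and add nothing
    return seen


# Hypothetical sample rows: user 1 takes part in several operations.
rows = [(1, 1), (1, 2), (2, 1), (499, 3), (500, 4), (700, 1)]

distinct_users = hash_aggregate_distinct(scan(rows, 500))
print(sorted(distinct_users))
```

In this model every qualifying row is read once, while memory grows only with the number of distinct user_id values.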

postgresql performance count distinct postgresql-performance

Score: 6 | Answers: 1 | Views: 1293