Posts by St.*_*rio

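Understanding "bitmap heap scan" and "bitmap index scan"

I'll try to explain my misunderstanding with the following example.

I don't understand the basics of the Bitmap Heap Scan node. Consider the query SELECT customerid, username FROM customers WHERE customerid < 1000 AND username < 'user100'; whose plan is as follows: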

Bitmap Heap Scan on customers  (cost=25.76..61.62 rows=10 width=13) (actual time=0.077..0.077 rows=2 loops=1)
  Recheck Cond: (((username)::text < 'user100'::text) AND (customerid < 1000))
  ->  BitmapAnd  (cost=25.76..25.76 rows=10 width=0) (actual time=0.073..0.073 rows=0 loops=1)
        ->  Bitmap Index Scan on ix_cust_username  (cost=0.00..5.75 rows=200 width=0) (actual time=0.006..0.006 rows=2 loops=1)
              Index Cond: ((username)::text < 'user100'::text)
        ->  Bitmap Index Scan on customers_pkey  (cost=0.00..19.75 rows=1000 width=0) (actual …
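As a rough illustration of how the nodes in a plan like this cooperate, here is a toy Python model, not PostgreSQL's actual implementation. Each Bitmap Index Scan yields a set of tuple locations, BitmapAnd intersects the sets, and the Bitmap Heap Scan fetches the surviving rows in physical order. The heap layout, the sample rows, and the helper names `bitmap_index_scan` and `bitmap_heap_scan` are all assumptions made up for this sketch.

```python
# Toy model only: the "heap" is a list of pages, each page a list of rows.
# A tuple location (TID) is a (page number, slot number) pair.
heap = [
    [{"customerid": 1, "username": "user1"},
     {"customerid": 2, "username": "user2"}],
    [{"customerid": 1000, "username": "user0999"},
     {"customerid": 10, "username": "user10"}],
]


def bitmap_index_scan(pages, predicate):
    """Stand-in for a Bitmap Index Scan: return the set of matching TIDs.

    A real index locates these without reading the heap; this sketch
    simply walks the pages to keep the example short.
    """
    return {
        (page_no, slot_no)
        for page_no, page in enumerate(pages)
        for slot_no, row in enumerate(page)
        if predicate(row)
    }


def bitmap_heap_scan(pages, bitmap):
    """Fetch rows in (page, slot) order so each page is visited once."""
    return [pages[page_no][slot_no] for page_no, slot_no in sorted(bitmap)]


by_username = bitmap_index_scan(heap, lambda r: r["username"] < "user100")
by_id = bitmap_index_scan(heap, lambda r: r["customerid"] < 1000)

combined = by_username & by_id           # the BitmapAnd step
result = bitmap_heap_scan(heap, combined)
print([row["customerid"] for row in result])
```

In PostgreSQL itself the bitmap can degrade to page granularity when it outgrows work_mem, which is why the heap scan carries a Recheck Cond; the sketch above always keeps exact tuple locations and so skips that step.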

postgresql index

St.*_*rio

2015 10-28

61
votes

2
answers

30k
views

Hash Join vs. Hash Semi Join

PostgreSQL 9.2

I'm trying to understand the difference between Hash Semi Join and a plain Hash Join.

Here are two queries:

I

EXPLAIN ANALYZE SELECT * FROM orders WHERE customerid IN (SELECT
customerid FROM customers WHERE state='MD');

Hash Semi Join  (cost=740.34..994.61 rows=249 width=30) (actual time=2.684..4.520 rows=120 loops=1)
  Hash Cond: (orders.customerid = customers.customerid)
  ->  Seq Scan on orders  (cost=0.00..220.00 rows=12000 width=30) (actual time=0.004..0.743 rows=12000 loops=1)
  ->  Hash  (cost=738.00..738.00 rows=187 width=4) (actual time=2.664..2.664 rows=187 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 7kB
        ->  Seq Scan on customers  (cost=0.00..738.00 rows=187 width=4) (actual …
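The semantic difference between the two node types can be sketched with a toy Python model; this is an illustration under made-up data, not how the executor is written. A hash join emits one output row per matching pair, while a hash semi join emits each outer row at most once, however many inner rows match. The functions `hash_join` and `hash_semi_join` and the sample rows are hypothetical, and the inner side deliberately contains a duplicate key to make the difference visible (in the query above, customerid is presumably unique in customers).

```python
def hash_join(outer, inner, key):
    """Inner hash join: build a hash table on `inner`, probe with `outer`,
    and emit every (outer, inner) pair whose keys are equal."""
    table = {}
    for row in inner:
        table.setdefault(row[key], []).append(row)
    return [(o, i) for o in outer for i in table.get(o[key], [])]


def hash_semi_join(outer, inner, key):
    """Semi join: emit an outer row once if at least one inner row matches.
    Only outer columns appear in the output."""
    keys = {row[key] for row in inner}
    return [o for o in outer if o[key] in keys]


orders = [
    {"orderid": 1, "customerid": 7},
    {"orderid": 2, "customerid": 8},
]
# Hypothetical inner side with a duplicated key.
matches = [{"customerid": 7}, {"customerid": 7}]

joined = hash_join(orders, matches, "customerid")
semi = hash_semi_join(orders, matches, "customerid")
print(len(joined), len(semi))
```

This is the behaviour an IN (subquery) predicate needs: the presence of a match matters, not the number of matches.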

postgresql join hashing

St.*_*rio

lucky-day

9
votes

1
answer

3355
views

Fastest query for distinct IDs in a many-to-many relationship

I have this table in PostgreSQL 9.4:

CREATE TABLE user_operation (
    id SERIAL PRIMARY KEY,
    operation_id integer,
    user_id integer
);

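The table contains ~1000-2000 distinct operations, and each operation corresponds to some subset (of approximately 80000-120000 elements each) of the set S of all users: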

S = {1, 2, 3, ... , 122655}

Parameters:

work_mem = 128MB
table_size = 880MB

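I also have an index on operation_id.

Question: what is the best plan for querying all distinct user_id values for a significant part of the operation_id set (20%-60%), for example: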

SELECT DISTINCT user_id FROM user_operation WHERE operation_id < 500

More indexes can be created on the table. Currently, the plan for the query is:

HashAggregate  (cost=196173.56..196347.14 rows=17358 width=4) (actual time=1227.408..1359.947 rows=598336 loops=1)
  ->  Bitmap Heap Scan on user_operation  (cost=46392.24..189978.17 rows=2478155 width=4) (actual time=233.163..611.182 rows=2518122 loops=1)
        Recheck Cond: …

postgresql performance count distinct postgresql-performance

St.*_*rio

2020 01-08

6
votes

1
answer

1293
views