I have a table that looks like this:
CREATE TABLE tracks (id SERIAL, artists JSON);
INSERT INTO tracks (id, artists)
VALUES (1, '[{"name": "blink-182"}]');
INSERT INTO tracks (id, artists)
VALUES (2, '[{"name": "The Dirty Heads"}, {"name": "Louis Richards"}]');
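There are several other columns that are not relevant to this question. There is a reason they are stored as JSON.
What I am trying to do is look up tracks that have a specific artist name (exact match).
I am using this query: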
SELECT * FROM tracks
WHERE 'ARTIST NAME' IN
(SELECT value->>'name' FROM json_array_elements(artists))
For example:
SELECT * FROM tracks
WHERE 'The Dirty Heads' IN
(SELECT value->>'name' FROM json_array_elements(artists))
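However, this does a full table scan and is not very fast. I tried creating a GIN index on a function, names_as_array(artists), and querying with 'ARTIST NAME' = ANY names_as_array(artists), but the index is not used and the query is actually significantly slower.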
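For context, here is a minimal sketch of the function-based GIN index attempt described above. The body of `names_as_array` is not shown in the question, so the definition below is an assumption, as is the index name `tracks_artist_names_gin_idx`. GIN indexes on arrays serve the array operators (`@>`, `<@`, `&&`, `=`) rather than `= ANY (...)`, so the sketch writes the predicate with `@>`. Whether the planner then picks the index still depends on table size and statistics.

```sql
-- Hypothetical definition: the question does not show names_as_array.
-- It must be IMMUTABLE to be usable in an index expression.
CREATE OR REPLACE FUNCTION names_as_array(j json)
  RETURNS text[]
  LANGUAGE sql IMMUTABLE AS
$$
  SELECT array_agg(value->>'name')
  FROM json_array_elements(j)
$$;

-- Expression index over the extracted artist names (index name is illustrative).
CREATE INDEX tracks_artist_names_gin_idx
  ON tracks USING gin (names_as_array(artists));

-- Array containment is an operator the GIN array operator class can index.
SELECT *
FROM tracks
WHERE names_as_array(artists) @> ARRAY['The Dirty Heads'];
```

The expression in the WHERE clause has to match the indexed expression exactly, otherwise the index cannot be considered at all.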
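Using PostgreSQL 8.4.9, I have a strange problem with the performance of a query. The query selects a set of points within a 3D volume, using a LEFT OUTER JOIN to add a related ID column where that related ID exists. Small changes in the x range can cause PostgreSQL to choose a different query plan, which takes the execution time from 0.01 seconds to 50 seconds. This is the query in question: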
SELECT treenode.id AS id,
treenode.parent_id AS parentid,
(treenode.location).x AS x,
(treenode.location).y AS y,
(treenode.location).z AS z,
treenode.confidence AS confidence,
treenode.user_id AS user_id,
treenode.radius AS radius,
((treenode.location).z - 50) AS z_diff,
treenode_class_instance.class_instance_id AS skeleton_id
FROM treenode LEFT OUTER JOIN
(treenode_class_instance INNER JOIN
class_instance ON treenode_class_instance.class_instance_id
= class_instance.id
AND class_instance.class_id = 7828307)
ON (treenode_class_instance.treenode_id = treenode.id
AND treenode_class_instance.relation_id = 7828321)
WHERE treenode.project_id = 4
AND (treenode.location).x >= 8000
AND (treenode.location).x <= (8000 + 4736) …
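To see which plan the planner picks for a given x range, a reduced form of the query can be run under `EXPLAIN ANALYZE`. The sketch below is a simplification, not the full query: it drops the join and the select list and keeps only the filters shown above, and the second x range (starting at 8100) is an arbitrary value chosen for illustration. Running `ANALYZE` first refreshes the planner statistics so that both runs start from the same footing.

```sql
-- Refresh planner statistics for the table before comparing plans.
ANALYZE treenode;

-- Plan and timing for the x range used in the question.
EXPLAIN ANALYZE
SELECT treenode.id
FROM treenode
WHERE treenode.project_id = 4
  AND (treenode.location).x >= 8000
  AND (treenode.location).x <= (8000 + 4736);

-- Same query with a slightly shifted x range (8100 is illustrative only).
EXPLAIN ANALYZE
SELECT treenode.id
FROM treenode
WHERE treenode.project_id = 4
  AND (treenode.location).x >= 8100
  AND (treenode.location).x <= (8100 + 4736);
```

Comparing the estimated `rows=` figures with the actual ones in each output shows where the planner's estimate diverges between the two ranges.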
In my PostgreSQL 11.11 database I have a jsonb column that holds objects like this:
{
  "dynamicFields": [
    {
      "name": "200",
      "hidden": false,
      "subfields": [
        {
          "name": "a",
          "value": "Subfield a"
        },
        {
          "name": "b",
          "value": "Subfield b"
        }
      ]
    }
  ]
}
dynamicFields is an array and subfields is also an array. I run into performance problems when I run a select like this:
select *
from my_table a
cross join lateral jsonb_array_elements(jsonb_column -> 'dynamicFields') df
cross join lateral jsonb_array_elements(df -> 'subfields') sf
where df ->> 'name' = '200' and sf ->> 'name' = 'a'
The performance problem is mainly with subfields. I have already added an index like this:
CREATE INDEX idx_my_index ON my_table USING gin ((marc->'dynamicFields') jsonb_path_ops);
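How can I add an index for the subfields inside dynamicFields?
The query above is just an example; I use it a lot in joins with other tables in the database. And I also know …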
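One option that may be worth trying, sketched here under assumptions: a `jsonb_path_ops` GIN index supports the containment operator `@>`, and jsonb containment is evaluated recursively through nested arrays and objects. Adding a containment predicate on the indexed expression can therefore pre-filter rows before the lateral unnesting runs. The sketch assumes the jsonb column is the one the index above is defined on (`marc`; the query above calls it `jsonb_column`).

```sql
-- Sketch only: assumes the indexed column "marc" is the jsonb column being queried.
SELECT *
FROM my_table a
CROSS JOIN LATERAL jsonb_array_elements(a.marc -> 'dynamicFields') df
CROSS JOIN LATERAL jsonb_array_elements(df -> 'subfields') sf
-- Containment predicate on the indexed expression; matches rows where one
-- dynamicFields element has name "200" and a subfield with name "a".
WHERE a.marc -> 'dynamicFields' @> '[{"name": "200", "subfields": [{"name": "a"}]}]'
  -- The original filters still pick the matching elements after unnesting.
  AND df ->> 'name' = '200'
  AND sf ->> 'name' = 'a';
```

The containment predicate only narrows down candidate rows; the two `->>` filters are still needed to select the individual elements once the arrays are expanded.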
Here is a sample plan on explain.depesz.com:
Limit (cost=65301.950..65301.950 rows=1 width=219) (actual time=886.074..886.074 rows=0 loops=1)
  -> Sort (cost=65258.840..65301.950 rows=17243 width=219) (actual time=879.683..885.211 rows=17589 loops=1)
        Sort Key: juliet.romeo
        Sort Method: external merge Disk: 4664kB
        -> Hash Join (cost=30177.210..62214.980 rows=17243 width=219) (actual time=278.986..852.834 rows=17589 loops=1)
              Hash Cond: (whiskey_quebec.whiskey_five = juliet.quebec)
              -> Bitmap Heap Scan on whiskey_quebec (cost=326.060..21967.630 rows=17243 width=4) (actual time=7.494..65.956 rows=17589 loops=1)
                    Recheck Cond: (golf = 297)
                    -> Bitmap Index Scan on kilo (cost=0.000..321.750 rows=17243 width=0) (actual time=4.638..4.638 rows=17589 loops=1)
                          Index Cond: (golf = 297) …