我有一个名为PostgreSQL的表queries_query,它有很多列.
其中两个列created和user_sid我的应用程序经常在SQL查询中一起使用,以确定给定用户在过去30天内完成了多少查询.在最近30天之前的任何时间查询这些统计数据是非常非常罕见的.
这是我的问题:
我目前通过运行以下方法在这两列上创建了我的多列索引:
CREATE INDEX CONCURRENTLY some_index_name ON queries_query (user_sid, created)
Run Code Online (Sandbox Code Playgroud)
但我想进一步限制索引只关心创建日期在过去30天内的查询.我尝试过以下方法:
CREATE INDEX CONCURRENTLY some_index_name ON queries_query (user_sid, created)
WHERE created >= NOW() - '30 days'::INTERVAL`
Run Code Online (Sandbox Code Playgroud)
但这引发了一个异常,说明我的函数必须是不可变的.
我很乐意让这个工作,以便我可以优化我的索引,并削减Postgres需要执行这些重复查询的资源.
我有一个vote_pairs看起来像这样的视图:
CREATE VIEW vote_pairs AS
SELECT
v1.name as name1,
v2.name as name2,
...
FROM votes AS v1
JOIN votes AS v2
ON v1.topic_id = v2.topic_id;
Run Code Online (Sandbox Code Playgroud)
并且,如果votes表中有大约100k行,则跨此视图的查询大约需要3秒钟才能执行.
但是,当我在名称上添加额外的过滤器时:
… ON v1.topic_id = v2.topic_id AND v1.name < v2.name;
Run Code Online (Sandbox Code Playgroud)
运行时间翻了四倍,查询vote_pairs完成时间大约需要12秒.
无论限制的位置如何,此运行时都是一致的...例如,如果将过滤器移动到WHERE外部查询的子句,则查询同样很慢:
SELECT * FROM vote_pairs WHERE name1 < name2;
Run Code Online (Sandbox Code Playgroud)
这是怎么回事?Postgres的词典比较速度慢吗?这是别的吗?我怎么能提高这个查询的速度?
投票表:
CREATE TABLE votes (
topic_id INTEGER REFERENCES topics(id),
name VARCHAR(64),
vote VARCHAR(12)
)
CREATE INDEX votes_topic_name ON votes (topic_id, name);
CREATE INDEX votes_name ON …Run Code Online (Sandbox Code Playgroud) 您好任何想法如何加快这个查询?
输入
EXPLAIN SELECT entityid FROM entity e
LEFT JOIN level1entity l1 ON l1.level1id = e.level1_level1id
LEFT JOIN level2entity l2 ON l2.level2id = l1.level2_level2id
WHERE
l2.userid = 'a987c246-65e5-48f6-9d2d-a7bcb6284c8f'
AND
(entityid NOT IN
(1377776,1377792,1377793,1377794,1377795,1377796... 50000 ids)
)
Run Code Online (Sandbox Code Playgroud)
产量
Nested Loop (cost=0.00..1452373.79 rows=3865 width=8)
-> Nested Loop (cost=0.00..8.58 rows=1 width=8)
Join Filter: (l1.level2_level2id = l2.level2id)
-> Seq Scan on level2entity l2 (cost=0.00..3.17 rows=1 width=8)
Filter: ((userid)::text = 'a987c246-65e5-48f6-9d2d-a7bcb6284c8f'::text)
-> Seq Scan on level1entity l1 (cost=0.00..4.07 rows=107 width=16)
-> Index Scan using fk_fk18edb1cfb2a41235_idx …Run Code Online (Sandbox Code Playgroud) 我正在尝试获取表中最近的行.我有一个简单的时间戳created_at索引.当我查询时ORDER BY created_at DESC LIMIT 1,它需要的远远超过我的想象(我的机器上36k行约50ms).
EXPLAIN -ing声称它使用向后索引扫描,但我确认更改索引(created_at DESC)不会改变查询规划器中的简单索引扫描的成本.
如何优化此用例?
运行postgresql 9.2.4.
# EXPLAIN SELECT * FROM articles ORDER BY created_at DESC LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.58 rows=1 width=1752)
-> Index Scan Backward using index_articles_on_created_at on articles (cost=0.00..20667.37 rows=35696 width=1752)
(2 rows)
Run Code Online (Sandbox Code Playgroud) 我有一个问题:
EXPLAIN ANALYZE
SELECT CAST(DATE(associationtime) AS text) AS date ,
cast(SUM(extract(epoch
FROM disassociationtime) - extract(epoch
FROM associationtime)) AS bigint) AS sessionduration,
cast(SUM(tx) AS bigint)AS tx,
cast(SUM(rx) AS bigint) AS rx,
cast(SUM(dataRetries) AS bigint) AS DATA,
cast(SUM(rtsRetries) AS bigint) AS rts,
count(*)
FROM SESSION
WHERE ssid_id=42
AND ap_id=1731
AND DATE(associationtime)>=DATE('Tue Nov 04 00:00:00 MSK 2014')
AND DATE(associationtime)<=DATE('Thu Nov 20 00:00:00 MSK 2014')
GROUP BY(DATE(associationtime))
ORDER BY DATE(associationtime);
Run Code Online (Sandbox Code Playgroud)
输出是:
GroupAggregate (cost=0.44..17710.66 rows=1 width=32) (actual time=4.501..78.880 rows=17 loops=1)
-> Index Scan using session_lim_values_idx …Run Code Online (Sandbox Code Playgroud) postgresql indexing sql-execution-plan postgresql-performance
I have a pretty simple table
CREATE TABLE approved_posts (
project_id INTEGER,
feed_id INTEGER,
post_id INTEGER,
approved_time TIMESTAMP NOT NULL,
post_time TIMESTAMP NOT NULL,
PRIMARY KEY (project_id, feed_id, post_id)
)
Run Code Online (Sandbox Code Playgroud)
And I'm trying to optimize this query:
SELECT *
FROM approved_posts
WHERE feed_id IN (?, ?, ?)
AND project_id = ?
ORDER BY approved_time DESC, post_time DESC
LIMIT 1;
Run Code Online (Sandbox Code Playgroud)
The query optimizer is fetching every single approved_post that matches the predicate, sorting all 100k results, and returning the top one …
我有点困惑,需要一些建议。我用PostgreSQL 11数据库。我有这么简单的sql语句:
SELECT DISTINCT "CITY", "AREA", "REGION"
FROM youtube
WHERE
"CITY" IS NOT NULL
AND
"AREA" IS NOT NULL
AND
"REGION" IS NOT NULL
Run Code Online (Sandbox Code Playgroud)
youtube我在 sql 语句中使用的表有 2500 万条记录。我认为这就是为什么查询需要 15-17 秒才能完成。对于我使用该查询的 Web 项目,它太长了。我正在尝试加快请求。
我为 youtube 表创建了这样的索引:
CREATE INDEX youtube_location_idx ON public.youtube USING btree ("CITY", "AREA", "REGION");
Run Code Online (Sandbox Code Playgroud)
在这一步之后,我再次运行查询,但需要相同的时间才能完成。似乎查询不使用索引。我如何知道查询中是否使用了任何索引?
我正在从 Postgres 表中聚合数据,查询大约需要 2 秒,我想将其减少到不到一秒。
请在下面找到执行细节:
询问
select
a.search_keyword,
hll_cardinality( hll_union_agg(a.users) ):: int as user_count,
hll_cardinality( hll_union_agg(a.sessions) ):: int as session_count,
sum(a.total) as keyword_count
from
rollup_day a
where
a.created_date between '2018-09-01' and '2019-09-30'
and a.tenant_id = '62850a62-19ac-477d-9cd7-837f3d716885'
group by
a.search_keyword
order by
session_count desc
limit 100;
Run Code Online (Sandbox Code Playgroud)
表元数据
查询计划
Custom Scan (cost=0.00..0.00 rows=0 width=0) (actual time=1722.685..1722.694 rows=100 loops=1)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=5454 dbname=postgres
-> Limit (cost=64250.24..64250.49 …Run Code Online (Sandbox Code Playgroud) sql postgresql indexing query-performance postgresql-performance
我有一个语句正在将一堆行(这个问题的内容或位置并不重要)插入到 Postgres 数据库中,但它没有我想要的那么快。我可以运行一个解释查询来查看它在做什么,我得到如下结果:
Insert on dpdb.datapoints (cost=0.00..6917.76 rows=44184 width=1786) (actual time=15558.623..15558.623 rows=0 loops=1)
Buffers: shared hit=34670391 read=98370 dirtied=48658 written=39875
I/O Timings: read=704.525 write=242.915
-> Seq Scan on public.fred (cost=0.00..6917.76 rows=44184 width=1786) (actual time=0.018..197.853 rows=44184 loops=1)
Output: nextval('datapoints_id_seq'::regclass), fred.company_id, fred.tag, ... lots more columns ...
Buffers: shared hit=44186 read=6253 dirtied=1
I/O Timings: read=29.176
Planning time: 0.110 ms
Trigger RI_ConstraintTrigger_c_14845718 for constraint datapoints_tag_source_fkey: time=236.677 calls=44184
Trigger RI_ConstraintTrigger_c_14845723 for constraint datapoints_sheet_type_fkey: time=536.367 calls=44184
Trigger RI_ConstraintTrigger_c_14845728 for constraint datapoints_subcontext_fkey: time=178.200 calls=44184
Trigger RI_ConstraintTrigger_c_14845733 for …Run Code Online (Sandbox Code Playgroud) 我可以采取任何其他步骤来加快查询执行速度吗?
\n我有一个超过 100m 行的表,我需要搜索匹配的字符串。为此,我检查了两个选项:
\n@@(to_tsquery 或 plainto_tsquery)进行比较我发现我可以使用索引来提高性能。\n对于我的 GiST 索引,我尝试siglen从小数字增加到 2024,但由于某种原因 Postgres 使用512而不是更高。
CREATE INDEX trgm_idx_512_gg ON table USING GIST (name gist_trgm_ops(siglen=512));\nRun Code Online (Sandbox Code Playgroud)\n询问:
\nSELECT name, similarity(name, '\xd0\xbd\xd0\xbe\xd1\x83\xd1\x82\xd0\xb1\xd1\x83\xd0\xba MSI GF63 Thin 10SC 086XKR 9S7 16R512 086') as sm\nFROM table\nWHERE name % '\xd0\xbd\xd0\xbe\xd1\x83\xd1\x82\xd0\xb1\xd1\x83\xd0\xba MSI GF63 Thin 10SC 086XKR 9S7 16R512 086' \nRun Code Online (Sandbox Code Playgroud)\nEXPLAIN输出:
Bitmap Heap Scan on …Run Code Online (Sandbox Code Playgroud) postgresql ×10
indexing ×5
sql ×4
performance ×1
pg-trgm ×1
sorting ×1
sql-order-by ×1
timestamp ×1