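I have a query whose run time differs enormously between two datasets. For one (database A) it returns in a few seconds; for the other (database B)... I haven't waited long enough to find out, but it takes more than 10 minutes. I have dumped both databases to my local machine, where I can reproduce the problem on MySQL 5.1.37.
Oddly, database B is smaller than database A.
A stripped-down version of the query that reproduces the problem is: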
SELECT * FROM po_shipment ps
JOIN po_shipment_item psi USING (ship_id)
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0'
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1'
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2'
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' …
I have a query:
SELECT foo FROM bar WHERE some_column = ?
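Can I get an explain plan from MySQL without filling in a value for the parameter?
Just asking for clarification on the difference between the two. As I understand it, EXPLAIN PLAN gives you the theoretical execution plan, whereas DBMS_XPLAN.DISPLAY_CURSOR gives you the actual execution plan together with execution statistics for the statement.
EXPLAIN PLAN stores its data in PLAN_TABLE, whereas DBMS_XPLAN gets its information from the V$SQL_PLAN, V$SQL_PLAN_STATISTICS and V$SQL_PLAN_STATISTICS_ALL views.
However, for DISPLAY_CURSOR to collect actual runtime statistics for the statement, the /*+ gather_plan_statistics */ hint must be set. Otherwise only V$SQL_PLAN is populated, which gives you the execution plan but not the actual execution statistics. Only with /*+ gather_plan_statistics */ is V$SQL_PLAN_STATISTICS populated.
So my question is: if I do not use the gather_plan_statistics hint, will EXPLAIN PLAN and DISPLAY_CURSOR always give me the same execution plan (for the same statement)?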
I am using Microsoft SQL Server 2008 (SP1, x64). I have two queries that are identical, or so I think, yet they have completely different query plans and performance.
Query 1:
SELECT c_pk
FROM table_c
WHERE c_b_id IN (SELECT b_id FROM table_b WHERE b_z = 1)
OR c_a_id IN (SELECT a_id FROM table_a WHERE a_z = 1)
Query 2:
SELECT c_pk
FROM table_c
LEFT JOIN (SELECT b_id FROM table_b WHERE b_z = 1) AS b ON c_b_id = b_id
LEFT JOIN (SELECT a_id FROM table_a WHERE a_z = 1) AS a ON c_a_id = a_id
WHERE b_id IS NOT NULL
OR a_id IS NOT NULL
Query 1 is faster than I expected, whereas Query 2 is very slow. The query plan …
sql-server sql-server-2008 database-performance sql-execution-plan
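One way to check whether the two formulations above are logically interchangeable, independent of how SQL Server plans them, is to run both against a small sample and compare the results. The sketch below does that in Python with an in-memory SQLite database. The sample rows, and the assumption that `a_id` and `b_id` are primary keys, are invented for illustration and are not taken from the question. It says nothing about SQL Server's plan choice.

```python
import sqlite3

# Hypothetical sample data; the schema is guessed from the column names in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (a_id INTEGER PRIMARY KEY, a_z INTEGER);
CREATE TABLE table_b (b_id INTEGER PRIMARY KEY, b_z INTEGER);
CREATE TABLE table_c (c_pk INTEGER PRIMARY KEY, c_a_id INTEGER, c_b_id INTEGER);
INSERT INTO table_a VALUES (1, 1), (2, 0);
INSERT INTO table_b VALUES (1, 1), (2, 0);
INSERT INTO table_c VALUES
    (10, 1, 2), (11, 2, 1), (12, 2, 2), (13, 1, 1), (14, NULL, NULL);
""")

# Query 1 from the question: two IN subqueries combined with OR.
QUERY_1 = """
SELECT c_pk
FROM table_c
WHERE c_b_id IN (SELECT b_id FROM table_b WHERE b_z = 1)
   OR c_a_id IN (SELECT a_id FROM table_a WHERE a_z = 1)
"""

# Query 2 from the question: LEFT JOINs to derived tables plus IS NOT NULL checks.
QUERY_2 = """
SELECT c_pk
FROM table_c
LEFT JOIN (SELECT b_id FROM table_b WHERE b_z = 1) AS b ON c_b_id = b_id
LEFT JOIN (SELECT a_id FROM table_a WHERE a_z = 1) AS a ON c_a_id = a_id
WHERE b_id IS NOT NULL
   OR a_id IS NOT NULL
"""

# Sort both result sets so the comparison does not depend on row order.
rows_1 = sorted(conn.execute(QUERY_1).fetchall())
rows_2 = sorted(conn.execute(QUERY_2).fetchall())
print(rows_1 == rows_2, rows_1)
```

If `b_id` or `a_id` were not unique, the LEFT JOIN form could return the same `c_pk` more than once while the IN form would not, so the two queries are only equivalent under that uniqueness assumption.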
I have the following query:
SELECT translation.id
FROM "TRANSLATION" translation
INNER JOIN "UNIT" unit
ON translation.fk_id_unit = unit.id
INNER JOIN "DOCUMENT" document
ON unit.fk_id_document = document.id
WHERE document.fk_id_job = 3665
ORDER BY translation.id asc
LIMIT 50
It takes a terrible 110 seconds to run.
Table sizes:
+----------------+-------------+
| Table | Records |
+----------------+-------------+
| TRANSLATION | 6,906,679 |
| UNIT | 6,906,679 |
| DOCUMENT | 42,321 |
+----------------+-------------+
However, when I change the LIMIT parameter from 50 to 1000, the query finishes in 2 seconds.
Here is the query plan for the slow query:
Limit (cost=0.00..146071.52 rows=50 width=8) (actual time=111916.180..111917.626 rows=50 loops=1)
-> Nested Loop (cost=0.00..50748166.14 rows=17371 width=8) …
I have this PostgreSQL 9.4 query that runs very fast (~12 ms):
SELECT
auth_web_events.id,
auth_web_events.time_stamp,
auth_web_events.description,
auth_web_events.origin,
auth_user.email,
customers.name,
auth_web_events.client_ip
FROM
public.auth_web_events,
public.auth_user,
public.customers
WHERE
auth_web_events.user_id_fk = auth_user.id AND
auth_user.customer_id_fk = customers.id AND
auth_web_events.user_id_fk = 2
ORDER BY
auth_web_events.id DESC;
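However, if I embed it in a function, the query runs very slowly over all the data; it seems to go through every record. What am I missing? I have ~1M rows, and I want to simplify my database layer by storing the large queries in functions and views.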
CREATE OR REPLACE FUNCTION get_web_events_by_userid(int) RETURNS TABLE(
id int,
time_stamp timestamp with time zone,
description text,
origin text,
userlogin text,
customer text,
client_ip inet
) AS
$func$
SELECT
auth_web_events.id,
auth_web_events.time_stamp,
auth_web_events.description,
auth_web_events.origin,
auth_user.email AS user,
customers.name AS customer,
auth_web_events.client_ip
FROM
public.auth_web_events,
public.auth_user,
public.customers
WHERE
auth_web_events.user_id_fk = auth_user.id AND
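auth_user.customer_id_fk …
postgresql function sql-execution-plan postgresql-performance
I was always under the assumption that NOT EXISTS was the way to go, rather than using a NOT IN condition. However, in a comparison on a query I have been using, I noticed that the NOT IN condition actually appears to execute faster. Any insight into why this could be the case, or whether I have simply been making a horrible assumption up to now, would be greatly appreciated!
Query 1: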
SELECT DISTINCT
a.SFAccountID, a.SLXID, a.Name FROM [dbo].[Salesforce_Accounts] a WITH(NOLOCK)
JOIN _SLX_AccountChannel b WITH(NOLOCK)
ON a.SLXID = b.ACCOUNTID
JOIN [dbo].[Salesforce_Contacts] c WITH(NOLOCK)
ON a.SFAccountID = c.SFAccountID
WHERE b.STATUS IN ('Active','Customer', 'Current')
AND c.Primary__C = 0
AND NOT EXISTS
(
SELECT 1 FROM [dbo].[Salesforce_Contacts] c2 WITH(NOLOCK)
WHERE a.SFAccountID = c2.SFAccountID
AND c2.Primary__c = 1
);
Query 2:
SELECT
DISTINCT
a.SFAccountID FROM [dbo].[Salesforce_Accounts] a WITH(NOLOCK)
JOIN _SLX_AccountChannel b WITH(NOLOCK)
ON a.SLXID = b.ACCOUNTID
JOIN [dbo].[Salesforce_Contacts] c WITH(NOLOCK)
ON a.SFAccountID = c.SFAccountID …
I have run into something strange that I can't explain.
I am using the following query:
MERGE INTO Main_Table t
USING Stg_Table s
ON(s.site_id = t.site_id)
WHEN MATCHED THEN
UPDATE SET t.arpu_prev_period = s.arpu_prev_period
.... --50 more columns
where t.period_code = 201612
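Stg_Table: index on (Site_Id)
Main_Table:
- index on (Period_code, Site_id)
- partitioned by period_code
- Note: I tried adding an index on Site_Id alone; the execution plan was the same.
I expected an execution plan that uses a single-partition scan, but I got PARTITION LIST ALL.
Here is the execution plan: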
6 | 0 | MERGE STATEMENT | |
7 | 1 | MERGE | Main_Table |
8 | 2 | VIEW | |
9 | 3 | HASH JOIN | |
10 | 4 …
I want to take advantage of the power of index-only scans in Postgres, and I am experimenting with one table:
CREATE TABLE dest.contexts
(
id integer NOT NULL,
phrase_id integer NOT NULL,
lang character varying(5) NOT NULL,
ranking_value double precision,
index_min integer,
index_max integer,
case_sensitive boolean,
is_enabled boolean,
is_to_sync boolean NOT NULL DEFAULT true
);
insert into dest.contexts select * from source.contexts;
alter table dest.contexts
add constraint pk_contexts primary key (id, phrase_id, lang);
CREATE INDEX idx_contexts_
ON dest.contexts
USING btree
(id, is_enabled, lang, phrase_id, ranking_value, index_min, index_max, case_sensitive);
The index covers all the columns I want to use in the next query:
explain analyze
select ranking_value, index_min, index_max, case_sensitive
from dest.contexts
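where …
postgresql vacuum sql-execution-plan postgresql-9.4 autovacuum
Please help; thank you very much.
Q1: Why does the plan start by scanning only the index on a.id, even though the query condition is a.id = b.id? And why is the loop count so large?
Q2: What does the "Nested Loop" node in the EXPLAIN output do?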
happydb=# EXPLAIN (ANALYZE,VERBOSE) SELECT b.name FROM a,b WHERE a.id = b.id AND b.id < 10000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
-------------------
Gather (cost=1000.57..174222.54 rows=18002 width=13) (actual time=5.881..3276.311 rows=19998 loops=1)
Output: b.name
Workers Planned: 5
Workers Launched: 5
-> Nested Loop (cost=0.56..171422.34 rows=3600 width=13) (actual time=3.189..3258.998 rows=3333 loops=6)
Output: b.name
Worker 0: actual time=2.591..3259.895 rows=1850 loops=1
Worker 1: actual time=0.180..3251.631 rows=4081 loops=1
Worker 2: actual time=1.344..3261.433 rows=555 loops=1
Worker 3: actual time=8.603..3262.411 rows=3330 loops=1
Worker 4: actual …