rae*_*dor 3 performance sql-server execution-plan string-splitting query-performance
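I seem to be seeing a huge performance gap between a SELECT that uses IN with hard-coded values and one that uses STRING_SPLIT. The query plans are identical except for the last stage, where the STRING_SPLIT version executes the index seek many times. The result is roughly 90000 versus roughly 15000 of CPU time (according to dm_exec_query_stats), so the difference is large. I have posted both plans here...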
Interestingly, the query plans show almost the same cost, but the cost reported by dm_exec_query_stats (last_worker_time) is very different.
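For readers who want to compare the same figures, the following is a minimal sketch of how last_worker_time can be read from sys.dm_exec_query_stats. The LIKE filter on the function name is only an illustrative way to find the two statements, and the assumption that the hexadecimal values shown in this question are query_hash values is mine, not something the post states.

```sql
-- Sketch: list cached statements that reference fn_get_samples together
-- with their most recent CPU time. last_worker_time is reported in microseconds.
SELECT
    qs.query_hash,          -- assumed to correspond to the hex values quoted in the question
    qs.last_worker_time,    -- CPU time of the most recent execution
    qs.execution_count,
    st.text AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'%fn_get_samples%'   -- illustrative filter only
ORDER BY qs.last_worker_time DESC;
```

Running this requires VIEW SERVER STATE permission (or the equivalent on newer versions), and the rows only exist while the plans remain in the cache.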
Here are the two outputs for the queries...
0x79DEAD79D1F149CD 16199
select *
from fn_get_samples(1) s
where s.sample_id in
(2495,2496,2497,2498,2499,2500,2501,2502,2503,2504)
0x4A073840486B252C 86689
select *
from fn_get_samples(1) s
where s.sample_id in
(select value as id
from
STRING_SPLIT('2495,2496,2497,2498,2499,2500,2501,2502,2503,2504',','))
The function code is...
CREATE FUNCTION [dbo].[fn_get_samples]
(
@user_id int
)
RETURNS TABLE
AS
RETURN (
-- get samples
select s.sample_id,language_id,native_language_id,s.source_sentence,s.markup_sentence,s.latin_sentence,
s.translation_source_sentence,s.translation_markup_sentence,s.translation_latin_sentence,
isnull(sample_vkl.knowledge_level_id,1) as vocab_knowledge_level_id,
isnull(sample_gkl.grammar_knowledge_level_id,0) as grammar_knowledge_level_id,
s.polite_level_id,
case when isnull(tr1.leitner_deck_index,0)=0 then 0 else cast((tr1.leitner_deck_index-1) as float)/cast((max_leitner_deck_index-1) as float) end as progress_percentage,
case when isnull(tr2.leitner_deck_index,0)=0 then 0 else cast((tr2.leitner_deck_index-1) as float)/cast((max_leitner_deck_index-1) as float) end as listening_progress_percentage,
case when f.object_id is null then 0 else 1 end as is_favorite,
case when st.object_id is null then 0 else 1 end as is_studied,
s.has_error,
s.is_deleted,
f.create_datetime as favorite_datetime,
st.create_datetime as studied_datetime,
s.create_user_id,
s.create_datetime,
isnull(s.modify_user_id,s.create_user_id) as modify_user_id,
isnull(s.modify_datetime,s.create_datetime) as modify_datetime,
s.display_order
from samples s
left outer join sample_knowledge_level_votes klv on klv.sample_id=s.sample_id and klv.user_id=@user_id
left outer join favorites f on f.user_id=@user_id and f.object_type_id=(select object_type_id from object_types ot where ot.object_type_name='Pattern Sample') and f.object_id=s.sample_id
left outer join studied st on st.user_id=@user_id and st.object_type_id=(select object_type_id from object_types ot where ot.object_type_name='Pattern Sample') and st.object_id=s.sample_id
left outer join leitner_tracking tr1 on tr1.user_id=@user_id and tr1.object_type_id=(select object_type_id from object_types ot where ot.object_type_name='Pattern Sample') and tr1.object_id=s.sample_id and tr1.skill_type_id=(select skill_type_id from skill_types where skill_type_name=N'Guess Pronunciation from Meaning')
left outer join leitner_tracking tr2 on tr2.user_id=@user_id and tr2.object_type_id=(select object_type_id from object_types ot where ot.object_type_name='Pattern Sample') and tr2.object_id=s.sample_id and tr2.skill_type_id=(select skill_type_id from skill_types where skill_type_name=N'Guess Meaning from Pronunciation')
cross join (select max(leitner_deck_index) as max_leitner_deck_index from leitner_decks) dm
left outer join vw_sample_user_grammar_kl sample_gkl on sample_gkl.user_id=@user_id and sample_gkl.sample_id=s.sample_id
left outer join vw_sample_avg_kl sample_vkl on sample_vkl.sample_id=s.sample_id
where is_deleted=0
)
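It seems to be related to the join to vw_sample_avg_kl. If I comment out that join and the computed column vocab_knowledge_level_id, the two query times become very similar. As soon as I add them back, the times diverge again. Here is the code for that view...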
CREATE VIEW [dbo].[vw_sample_avg_kl]
AS
select sample_id,knowledge_level_id from (
select sample_id,knowledge_level_id,count(*) as frequency,RANK() over (partition by sample_id order by count(*) desc,knowledge_level_id) as myrank
from sample_knowledge_level_votes
group by sample_id,knowledge_level_id
) sample_kl_ranking
where myrank=1
The id field is INT. This is how my two queries look when shown in dm_exec_query_stats...
0x4A073840486B252C 41096 select * from fn_get_samples(@user_id) s where s.sample_id in (select * from STRING_SPLIT(@sample_id_list,','))
0x79DEAD79D1F149CD 7849 select * from fn_get_samples(1) s where s.sample_id in (2495,2496,2497,2498,2499,2500,2501,2502,2503,2504)
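(This was a slightly different data set from the example above, so the timings differ a little, but you can still see the large performance gap.)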
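The biggest differences between a hard-coded IN list and values produced by STRING_SPLIT are:

A literal IN list is automatically sorted and has its duplicates removed at compile time. This gives the optimizer accurate information about the number and distribution of the values.

The value column returned by the STRING_SPLIT function is varchar or nvarchar, depending on the input argument, and it has the same length as the input string. Literal values in an IN list are cast to a compatible type at compile time where necessary.

The number of rows returned by the STRING_SPLIT function is a guess, and the distribution of the values is unknown.

In short, using a literal IN list gives the optimizer better information.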
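The points above suggest one possible workaround, sketched here and not taken from the original answer: materialize the split values into a typed, de-duplicated temporary table so the optimizer sees int values and a real row count. The table name #ids and the variable declaration are hypothetical, and whether this closes the gap for this particular function has not been measured.

```sql
-- Assumption: the list arrives as a comma-separated varchar parameter.
DECLARE @sample_id_list varchar(8000) =
    '2495,2496,2497,2498,2499,2500,2501,2502,2503,2504';

-- Hypothetical temp table: int column, unique and ordered via the primary key.
CREATE TABLE #ids (id int NOT NULL PRIMARY KEY);

-- Cast the varchar value column to int and remove duplicates up front,
-- which is roughly what a literal IN list gets for free at compile time.
INSERT INTO #ids (id)
SELECT DISTINCT CAST(ss.value AS int)
FROM STRING_SPLIT(@sample_id_list, ',') AS ss;

-- The main query now filters on typed values with a known row count.
SELECT s.*
FROM dbo.fn_get_samples(1) AS s
WHERE s.sample_id IN (SELECT i.id FROM #ids AS i);

DROP TABLE #ids;
```

The CAST fails if the list contains a non-numeric token, so TRY_CAST with a NOT NULL filter may be preferable for untrusted input. STRING_SPLIT itself requires database compatibility level 130 or higher.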