Optimizing a SQL query that uses a cursor

Tac*_*han 1 sql t-sql sql-server

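I have a query in which I iterate over a table, and for each entry I iterate over another table and then compute some results. I use a cursor to iterate over the table. The query takes a very long time to finish, always more than 3 minutes. If I do something similar in C#, where the tables are arrays or dictionaries, it does not even take a second. What am I doing wrong, and how can I make it more efficient?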

DELETE FROM [QueryScores]
GO

INSERT INTO [QueryScores] (Id)
SELECT Id FROM [Documents]

DECLARE @Id NVARCHAR(50)

DECLARE myCursor CURSOR LOCAL FAST_FORWARD FOR
SELECT [Id] FROM [QueryScores]

OPEN myCursor

FETCH NEXT FROM myCursor INTO @Id

WHILE @@FETCH_STATUS = 0
    BEGIN
        DECLARE @Score FLOAT = 0.0

        DECLARE @CounterMax INT = (SELECT COUNT(*) FROM [Query])
        DECLARE @Counter INT = 0

        PRINT 'Document: ' + CAST(@Id AS VARCHAR)
        PRINT 'Score: ' + CAST(@Score AS VARCHAR)

        WHILE @Counter < @CounterMax
            BEGIN

            DECLARE @StemId INT = (SELECT [Query].[StemId] FROM [Query] WHERE [Query].[Id] = @Counter)

            DECLARE @Weight FLOAT = (SELECT [tfidf].[Weight] FROM [TfidfWeights] AS [tfidf] WHERE [tfidf].[StemId] = @StemId AND [tfidf].[DocumentId] = @Id)

            PRINT 'WEIGHT: ' + CAST(@Weight AS VARCHAR)

            IF(@Weight > 0.0)
                BEGIN
                DECLARE @QWeight FLOAT = (SELECT [Query].[Weight] FROM [Query] WHERE [Query].[StemId] = @StemId)
                SET @Score = @Score + (@QWeight * @Weight)
                PRINT 'Score: ' + CAST(@Score AS VARCHAR)
                END

            SET @Counter = @Counter + 1
            END 

        UPDATE [QueryScores] SET Score = @Score WHERE Id = @Id 

        FETCH NEXT FROM myCursor INTO @Id
    END

CLOSE myCursor
DEALLOCATE myCursor 

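The logic is that I have a list of documents and I have a question/query. I iterate over each document and then, in a nested iteration, over the query terms/words to check whether the document contains those terms. If it does, I multiply the precomputed weights and add the product to the document's score.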

Tom*_*m H 7

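The problem is that you are trying to use a set-based language to iterate over things the way a procedural language would. SQL requires a different mindset. You should almost never be thinking in terms of loops in SQL.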

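From what I can gather from your code, this should do what you are trying to do in all of those loops, but it does it in a single statement, in a set-based manner, which is what SQL is good at.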

INSERT INTO QueryScores (id, score)
SELECT
    D.id,
    SUM(CASE WHEN W.[Weight] > 0 THEN W.[Weight] * Q.[Weight] ELSE NULL END)
FROM
    Documents D
CROSS JOIN Query Q
LEFT OUTER JOIN TfidfWeights W ON W.StemId = Q.StemId AND W.DocumentId = D.id
GROUP BY
    D.id
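
One difference from the cursor version is worth noting: the loop starts each score at 0.0, whereas SUM over no matching rows yields NULL. The statement also assumes QueryScores has been emptied first, as the original script does with its DELETE. A possible variant is sketched below. It keeps the table and column names from the question; the COALESCE default and the reordered joins (starting from TfidfWeights rather than a CROSS JOIN) are my own assumptions, not something the question confirms.

```sql
-- Sketch only: table and column names are taken from the question.
-- Assumption: a document with no matching query terms should score 0.0,
-- as it does in the cursor version.
DELETE FROM QueryScores;

INSERT INTO QueryScores (Id, Score)
SELECT
    D.Id,
    -- Weights whose stem is not in Query give Q.[Weight] = NULL, so the
    -- product is NULL and SUM ignores it; COALESCE maps "no matches" to 0.0.
    COALESCE(SUM(CASE WHEN W.[Weight] > 0 THEN W.[Weight] * Q.[Weight] END), 0.0)
FROM
    Documents AS D
LEFT OUTER JOIN TfidfWeights AS W ON W.DocumentId = D.Id
LEFT OUTER JOIN Query AS Q ON Q.StemId = W.StemId
GROUP BY
    D.Id;
```

Joining from the weights to the query terms avoids building every document/term pair before filtering, which may matter if Documents and Query are both large. Whether it is actually faster depends on the indexes and data, so it is worth comparing the execution plans.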

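Of course, without a description of your requirements or sample data with expected output, I don't know whether this is actually what you are trying to get, but it's my best guess given your code.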
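
To check that guess, a tiny hand-computable data set can help. The script below is only an illustration: the temp tables, column types, and sample rows are all hypothetical, loosely modeled on the names in the question.

```sql
-- Hypothetical sample data; column types are guessed from the question.
CREATE TABLE #Documents (Id NVARCHAR(50) PRIMARY KEY);
CREATE TABLE #Query (Id INT PRIMARY KEY, StemId INT, [Weight] FLOAT);
CREATE TABLE #TfidfWeights (StemId INT, DocumentId NVARCHAR(50), [Weight] FLOAT);

INSERT INTO #Documents (Id) VALUES (N'doc1'), (N'doc2');
INSERT INTO #Query (Id, StemId, [Weight]) VALUES (0, 10, 0.5), (1, 20, 2.0);
-- doc1 contains stems 10 and 20 (both in the query) and stem 30 (not in it);
-- doc2 contains none of the query stems.
INSERT INTO #TfidfWeights (StemId, DocumentId, [Weight])
VALUES (10, N'doc1', 0.4), (20, N'doc1', 0.1), (30, N'doc1', 0.9);

-- Same shape as the set-based statement, as a plain SELECT for inspection.
SELECT
    D.Id,
    SUM(CASE WHEN W.[Weight] > 0 THEN W.[Weight] * Q.[Weight] ELSE NULL END) AS Score
FROM
    #Documents D
CROSS JOIN #Query Q
LEFT OUTER JOIN #TfidfWeights W ON W.StemId = Q.StemId AND W.DocumentId = D.Id
GROUP BY
    D.Id;
-- Worked by hand: doc1 = 0.4 * 0.5 + 0.1 * 2.0 = 0.4; doc2 = NULL (no matches).

DROP TABLE #Documents, #Query, #TfidfWeights;
```

If the cursor version run against the same rows gives different numbers, that difference points at a requirement the code alone does not reveal.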

You should read: https://stackoverflow.com/help/how-to-ask