Querying with a large WHERE clause causes an Npgsql timeout exception in EF6

Sas*_*ssa 5 c# entity-framework npgsql entity-framework-6

I have a query that looks like this:

private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
{
    return frames.Aggregate(query, (q, frame) =>
    {
        return q.Where(p => p.Frames.Any(i => i.FrameData.ShaHash == frame.FrameData.ShaHash));
    });
}

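MultiframeModule and Frame have a many-to-many relationship.

With that query I want to find a MultiframeModule that contains all the frames in the frames collection I pass as a parameter. To do that, I compare the ShaHash value of each frame.

If frames contains 2 frames, the generated SQL looks like this: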

SELECT
   "Extent1"."MultiframeModuleId",
   "Extent1"."FrameIncrementPointer",
   "Extent1"."PageNumberVector" 
FROM
   "public"."MultiframeModule" AS "Extent1" 
WHERE
   EXISTS 
   (
      SELECT
         1 AS "C1" 
      FROM
         "public"."Frame" AS "Extent2" 
         INNER JOIN
            "public"."FrameData" AS "Extent3" 
            ON "Extent2"."FrameData_FrameDataId" = "Extent3"."FrameDataId" 
      WHERE
         "Extent1"."MultiframeModuleId" = "Extent2"."MultiframeModule_MultiframeModuleId" 
         AND "Extent3"."ShaHash" = @p__linq__0
   )
   AND EXISTS 
   (
      SELECT
         1 AS "C1" 
      FROM
         "public"."Frame" AS "Extent4" 
         INNER JOIN
            "public"."FrameData" AS "Extent5" 
            ON "Extent4"."FrameData_FrameDataId" = "Extent5"."FrameDataId" 
      WHERE
         "Extent1"."MultiframeModuleId" = "Extent4"."MultiframeModule_MultiframeModuleId" 
         AND "Extent5"."ShaHash" = @p__linq__1
   )
   LIMIT 2

-- p__linq__0: '0' (Type = Int32, IsNullable = false)

-- p__linq__1: '0' (Type = Int32, IsNullable = false)
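
The growth of the SQL can be seen without a database. The following is a minimal sketch, not the original code: it assumes a hypothetical stand-in where each "module" is just a HashSet<int> of frame hashes, builds the same Aggregate chain over an in-memory IQueryable, and counts the Queryable.Where calls in the resulting expression tree. ChainedWhereDemo, WhereCounter and CountChainedFilters are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ChainedWhereDemo
{
    // Walks an expression tree and counts calls to Queryable.Where.
    private sealed class WhereCounter : ExpressionVisitor
    {
        public int Count;

        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            if (node.Method.DeclaringType == typeof(Queryable) && node.Method.Name == "Where")
                Count++;
            return base.VisitMethodCall(node);
        }
    }

    // Assumption: a "module" is modeled as the set of hashes of its frames.
    public static int CountChainedFilters(int hashCount)
    {
        IQueryable<HashSet<int>> query = new List<HashSet<int>>().AsQueryable();
        IEnumerable<int> hashes = Enumerable.Range(0, hashCount);

        // Same shape as the query in the question: one Where per requested hash.
        IQueryable<HashSet<int>> chained = hashes.Aggregate(
            query,
            (q, hash) => q.Where(m => m.Any(h => h == hash)));

        var counter = new WhereCounter();
        counter.Visit(chained.Expression);
        return counter.Count;
    }
}
```

Calling ChainedWhereDemo.CountChainedFilters(200) counts one Where per hash, and a LINQ provider has to translate each of them, which is where the repeated EXISTS subqueries in the SQL above come from.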

However, if I have more frames, for example 200, the call throws an exception:

Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

With this stack trace:

   at Npgsql.ReadBuffer.<Ensure>d__27.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Npgsql.NpgsqlConnector.<DoReadMessage>d__157.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
   at Npgsql.NpgsqlConnector.<ReadMessage>d__156.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
   at Npgsql.NpgsqlConnector.<ReadExpecting>d__163`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
   at Npgsql.NpgsqlDataReader.<NextResult>d__32.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Npgsql.NpgsqlDataReader.NextResult()
   at Npgsql.NpgsqlCommand.<Execute>d__71.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
   at Npgsql.NpgsqlCommand.<ExecuteDbDataReader>d__92.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ValueTaskAwaiter`1.GetResult()
   at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext,TResult](TTarget target, Func`3 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed)
   at System.Data.Entity.Infrastructure.Interception.DbCommandDispatcher.Reader(DbCommand command, DbCommandInterceptionContext interceptionContext)
   at System.Data.Entity.Core.EntityClient.Internal.EntityCommandDefinition.ExecuteStoreCommands(EntityCommand entityCommand, CommandBehavior behavior)

So, is there an obvious reason why my query fails? And how can I improve it so that the query succeeds?

Iva*_*oev 3

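As far as I can tell, the problem is caused by too many subqueries in the generated SQL query.

In my test environment, SqlServer (LocalDB) simply refused to execute the generated query, reporting that it was too complex. PostgreSQL was able to execute it in about 4 minutes (after setting CommandTimeout to 0).

The solution is to find an equivalent construct that does not generate many subqueries. What I usually do in such cases is count the distinct matches and compare that number to the number of criteria.

It can be implemented in two ways.

(1) This works only for criteria of the form property == valueN. In that case the distinct matches can be counted like this (in pseudocode):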

obj.Collection
   .Select(elem => elem.Property)
   .Distinct()
   .Count(value => values.Contains(value))

Applying it to your sample:

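private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
{
    // Materialize the distinct hashes once; duplicates in frames would otherwise inflate the expected count.
    var values = frames.Select(e => e.FrameData.ShaHash).Distinct().ToList();
    var count = values.Count;
    // A module matches when the distinct hashes of its frames cover every requested hash.
    return query.Where(p => p.Frames.Select(e => e.FrameData.ShaHash)
        .Distinct().Count(v => values.Contains(v)) == count);
}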
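
The counting idea can be checked with LINQ to Objects before it is run through EF. This is a minimal in-memory sketch under assumptions: hashes are plain int values, and DistinctMatchDemo and ContainsAll are hypothetical names, not part of the model in the question. It also deduplicates the requested hashes, which is a choice of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctMatchDemo
{
    // Returns true when moduleHashes contains every hash listed in wanted.
    public static bool ContainsAll(IEnumerable<int> moduleHashes, IEnumerable<int> wanted)
    {
        // Deduplicate the requested hashes so the expected count is well defined.
        List<int> values = wanted.Distinct().ToList();

        // Count how many distinct hashes of the module are among the requested ones.
        int matched = moduleHashes.Distinct().Count(v => values.Contains(v));

        return matched == values.Count;
    }
}
```

For example, ContainsAll(new[] { 1, 2, 3, 3 }, new[] { 1, 3 }) is true, while ContainsAll(new[] { 1, 2 }, new[] { 1, 3 }) is false because hash 3 is never matched.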

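(2) This works for any type of criteria. In this case the matches are identified by their index, which requires dynamically building a selector expression like this: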

Condition0 ? 0 : Condition1 ? 1 : ... ConditionN-1 ? N - 1 : -1

and the distinct match count is

obj.Collection
   .Select(selector)
   .Distinct()
   .Count(i => i >= 0)
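
The index-based counting above can be tried in memory with ordinary delegates. This is only a sketch of the idea, assuming conditions are given as a list of Func<T, bool>. IndexedMatchDemo and CountSatisfied are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedMatchDemo
{
    // Maps each item to the index of the first condition it satisfies (or -1),
    // then counts how many distinct non-negative indexes were produced.
    public static int CountSatisfied<T>(IEnumerable<T> items, IReadOnlyList<Func<T, bool>> conditions)
    {
        return items
            .Select(item =>
            {
                for (int i = 0; i < conditions.Count; i++)
                {
                    if (conditions[i](item)) return i;
                }
                return -1; // the item matches no condition
            })
            .Distinct()
            .Count(i => i >= 0);
    }
}
```

One thing this sketch makes visible: an item is credited only to the first condition it satisfies, so the count is reliable when the conditions are mutually exclusive, as they are for equality checks against different hash values. With overlapping conditions the count may come out lower than the number of conditions that are actually satisfied.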

Applying it to your sample:

private static IQueryable<MultiframeModule> WhereAllFramesProperties(this IQueryable<MultiframeModule> query, ICollection<Frame> frames)
{
    // The parameter type must match the element type of the Func<Frame, int> selector built below.
    var parameter = Expression.Parameter(typeof(Frame), "e");
    var body = frames.Select((frame, index) =>
    {
        Expression<Func<Frame, bool>> predicate = e => e.FrameData.ShaHash == frame.FrameData.ShaHash;
        return new
        {
            Condition = predicate.Body.ReplaceParameter(predicate.Parameters[0], parameter),
            Value = Expression.Constant(index)
        };
    })
    .Reverse()
    .Aggregate((Expression)Expression.Constant(-1), (next, item) =>
        Expression.Condition(item.Condition, item.Value, next));
    var selector = Expression.Lambda<Func<Frame, int>>(body, parameter);
    var count = frames.Count();
    return query.Where(p => p.Frames.AsQueryable().Select(selector)
        .Distinct().Count(i => i >= 0) == count);
}

where ReplaceParameter is the following custom extension method:

public static partial class ExpressionUtils
{
    public static Expression ReplaceParameter(this Expression expression, ParameterExpression source, Expression target)
    {
        return new ParameterReplacer { Source = source, Target = target }.Visit(expression);
    }

    class ParameterReplacer : ExpressionVisitor
    {
        public ParameterExpression Source;
        public Expression Target;
        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == Source ? Target : base.VisitParameter(node);
        }
    }
}
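
To see what the dynamically built selector evaluates to, the same construction can be compiled and run in memory. The sketch below makes simplifying assumptions: it works on plain int hashes instead of Frame entities and carries its own small parameter-swapping visitor so that it is self-contained. SelectorBuilderDemo, ParameterSwap and BuildSelector are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class SelectorBuilderDemo
{
    // Replaces one parameter with another expression inside a lambda body.
    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _source;
        private readonly Expression _target;

        public ParameterSwap(ParameterExpression source, Expression target)
        {
            _source = source;
            _target = target;
        }

        protected override Expression VisitParameter(ParameterExpression node)
            => node == _source ? _target : base.VisitParameter(node);
    }

    // Builds: h => h == hashes[0] ? 0 : h == hashes[1] ? 1 : ... : -1
    public static Expression<Func<int, int>> BuildSelector(IEnumerable<int> hashes)
    {
        ParameterExpression parameter = Expression.Parameter(typeof(int), "h");

        Expression body = hashes
            .Select((hash, index) =>
            {
                Expression<Func<int, bool>> predicate = h => h == hash;
                return new
                {
                    // Rebind the predicate body to the shared parameter.
                    Condition = new ParameterSwap(predicate.Parameters[0], parameter).Visit(predicate.Body),
                    Value = (Expression)Expression.Constant(index)
                };
            })
            .Reverse() // fold from the last condition so the first one ends up outermost
            .Aggregate(
                (Expression)Expression.Constant(-1),
                (next, item) => Expression.Condition(item.Condition, item.Value, next));

        return Expression.Lambda<Func<int, int>>(body, parameter);
    }
}
```

Compiling the result with BuildSelector(new[] { 10, 20, 30 }).Compile() gives a delegate that maps 10 to 0, 30 to 2 and any other value to -1, so applying Select, Distinct and Count(i => i >= 0) over a module's hashes yields the number of requested hashes that were found.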

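The generated SQL contains a huge CASE WHEN expression (unfortunately duplicated in the WHERE clause), but only a single subquery, and it is accepted and successfully executed by both SqlServer and PostgreSQL. On PostgreSQL it takes less than 2 seconds under the same conditions as the original test: 1K records in both tables, 1M links, 200 criteria.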