CodeFirst loading 1 parent linked to 25 000 children is slow

Cur*_*ire 13 c# sql-server orm entity-framework ef-code-first

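I have searched a lot about my performance problem and tried all kinds of different things, but I just can't seem to get it to work fast enough. Here is my problem in its simplest form:

I'm using Entity Framework 5 and I want to be able to lazy load the child instances of a parent when the user selects that parent, so that I don't have to pull the entire database. However, I have been having performance problems when lazy loading the children. I think the problem is the wiring up of the navigation properties between the parent and the children. I also think it must be something I did wrong, because I believe this is a simple case.

So I put together a program that tests a single lazy load to isolate the problem.

Here is the test:

I created a POCO Parent class and a POCO Child class. A Parent has n children and a Child has 1 parent. There is only 1 parent in the SQL Server database, and that single parent has 25 000 children. I tried different methods to load this data. Whenever I load the children and the parent in the same DbContext, it takes a really long time. But if I load them in different DbContexts, it loads really fast. However, I want these instances to be in the same DbContext.

Here is my test setup and everything you need to replicate it:

POCOs: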

public class Parent
{
    public int ParentId { get; set; }

    public string Name { get; set; }

    public virtual List<Child> Childs { get; set; }
}

public class Child
{
    public int ChildId { get; set; }

    public int ParentId { get; set; }

    public string Name { get; set; }

    public virtual Parent Parent { get; set; }
}

DbContext:

public class Entities : DbContext
{
    public DbSet<Parent> Parents { get; set; }

    public DbSet<Child> Childs { get; set; }
}

TSQL script to create the database and the data:

USE [master]
GO

IF EXISTS(SELECT name FROM sys.databases
    WHERE name = 'PerformanceParentChild')
BEGIN
    -- Disconnect other sessions, then drop the previous test database
    ALTER DATABASE [PerformanceParentChild] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE [PerformanceParentChild]
END
GO

CREATE DATABASE [PerformanceParentChild]
GO
USE [PerformanceParentChild]
GO
BEGIN TRAN T1;
SET NOCOUNT ON

CREATE TABLE [dbo].[Parents]
(
    [ParentId] [int] CONSTRAINT PK_Parents PRIMARY KEY,
    [Name] [nvarchar](200) NULL
)
GO

CREATE TABLE [dbo].[Children]
(
    [ChildId] [int] CONSTRAINT PK_Children PRIMARY KEY,
    [ParentId] [int] NOT NULL,
    [Name] [nvarchar](200) NULL
)
GO

INSERT INTO Parents (ParentId, Name)
VALUES (1, 'Parent')

DECLARE @nbChildren int;
DECLARE @childId int;

SET @nbChildren = 25000;
SET @childId = 0;

WHILE @childId < @nbChildren
BEGIN
   SET @childId = @childId + 1;
   INSERT INTO [dbo].[Children] (ChildId, ParentId, Name)
   VALUES (@childId, 1, 'Child #' + convert(nvarchar(5), @childId))
END

CREATE NONCLUSTERED INDEX [IX_ParentId] ON [dbo].[Children] 
(
    [ParentId] ASC
)
GO

ALTER TABLE [dbo].[Children] ADD CONSTRAINT [FK_Children.Parents_ParentId] FOREIGN KEY([ParentId])
REFERENCES [dbo].[Parents] ([ParentId])
GO

COMMIT TRAN T1;

App.config containing the connection string:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <add
      name="Entities"
      providerName="System.Data.SqlClient"
      connectionString="Server=localhost;Database=PerformanceParentChild;Trusted_Connection=true;"/>
  </connectionStrings>
</configuration>

The test console class:

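using System;
using System.Collections.Generic;
using System.Data.Entity; // DbContext, DbSet and the lambda overload of Include
using System.Linq;

class Program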
{
    static void Main(string[] args)
    {
        List<Parent> parents;
        List<Child> children;

        Entities entities;
        DateTime before;
        TimeSpan childrenLoadElapsed;
        TimeSpan parentLoadElapsed;

        using (entities = new Entities())
        {
            before = DateTime.Now;
            parents = entities.Parents.ToList();
            parentLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load only the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds");
        }

        using (entities = new Entities())
        {
            before = DateTime.Now;
            children = entities.Childs.ToList();
            childrenLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load only the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds");
        }

        using (entities = new Entities())
        {
            before = DateTime.Now;
            parents = entities.Parents.ToList();
            parentLoadElapsed = DateTime.Now - before;

            before = DateTime.Now;
            children = entities.Childs.ToList();
            childrenLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds" +
                                               ", then load the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds");
        }

        using (entities = new Entities())
        {
            before = DateTime.Now;
            children = entities.Childs.ToList();
            childrenLoadElapsed = DateTime.Now - before;

            before = DateTime.Now;
            parents = entities.Parents.ToList();
            parentLoadElapsed = DateTime.Now - before;


            System.Diagnostics.Debug.WriteLine("Load the children from DbSet:" + childrenLoadElapsed.TotalSeconds + " seconds" +
                                               ", then load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds");
        }

        using (entities = new Entities())
        {
            before = DateTime.Now;
            parents = entities.Parents.ToList();
            parentLoadElapsed = DateTime.Now - before;

            before = DateTime.Now;
            children = parents[0].Childs;
            childrenLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load the parent from DbSet:" + parentLoadElapsed.TotalSeconds + " seconds" +
                                               ", then load the children from Parent's lazy loaded navigation property:" + childrenLoadElapsed.TotalSeconds + " seconds");
        }

        using (entities = new Entities())
        {
            before = DateTime.Now;
            parents = entities.Parents.Include(p => p.Childs).ToList();
            parentLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load the parent from DbSet and children from include:" + parentLoadElapsed.TotalSeconds + " seconds");

        }

        using (entities = new Entities())
        {
            entities.Configuration.ProxyCreationEnabled = false;
            entities.Configuration.AutoDetectChangesEnabled = false;
            entities.Configuration.LazyLoadingEnabled = false;
            entities.Configuration.ValidateOnSaveEnabled = false;

            before = DateTime.Now;
            parents = entities.Parents.Include(p => p.Childs).ToList();
            parentLoadElapsed = DateTime.Now - before;
            System.Diagnostics.Debug.WriteLine("Load the parent from DbSet and children from include:" + parentLoadElapsed.TotalSeconds + " seconds with everything turned off");

        }

    }
}

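Here are the results from these tests:

Load only the parent from DbSet:0,972 seconds

Load only the children from DbSet:0,714 seconds

Load the parent from DbSet:0,001 seconds, then load the children from DbSet:8,6026 seconds

Load the children from DbSet:0,6864 seconds, then load the parent from DbSet:7,5816159 seconds

Load the parent from DbSet:0 seconds, then load the children from Parent's lazy loaded navigation property:8,5644549 seconds

Load the parent from DbSet and children from include:8,6428788 seconds

Load the parent from DbSet and children from include:9,1416586 seconds with everything turned off

Analysis

Whenever the parent and the children are in the same DbContext, it takes a long time (about 9 seconds) to wire everything up. I even tried turning off everything from proxy creation to lazy loading, but to no avail. Can someone please help me?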

Sla*_*uma 5

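This is not an answer, as I don't have a solution to improve the performance, but the comment section doesn't have enough space for the following. I just want to contribute a few additional tests and observations.

First, I could reproduce your measured times almost exactly for all seven tests. I used EF 4.1 for my tests.

Some interesting things to note:

  • From (the fast) test 2 I would conclude that object materialization (converting the rows and columns returned from the database server into objects) isn't slow.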

  • This is also confirmed by loading the entities in test 3 without change tracking:

    parents = entities.Parents.AsNoTracking().ToList();
    // ...
    children = entities.Childs.AsNoTracking().ToList();
    

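    This code runs fast, although 25001 objects have to be materialized here as well (but no relationships between the navigation properties are established!).

  • Also from (the fast) test 2 I would conclude that creating entity snapshots for change tracking isn't slow.

  • In tests 3 and 4 the relationships between the parent and the 25000 children get fixed up when the entities are loaded from the database, i.e. EF adds all the Child entities to the parent's Childs collection and sets the Parent of each child to the loaded parent. Apparently this step is slow, as you already guessed:

    "I think the problem is the wiring up of the navigation properties between the parent and the children."

    The collection side of the relationship in particular seems to be the problem: if you comment out the Childs navigation property in the Parent class (the relationship is then still a one-to-many relationship), tests 3 and 4 are fast, although EF still sets the Parent property of all 25000 Child entities.

    I don't know why filling the navigation collection during relationship fixup is so slow. If you simulate it manually in a naive way, like so...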

    entities.Configuration.ProxyCreationEnabled = false;
    
    children = entities.Childs.AsNoTracking().ToList();
    parents = entities.Parents.AsNoTracking().ToList();
    
    parents[0].Childs = new List<Child>();
    foreach (var c in children)
    {
        if (c.ParentId == parents[0].ParentId)
        {
            c.Parent = parents[0];
            parents[0].Childs.Add(c);
        }
    }
    

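    ...it is fast. Obviously the internal relationship fixup doesn't work in this simple way. Maybe it has to check whether the collection already contains the child being added: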

    foreach (var c in children)
    {
        if (c.ParentId == parents[0].ParentId)
        {
            c.Parent = parents[0];
            if (!parents[0].Childs.Contains(c))
                parents[0].Childs.Add(c);
        }
    }
    

    This is significantly slower (around 4 seconds).
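
For comparison, here is a minimal sketch of the same manual simulation where the duplicate check goes through a HashSet instead of List.Contains, so each membership test no longer scans the whole collection. The Parent and Child classes are trimmed copies of the POCOs from the question, and ManualFixup.Wire is a hypothetical helper written only for this illustration. It is not how EF performs fixup internally, and it is not a setting EF exposes.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class Parent
{
    public int ParentId { get; set; }
    public List<Child> Childs { get; set; }
}

public class Child
{
    public int ChildId { get; set; }
    public int ParentId { get; set; }
    public Parent Parent { get; set; }
}

public static class ManualFixup
{
    // Hypothetical helper (not EF code): wires the children to the parent and
    // uses a HashSet for the "already in the collection?" check.
    public static void Wire(Parent parent, IEnumerable<Child> children)
    {
        parent.Childs = new List<Child>();
        // Child does not override Equals/GetHashCode, so this set compares by reference.
        var seen = new HashSet<Child>();
        foreach (var c in children)
        {
            if (c.ParentId != parent.ParentId)
                continue;
            c.Parent = parent;
            if (seen.Add(c)) // Add returns false when the child was already seen
                parent.Childs.Add(c);
        }
    }

    public static void Main()
    {
        var parent = new Parent { ParentId = 1 };
        var children = new List<Child>();
        for (int i = 1; i <= 25000; i++)
            children.Add(new Child { ChildId = i, ParentId = 1 });

        var sw = Stopwatch.StartNew();
        Wire(parent, children);
        sw.Stop();
        Console.WriteLine(parent.Childs.Count + " children wired in " + sw.ElapsedMilliseconds + " ms");
    }
}
```

The set costs extra memory for 25000 references, but it keeps the total work roughly linear in the number of children, whereas the Contains version above rescans the growing list for every child.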

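Anyway, relationship fixup seems to be the performance bottleneck. I don't know how to improve it if you need change tracking and correct relationships between your attached entities.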


Lad*_*nka 5

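I answered a similar question previously. My previous answer contains the theory behind this issue, but with your detailed question I can point directly to where the problem is. First, let's run one of the problematic cases with a performance profiler. This is the result from DotTrace when using tracing mode: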

[image: DotTrace tracing-mode result]

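Fixing up relations runs in a loop. This means that for 25,000 records you have 25,000 iterations, but each of these iterations internally calls CheckIfNavigationPropertyContainsEntity on EntityCollection: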

internal override bool CheckIfNavigationPropertyContainsEntity(IEntityWrapper wrapper)
{
    if (base.TargetAccessor.HasProperty)
    {
        object navigationPropertyValue = base.WrappedOwner.GetNavigationPropertyValue(this);
        if (navigationPropertyValue != null)
        {
            if (!(navigationPropertyValue is IEnumerable))
            {
                throw new EntityException(Strings.ObjectStateEntry_UnableToEnumerateCollection(base.TargetAccessor.PropertyName, base.WrappedOwner.Entity.GetType().FullName));
            }
            foreach (object obj3 in navigationPropertyValue as IEnumerable)
            {
                if (object.Equals(obj3, wrapper.Entity))
                {
                    return true;
                }
            }
        }
    }
    return false;
}

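The number of iterations of the inner loop grows as items are added to the navigation property. The math is in my previous answer: it is an arithmetic series where the total number of inner-loop iterations is 1/2 * (n^2 - n), i.e. n^2 complexity. In your case the inner loop inside the outer loop results in 312,487,500 iterations, which the performance trace also shows.

I created a work item on EF CodePlex for this issue.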
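
Spelled out for n = 25,000, and assuming that the fixup of the k-th child scans the k - 1 children already in the collection without finding a match, the series works out as follows:

```latex
\sum_{k=1}^{n} (k-1) \;=\; \frac{n(n-1)}{2} \;=\; \frac{1}{2}\left(n^{2}-n\right),
\qquad
\left.\frac{n(n-1)}{2}\right|_{n=25000} \;=\; \frac{25000 \cdot 24999}{2} \;=\; 312{,}487{,}500 .
```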