Generating a nested array with OPENJSON is very slow

Question

I have just started using OPENJSON with SQL Server 2016 SP1.

I have this statement:

select c.Serial as Parent,
    (Select co.Serial, agc.Position
      from AggregationChildren agc, Aggregation ag, Code co
      where agc.AggregationId = a.AggregationId 
      and co.CodeId = agc.AggregationChildrenId for json path) as children
    from Aggregation a, Code c
    where c.CodeId = a.AggregationId for json path

It is meant to generate JSON in which each parent carries a nested array of its children, each child with a Serial and a Position.

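But it is very, very slow.

My problem is the Children array, because I don't know how else to build it.

Is there a way to do this faster?

These are the tables: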

CREATE TABLE [dbo].[Code] (
    [CodeId]            INT            IDENTITY (1, 1) NOT NULL,
    [Serial]            NVARCHAR (20)  NOT NULL,
    [ ... ],
    CONSTRAINT [PK_CODE] PRIMARY KEY CLUSTERED ([CodeId] ASC),
    [ ... ]
)

CREATE TABLE [dbo].[Aggregation] (
    [AggregationId] INT           NOT NULL,
    [ ... ], 
    CONSTRAINT [PK_AGGREGATIONS] PRIMARY KEY CLUSTERED ([AggregationId] ASC),
    CONSTRAINT [FK_Aggregation_Code]
           FOREIGN KEY ([AggregationId])
            REFERENCES [dbo].[Code] ([CodeId])
)

CREATE TABLE [dbo].[AggregationChildren] (
    [AggregationChildrenId] INT NOT NULL,
    [AggregationId]         INT NOT NULL,
    [Position]              INT NOT NULL,
    CONSTRAINT [PK_AGGREGATION_CHILDS] PRIMARY KEY CLUSTERED ([AggregationChildrenId] ASC),
    CONSTRAINT [FK_AggregationChildren_Code]
           FOREIGN KEY ([AggregationChildrenId])
            REFERENCES [dbo].[Code] ([CodeId]),
    CONSTRAINT [FK_AggregationChildren_Aggregation]
           FOREIGN KEY ([AggregationId])
            REFERENCES [dbo].[Aggregation] ([AggregationId]) ON DELETE CASCADE
)

The Serial column is an nvarchar(20) because the values can be any combination of letters and digits, even though my example only shows numbers.

Answer 1

Han*_*non 6

I had a hard time parsing your query, but I believe this returns the same results and is much faster:

SELECT Parent = c.Serial
    , Children = (
        SELECT c.Serial 
            , cac.Position
        FROM dbo.Code cc
            INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
        WHERE cac.AggregationId = a.AggregationId
        FOR JSON PATH 
    )
FROM dbo.Code c
    INNER JOIN dbo.Aggregation a ON c.CodeId = a.AggregationId
FOR JSON PATH;

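The plan for the query above looks like this:

[Figure: execution plan for the query above]

The plan for your query looks like this:

[Figure: execution plan for the query in the question]

We can make the first variant even faster by adding the following index: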

CREATE NONCLUSTERED INDEX IX_AggregationChildren_IX0
ON dbo.AggregationChildren (AggregationId)
INCLUDE (AggregationChildrenId,Position);

Obviously, though, you need to evaluate this against your own workload.
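One way to run that evaluation, given only as a sketch: turn on SET STATISTICS IO and SET STATISTICS TIME, run a lookup by AggregationId before and after creating the index, and compare the logical reads and elapsed time reported in the messages output. The literal value 1 below is an arbitrary placeholder of mine, not something taken from the question.

```sql
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The access pattern the new index targets: all children of one parent.
-- AggregationId = 1 is an arbitrary example value.
SELECT cac.AggregationChildrenId
    , cac.Position
FROM dbo.AggregationChildren cac
WHERE cac.AggregationId = 1;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Without the index, this lookup has to scan the clustered primary key on AggregationChildrenId; with it, a seek on AggregationId should be enough, and the difference should show up in the logical reads.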

I created a minimal, complete example setup to use for testing:

USE tempdb;

IF OBJECT_ID(N'dbo.AggregationChildren', N'U') IS NOT NULL 
DROP TABLE dbo.AggregationChildren;
IF OBJECT_ID(N'dbo.Aggregation', N'U') IS NOT NULL 
DROP TABLE dbo.Aggregation;
IF OBJECT_ID(N'dbo.Code', N'U') IS NOT NULL 
DROP TABLE dbo.Code;
GO

CREATE TABLE dbo.Code (
    CodeId int NOT NULL
        CONSTRAINT PK_CODE 
        PRIMARY KEY 
        CLUSTERED
    , Serial nvarchar(20) NOT NULL
);


CREATE TABLE dbo.Aggregation (
    AggregationId int NOT NULL
        CONSTRAINT PK_AGGREGATIONS 
        PRIMARY KEY 
        CLUSTERED
        CONSTRAINT FK_Aggregation_Code
        FOREIGN KEY (AggregationId)
        REFERENCES dbo.Code (CodeId)
)

CREATE TABLE dbo.AggregationChildren (
    AggregationChildrenId int NOT NULL
        CONSTRAINT PK_AGGREGATION_CHILDS 
        PRIMARY KEY 
        CLUSTERED
        CONSTRAINT FK_AggregationChildren_Code
        FOREIGN KEY (AggregationChildrenId)
        REFERENCES dbo.Code (CodeId)
    , AggregationId int NOT NULL
        CONSTRAINT FK_AggregationChildren_Aggregation
        FOREIGN KEY (AggregationId)
        REFERENCES dbo.Aggregation (AggregationId) 
        ON DELETE CASCADE
    , Position int NOT NULL
)

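I reformatted the constraint clauses to make them easier for me to read; in essence, the code above is the same as the DDL in your question.

This populates the three tables with enough data for a meaningful comparison: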

;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) n(Val)
)
INSERT INTO dbo.Code (CodeId, Serial)
SELECT s1.Val 
        + (s2.Val * 10)
        + (s3.Val * 100)
        + (s4.Val * 1000)
        + (s5.Val * 10000)
    , CONVERT(bigint, CRYPT_GEN_RANDOM(8))
FROM src s1
    CROSS JOIN src s2
    CROSS JOIN src s3
    CROSS JOIN src s4
    CROSS JOIN src s5


;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) n(Val)
)
INSERT INTO dbo.Aggregation (AggregationId)
SELECT s1.Val 
    + (s2.Val * 10)
    + (s3.Val * 100)
FROM src s1
    CROSS JOIN src s2
    CROSS JOIN src s3;



;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) n(Val)
)
INSERT INTO dbo.AggregationChildren (AggregationChildrenId, AggregationId, Position)
SELECT s1.Val 
        + (s2.Val * 10)
        + (s3.Val * 100)
        + (s4.Val * 1000)
        + (s5.Val * 10000)
    , s1.Val 
        + (s2.Val * 10)
        + (s3.Val * 100)
    , s1.Val 
FROM src s1
    CROSS JOIN src s2
    CROSS JOIN src s3
    CROSS JOIN src s4
    CROSS JOIN src s5;

These are the row counts for each table:

╔════════╦═════════════╦═════════════════════╗
║  Code  ║ Aggregation ║ AggregationChildren ║
╠════════╬═════════════╬═════════════════════╣
║ 100000 ║        1000 ║              100000 ║
╚════════╩═════════════╩═════════════════════╝

My version of the query:

SELECT Parent = c.Serial
    , Children = (
        SELECT c.Serial 
            , cac.Position
        FROM dbo.Code cc
            INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
        WHERE cac.AggregationId = a.AggregationId
        FOR JSON PATH 
    )
FROM dbo.Code c
    INNER JOIN dbo.Aggregation a ON c.CodeId = a.AggregationId
FOR JSON PATH;

To compare the output of the two queries, I created two user-defined functions, like this:

CREATE FUNCTION dbo.fn_json_test_1()
RETURNS nvarchar(max)
AS
BEGIN
    RETURN (
        SELECT Parent = c.Serial
            , Children = (
                SELECT c.Serial 
                    , cac.Position
                FROM dbo.Code cc
                    INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
                WHERE cac.AggregationId = a.AggregationId
                FOR JSON PATH 
            )
        FROM dbo.Code c
            INNER JOIN dbo.Aggregation a ON c.CodeId = a.AggregationId
        FOR JSON PATH
    );
END;
GO


GO
CREATE FUNCTION dbo.fn_json_test_2()
RETURNS nvarchar(max)
AS
BEGIN
    RETURN (
        SELECT c.Serial as Parent,
            (Select co.Serial, agc.Position
              from AggregationChildren agc, Aggregation ag, Code co
              where agc.AggregationId = a.AggregationId 
              and co.CodeId = agc.AggregationChildrenId for json path) as children
        from Aggregation a, Code c
        where c.CodeId = a.AggregationId for json path
    );
END;
GO

Now I can compare the output of the two queries with:

DECLARE @res1 nvarchar(max) = dbo.fn_json_test_1();
DECLARE @res2 nvarchar(max) = dbo.fn_json_test_2();

SELECT CASE WHEN @res1 <> @res2 THEN 'mismatch' ELSE 'match' END;

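The result is a mismatch. The output of my query contains fewer child nodes than the output of yours. I'm going back to the drawing board, and will simplify the test bed to see where the difference lies.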
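One possible way to narrow down where the two outputs differ, shown only as a sketch: shred each JSON string with OPENJSON and count the children per parent. This assumes a database compatibility level of 130 or higher, and it relies on the key names the two functions emit (Children for fn_json_test_1 and children for fn_json_test_2, since JSON paths are case-sensitive). The column aliases are mine. It is most practical on a small data set.

```sql
DECLARE @res1 nvarchar(max) = dbo.fn_json_test_1();
DECLARE @res2 nvarchar(max) = dbo.fn_json_test_2();

-- One row per parent object in the outer array, with the size of its child array.
SELECT FunctionName = 'fn_json_test_1'
    , Parent = JSON_VALUE(p.[value], '$.Parent')
    , ChildCount = (SELECT COUNT(*) FROM OPENJSON(p.[value], '$.Children'))
FROM OPENJSON(@res1) p
UNION ALL
SELECT 'fn_json_test_2'
    , JSON_VALUE(p.[value], '$.Parent')
    , (SELECT COUNT(*) FROM OPENJSON(p.[value], '$.children'))
FROM OPENJSON(@res2) p;
```

If the child counts for the same parent differ between the two functions, the discrepancy is in the number of child rows rather than only in their values.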

The simplified test bed consists of 10 rows in the Code table, 2 rows in the Aggregation (parent) table, and 8 rows in the AggregationChildren (child) table:

;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) n(Val)
)
INSERT INTO dbo.Code (CodeId, Serial)
SELECT s1.Val 
    , CONVERT(bigint, CRYPT_GEN_RANDOM(8))
FROM src s1


;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1)) n(Val)
)
INSERT INTO dbo.Aggregation (AggregationId)
SELECT s1.Val 
FROM src s1;



;WITH src AS 
(
    SELECT n.Val
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7)) n(Val)
)
INSERT INTO dbo.AggregationChildren (AggregationChildrenId, AggregationId, Position)
SELECT s1.Val + 2
    , s1.Val % 2
    , s1.Val 
FROM src s1;

Row counts:

SELECT Code = (SELECT COUNT(1) FROM dbo.Code)
    , Aggregation = (SELECT COUNT(1) FROM dbo.Aggregation)
    , AggregationChildren = (SELECT COUNT(1) FROM dbo.AggregationChildren)

╔══════╦═════════════╦═════════════════════╗
║ Code ║ Aggregation ║ AggregationChildren ║
╠══════╬═════════════╬═════════════════════╣
║   10 ║           2 ║                   8 ║
╚══════╩═════════════╩═════════════════════╝

The expected output is a JSON array of two parent objects, each with an array of 4 children.

My results:

[
  {
    "Parent": "-5601362097731340301",
    "Children": [
      {
        "Serial": "-5601362097731340301",
        "Position": 0
      },
      {
        "Serial": "-5601362097731340301",
        "Position": 2
      },
      {
        "Serial": "-5601362097731340301",
        "Position": 4
      },
      {
        "Serial": "-5601362097731340301",
        "Position": 6
      }
    ]
  },
  {
    "Parent": "-8896860091721838065",
    "Children": [
      {
        "Serial": "-8896860091721838065",
        "Position": 1
      },
      {
        "Serial": "-8896860091721838065",
        "Position": 3
      },
      {
        "Serial": "-8896860091721838065",
        "Position": 5
      },
      {
        "Serial": "-8896860091721838065",
        "Position": 7
      }
    ]
  }
]

Your query:

[
  {
    "Parent": "-5601362097731340301",
    "children": [
      {
        "Serial": "5802227619253639548",
        "Position": 0
      },
      {
        "Serial": "5802227619253639548",
        "Position": 0
      },
      {
        "Serial": "4504664379821512162",
        "Position": 2
      },
      {
        "Serial": "4504664379821512162",
        "Position": 2
      },
      {
        "Serial": "6561435639659176802",
        "Position": 4
      },
      {
        "Serial": "6561435639659176802",
        "Position": 4
      },
      {
        "Serial": "-7417083263182709739",
        "Position": 6
      },
      {
        "Serial": "-7417083263182709739",
        "Position": 6
      }
    ]
  },
  {
    "Parent": "-8896860091721838065",
    "children": [
      {
        "Serial": "-7646118996434234523",
        "Position": 1
      },
      {
        "Serial": "-7646118996434234523",
        "Position": 1
      },
      {
        "Serial": "-6372739442099935942",
        "Position": 3
      },
      {
        "Serial": "-6372739442099935942",
        "Position": 3
      },
      {
        "Serial": "-882384147532911428",
        "Position": 5
      },
      {
        "Serial": "-882384147532911428",
        "Position": 5
      },
      {
        "Serial": "4293317573306886053",
        "Position": 7
      },
      {
        "Serial": "4293317573306886053",
        "Position": 7
      }
    ]
  }
]

Your query returns too many children; my query returns the expected number of children and the correct Position values, but incorrect Serial values.
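The extra children in your output appear to come from the Aggregation ag table in the inner FROM clause of your query: it is listed but never joined to anything, so every child row is repeated once for each row in Aggregation. A small sketch that makes this visible by counting the rows per parent with and without that table (the column aliases are mine):

```sql
-- Inner FROM clause as written in the question, including the unjoined "Aggregation ag".
SELECT agc.AggregationId
    , RowsWithStrayTable = COUNT(*)
FROM dbo.AggregationChildren agc, dbo.Aggregation ag, dbo.Code co
WHERE co.CodeId = agc.AggregationChildrenId
GROUP BY agc.AggregationId;

-- Same join with the unjoined table removed.
SELECT agc.AggregationId
    , RowsWithoutStrayTable = COUNT(*)
FROM dbo.AggregationChildren agc
    INNER JOIN dbo.Code co ON co.CodeId = agc.AggregationChildrenId
GROUP BY agc.AggregationId;
```

On the simplified test bed the first query should report 8 rows per parent and the second 4, which matches the two outputs above. On larger tables the multiplication grows with the size of Aggregation, which likely accounts for much of the slowness.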

The "error" in my query is in the inner query. The incorrect version is:

SELECT c.Serial 
    , cac.Position
FROM dbo.Code cc
    INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
WHERE cac.AggregationId = a.AggregationId
ORDER BY c.Serial
FOR JSON PATH

The correct version is:

SELECT cc.Serial --changed "c." to "cc."
    , cac.Position
FROM dbo.Code cc
    INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
WHERE cac.AggregationId = a.AggregationId
ORDER BY cc.CodeId --not a big deal, but different order for children in output
FOR JSON PATH

The corrected query now looks like this:

SELECT  Parent = c.Serial
    , Children = (
        SELECT cc.Serial 
            , cac.Position
        FROM dbo.Code cc
            INNER JOIN dbo.AggregationChildren cac ON cac.AggregationChildrenId = cc.CodeId
        WHERE cac.AggregationId = a.AggregationId
        ORDER BY cc.CodeId
        FOR JSON PATH 
    )
FROM dbo.Code c
    INNER JOIN dbo.Aggregation a ON c.CodeId = a.AggregationId
ORDER BY c.Serial
FOR JSON PATH;

And it returns the following result:

[
  {
    "Parent": "-195930341251513493",
    "Children": [
      {
        "Serial": "-6126601633786720400",
        "Position": 1
      },
      {
        "Serial": "5216562173012877678",
        "Position": 3
      },
      {
        "Serial": "-1992909345438478098",
        "Position": 5
      },
      {
        "Serial": "8329388691987940194",
        "Position": 7
      }
    ]
  },
  {
    "Parent": "8774608126018975726",
    "Children": [
      {
        "Serial": "-3380643917643646211",
        "Position": 0
      },
      {
        "Serial": "-2042609074595538493",
        "Position": 2
      },
      {
        "Serial": "7345460002653774160",
        "Position": 4
      },
      {
        "Serial": "-2126530822210070443",
        "Position": 6
      }
    ]
  }
]
