Selecting multiple sensor values in one query

m__*_*m__ 4 sql-server partitioning select sql-server-2012 group-by

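Background

I have several devices, each with several sensors. I record their readings from time to time and store them in the table described below. When someone requests a web page, I fetch several of these values (the most recently recorded ones) one at a time and display them to the user. This currently takes a long time because there are so many values to fetch: each fetch takes about 8 ms, which adds up to roughly 300 ms of extra page load time, and that is for a relatively good page.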

CREATE TABLE [dbo].[SensorValues](
  [DeviceId] [int] NOT NULL,
  [SensorId] [int] NOT NULL,
  [SensorValue] [int] NOT NULL,
  [Date] [int] NOT NULL, --- stored as unixtime
CONSTRAINT [PK_SensorValues] PRIMARY KEY CLUSTERED 
(
  [DeviceId] ASC,
  [SensorId] ASC,
  [Date] DESC
)
);

The table is partitioned by week on the Date column.
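For readers who want to reproduce the setup, weekly partitioning on a Unix-time `int` column could look roughly like the sketch below. Only the function name `SensorValues_Date_PF` comes from the queries in this post; the `RANGE RIGHT` choice, the boundary values, the scheme name `SensorValues_Date_PS`, and the filegroup are assumptions made for illustration, since the real definitions are not shown.

```sql
-- Sketch only: boundary values and RANGE RIGHT are assumed, not taken from the post.
CREATE PARTITION FUNCTION SensorValues_Date_PF (int)
AS RANGE RIGHT FOR VALUES
(
    1356912000, -- 2012-12-31 00:00:00 UTC (a Monday)
    1357516800, -- 2013-01-07 00:00:00 UTC
    1358121600  -- 2013-01-14 00:00:00 UTC
);

-- Hypothetical scheme name; every partition is mapped to PRIMARY for simplicity.
CREATE PARTITION SCHEME SensorValues_Date_PS
AS PARTITION SensorValues_Date_PF
ALL TO ([PRIMARY]);

-- With RANGE RIGHT, a value on or after the first boundary and before the
-- second one lands in partition 2.
SELECT $PARTITION.SensorValues_Date_PF(1357000000) AS partition_number;
```

Under these assumptions, the table and its clustered primary key would be created `ON SensorValues_Date_PS ([Date])`.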

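What I do now

So, what I do is the following. In each partition, I select the most recent row at or before the given date/time, and then I pick the most recent of those.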

SELECT TOP (1) ca.SensorValue, ca.Date
  FROM sys.partitions AS p
  CROSS APPLY
  (
  SELECT TOP (1) v.Date, v.SensorValue
    FROM SensorValues AS v
    WHERE $PARTITION.SensorValues_Date_PF(v.Date) = p.[partition_number]
    AND v.DeviceId = @fDeviceId
    AND v.SensorId = @fSensorId
    AND v.Date <= @fDate
    ORDER BY v.Date DESC
  ) AS ca
  WHERE p.[partition_number] <= $PARTITION.SensorValues_Date_PF(@fDate)
  AND p.[object_id] = OBJECT_ID(N'dbo.SensorValues', N'U')
  AND p.index_id = 1
  ORDER BY p.[partition_number] DESC, ca.Date DESC;

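What I want to do

I would like to select all the values in one query. For example, select the latest values for DeviceId = 1 and SensorId = 1, 2, 3, 4, 5. So far I have come up with the following, where I use the IN keyword to select values for multiple sensors. However, I still need to group the rows and pick out the one with the highest date for each sensor. I was thinking of adding a GROUP BY clause, but I don't know how to use it correctly (everything I have tried so far has failed).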

SELECT ca.SensorValue, ca.Date
  FROM sys.partitions AS p
  CROSS APPLY
  (
  SELECT TOP (1) v.Date, v.SensorValue
    FROM SensorValues AS v
    WHERE $PARTITION.SensorValues_Date_PF(v.Date) = p.[partition_number]
    AND v.DeviceId = @fDeviceId
    AND v.SensorId IN (@fSensorId1, @fSensorId2, @fSensorId3)
    AND v.Date <= @fDate
    ORDER BY v.Date DESC
  ) AS ca
  WHERE p.[partition_number] <= $PARTITION.SensorValues_Date_PF(@fDate)
  AND p.[object_id] = OBJECT_ID(N'dbo.SensorValues', N'U')
  AND p.index_id = 1
  ORDER BY p.[partition_number] DESC, ca.Date DESC;

Pau*_*ite 7

First, I notice that your "What I do now" query:

SELECT TOP (1)
    ca.SensorValue,
    ca.Date
FROM sys.partitions AS p
CROSS APPLY
(
    SELECT TOP (1)
        v.Date, 
        v.SensorValue
    FROM SensorValues AS v
    WHERE 
        $PARTITION.SensorValues_Date_PF(v.Date) = p.[partition_number]
        AND v.DeviceId = @fDeviceId
        AND v.SensorId = @fSensorId
        AND v.Date <= @fDate
    ORDER BY 
        v.Date DESC
) AS ca
WHERE 
    p.[partition_number] <= $PARTITION.SensorValues_Date_PF(@fDate)
    AND p.[object_id] = OBJECT_ID(N'dbo.SensorValues', N'U')
    AND p.index_id = 1
ORDER BY
    p.[partition_number] DESC, 
    ca.Date DESC;

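...produces an execution plan like this:

[Figure: original execution plan]

The estimated total cost of this execution plan is 0.02 units. More than 50% of that estimated cost is the final Sort, running in Top-N mode. Now, estimates are only estimates, but sorts are generally expensive, so let's remove it without changing the semantics: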

SELECT TOP (1)
    ca.SensorId,
    ca.SensorValue,
    ca.Date
FROM
(
    -- Partition numbers
    SELECT DISTINCT
        partition_number = prv.boundary_id
    FROM
        sys.partition_functions AS pf
    JOIN sys.partition_range_values AS prv ON
        prv.function_id = pf.function_id
    WHERE
        pf.name = N'SensorValues_Date_PF'
        AND prv.boundary_id <= $PARTITION.SensorValues_Date_PF(@fDate)
) AS p
CROSS APPLY
    (
    SELECT TOP (1)
        v.Date,
        v.SensorValue,
        v.SensorId
    FROM dbo.SensorValues AS v
    WHERE
        $PARTITION.SensorValues_Date_PF(v.Date) = p.partition_number
        AND v.DeviceId = @fDeviceId
        AND v.SensorId = @fSensorId
        AND v.Date <= @fDate
    ORDER BY
        v.Date DESC
  ) AS ca
ORDER BY
    p.partition_number DESC,
    ca.Date DESC

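Now the execution plan has no blocking operators, and in particular no sort. The estimated cost of the new query plan below is 0.01 units, with the total cost spread evenly across the data access methods:

[Figure: improved query plan]

With that improvement in place, all we need to do to produce a result for each sensor ID is make a list of the sensor IDs and APPLY the previous code to each one: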

SELECT
    PerSensor.SensorId,
    PerSensor.SensorValue,
    PerSensor.Date
FROM 
(
    -- Sensor ID list
    VALUES 
        (@fSensorId1),
        (@fSensorId2),
        (@fSensorId3)
) AS Sensor (Id)
CROSS APPLY
(
    -- Optimized code applied to each sensor
    SELECT TOP (1)
        ca.SensorId,
        ca.SensorValue,
        ca.Date
    FROM
    (
        -- Partition numbers
        SELECT DISTINCT
            partition_number = prv.boundary_id
        FROM
            sys.partition_functions AS pf
        JOIN sys.partition_range_values AS prv ON
            prv.function_id = pf.function_id
        WHERE
            pf.name = N'SensorValues_Date_PF'
            AND prv.boundary_id <= $PARTITION.SensorValues_Date_PF(@fDate)
    ) AS p
    CROSS APPLY
        (
        SELECT TOP (1)
            v.Date,
            v.SensorValue,
            v.SensorId
        FROM dbo.SensorValues AS v
        WHERE
            $PARTITION.SensorValues_Date_PF(v.Date) = p.partition_number
            AND v.DeviceId = @fDeviceId
            AND v.SensorId = Sensor.Id  -- replaces @fSensorId
            AND v.Date <= @fDate
        ORDER BY
            v.Date DESC
      ) AS ca
    ORDER BY
        p.partition_number DESC,
        ca.Date DESC
) AS PerSensor;

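The query plan is:

[Figure: final query plan]

The estimated query plan cost for three sensor IDs is 0.011, roughly half the cost of the original single-sensor plan.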
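Since estimated costs are only estimates, it can be worth measuring the real thing. The following is a minimal harness sketch for doing so: the variable names match the queries above, but the values assigned to them are made up for illustration, and the final query has to be pasted in where indicated.

```sql
-- Illustrative parameter values (assumptions, not taken from the question).
DECLARE
    @fDeviceId  int = 1,
    @fSensorId1 int = 1,
    @fSensorId2 int = 2,
    @fSensorId3 int = 3,
    -- Current UTC time as Unix seconds, matching the int [Date] column.
    @fDate      int = DATEDIFF(SECOND, '19700101', SYSUTCDATETIME());

-- Report logical reads and CPU/elapsed time for the statements that follow.
SET STATISTICS IO, TIME ON;

-- Paste the multi-sensor query here, in the same batch as the DECLARE.

SET STATISTICS IO, TIME OFF;
```

Comparing the reported elapsed time for the single combined query against the sum of the individual per-sensor fetches gives a more direct answer than plan costs alone.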