Get I/O statistics with a SQL query, like Activity Monitor

Question

Is it possible to get, as the result of a SQL query, the same values that Activity Monitor (SSMS) shows under "Database I/O"?

Answer 1

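You can use a query like this to calculate read/write rates and latency (though, due to laziness on my part, the units don't match the ones Activity Monitor uses).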

SELECT d.name, f.name, f.type_desc, f.physical_name,
 [read b/ms] = num_of_bytes_read * 1.0/sample_ms,
 [avg read latency] =
   (1.0*s.io_stall_read_ms / (COALESCE(NULLIF(s.num_of_reads,0),1))),
 [write b/ms] = num_of_bytes_written * 1.0/sample_ms,
 [avg write latency] = 
   (1.0*s.io_stall_write_ms / (COALESCE(NULLIF(s.num_of_writes,0),1)))
FROM sys.master_files AS f
INNER JOIN sys.databases AS d
ON f.database_id = d.database_id
INNER JOIN sys.dm_io_virtual_file_stats(default, default) AS s
ON d.database_id = s.database_id
AND f.[file_id] = s.[file_id];

For more information, see these resources:

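If you want to mimic the graphs and catch spikes in real time, you'll need to build scaffolding that collects snapshots of this (or some other) query and compares the deltas to produce the metrics. Writing all of that for you on a Q&A site isn't a trivial undertaking; people pay big money for quality monitoring tools that make this kind of thing easy. :-)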
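
To give a rough idea of what that scaffolding looks like, here is a minimal sketch that takes two snapshots of sys.dm_io_virtual_file_stats and reports per-file rates for the time in between. The temp table name #io_snap, the variable names, and the 5-second delay are my own placeholders, not anything Activity Monitor uses, and the sketch does not handle counters that reset between the two snapshots.

```sql
-- Sketch only: #io_snap and the 5-second delay are hypothetical choices.
IF OBJECT_ID('tempdb..#io_snap') IS NOT NULL
    DROP TABLE #io_snap;

-- First snapshot of the cumulative counters
DECLARE @t1 datetime = GETUTCDATE();

SELECT database_id, [file_id],
       num_of_reads, num_of_bytes_read, io_stall_read_ms,
       num_of_writes, num_of_bytes_written, io_stall_write_ms
INTO #io_snap
FROM sys.dm_io_virtual_file_stats(default, default);

WAITFOR DELAY '00:00:05';

-- Second snapshot is read directly from the DMV and compared to the first
DECLARE @t2 datetime = GETUTCDATE();
DECLARE @interval_sec numeric(20, 4) = DATEDIFF(millisecond, @t1, @t2) / 1000.0;

SELECT d.name AS database_name,
       f.physical_name,
       [MB/sec read] =
           (cur.num_of_bytes_read - prev.num_of_bytes_read) / 1048576.0
           / NULLIF(@interval_sec, 0),
       [MB/sec written] =
           (cur.num_of_bytes_written - prev.num_of_bytes_written) / 1048576.0
           / NULLIF(@interval_sec, 0),
       -- NULLIF turns "no I/O in the interval" into NULL instead of a divide-by-zero
       [avg read latency (ms)] =
           1.0 * (cur.io_stall_read_ms - prev.io_stall_read_ms)
           / NULLIF(cur.num_of_reads - prev.num_of_reads, 0),
       [avg write latency (ms)] =
           1.0 * (cur.io_stall_write_ms - prev.io_stall_write_ms)
           / NULLIF(cur.num_of_writes - prev.num_of_writes, 0)
FROM sys.dm_io_virtual_file_stats(default, default) AS cur
INNER JOIN #io_snap AS prev
    ON prev.database_id = cur.database_id
   AND prev.[file_id] = cur.[file_id]
INNER JOIN sys.master_files AS f
    ON f.database_id = cur.database_id
   AND f.[file_id] = cur.[file_id]
INNER JOIN sys.databases AS d
    ON d.database_id = cur.database_id;
```

Run on a schedule and stored somewhere, output like this is the raw material for a graph; a single run only tells you about the few seconds it covered.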

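If you want to see how Activity Monitor does this kind of thing, you can get the definition of the procedure #am_generate_waitstats with a little effort. I did the heavy lifting for you; here it is in all its unformatted glory. This is how it calculates the various wait statistics: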

CREATE PROC #am_generate_waitstats AS 
BEGIN

-- Setup query starts a tran -- make sure it wasn't orphaned
WHILE (@@TRANCOUNT > 0) COMMIT TRAN;

-- Get the most recent snapshot ID in the history table ('previous snapshot')
DECLARE @previous_snapshot_id bigint;
DECLARE @previous_collection_time datetime;
SELECT @previous_snapshot_id = ISNULL (MAX (snapshot_id), 0) FROM #am_wait_stats_snapshots;
SELECT TOP 1 @previous_collection_time = ISNULL (collection_time, GETUTCDATE()) FROM #am_wait_stats_snapshots 
WHERE snapshot_id = @previous_snapshot_id;

-- The snapshot we're about to capture ('current snapshot')
DECLARE @current_snapshot_id bigint;
DECLARE @current_collection_time datetime;
SET @current_collection_time = GETUTCDATE();
SET @current_snapshot_id = @previous_snapshot_id + 1;

-- The snapshots table holds the two most recent snapshots. Delete the older of the two to 
-- make room for a new snapshot. 
DELETE FROM #am_wait_stats_snapshots WHERE snapshot_id < @previous_snapshot_id;
DELETE FROM #am_resource_mon_snap WHERE current_snapshot_id < @previous_snapshot_id;

DECLARE @interval_sec numeric (20, 4);
SET @interval_sec = 
    CASE 
        -- Avoid int overflow that DATEDIFF (ms, ...) can cause in the event of a huge gap between intervals
        WHEN DATEDIFF (second, @previous_collection_time, @current_collection_time) >= 10 
            THEN DATEDIFF (second, @previous_collection_time, @current_collection_time)
        -- Avoid divide-by-zero
        WHEN DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) = 0.0 
            THEN 0.0001
        ELSE DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) / 1000.0
    END;

-- This query captures in-progress and completed (cumulative) wait time for each wait type
INSERT INTO #am_wait_stats_snapshots 
SELECT 
    @previous_snapshot_id + 1 AS snapshot_id, 
    @current_collection_time AS collection_time, 
    wait_type, 
    SUM (waiting_tasks_count) AS waiting_tasks_count, 
    SUM (signal_wait_time_ms) AS signal_wait_time_ms, 
    SUM (wait_time_ms) AS wait_time_ms, 
    SUM (raw_wait_time_ms) AS raw_wait_time_ms
FROM 
(
    -- global server wait stats (completed waits only)
    SELECT 
        wait_type, 
        waiting_tasks_count, 
        (wait_time_ms - signal_wait_time_ms) AS wait_time_ms,  
        signal_wait_time_ms, 
        wait_time_ms AS raw_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE waiting_tasks_count > 0 OR wait_time_ms > 0 OR signal_wait_time_ms > 0
    UNION ALL 
    -- threads in an in-progress wait (not yet completed waits)
    SELECT 
        wait_type, 
        1 AS waiting_tasks_count, 
        wait_duration_ms AS wait_time_ms, 
        0 AS signal_wait_time_ms, 
        wait_duration_ms AS raw_wait_time_ms
    FROM sys.dm_os_waiting_tasks
    -- Very brief waits quickly will roll into dm_os_wait_stats; we only need to 
    -- query dm_os_waiting_tasks to handle longer-lived waits. 
    WHERE wait_duration_ms > 1000
) AS merged_wait_stats
GROUP BY merged_wait_stats.wait_type;

/*
Now join the current snapshot to the prior snapshot to calculate the wait time for 
the just-completed time interval. 

The previous_snapshot derived table represents cumulative wait stats at the beginning 
of the just-completed time interval, while current_snapshot is wait stats at the end of 
the interval. By subtracting the start cumulative wait time from the end cumulative 
wait time, we can calculate the wait time that accumulated during this time interval. 

The query uses a CTE to expose the interval-specific stats that are the result of the 
comparison of the current snapshot to the previous snapshot. 

Notes on the calculation of the [weighted_average_wait_time_per_sec] column: 
-------------------------
We use a weighted average formula to provide the user with a more stable indicator of 
the recent waittime. This is primarily useful if the user has selected a rapid refresh 
rate.  With a sample interval of 1 second, for example, it could be hard to assess the 
overall bottleneck on the system if short-term fluxuations in waittime distribution 
were creating constant changes in the 'worst' wait category for the just-completed 1 
second sample interval. 

The goal of this is to make the average of (approximately) the past 30-60 seconds 
dominate the value.  The longer the sample interval, the greater the weight that is 
given to the just-completed sample interval. With a 1-second sample interval, the 
just-completed sample is given a weight of about 10%.  With a 5 second sample 
interval, it is given a weight of about 50%.  Sample intervals longer than 5 seconds 
quickly approach 100% (a 60 second interval is weighted as ~80%, and a 10 minute 
interval at ~95%).  

The formula for the weight W of the current interval is below (T is the duration of 
the just-completed time interval, in seconds): 

    W  =  (1 - (T / (T^1.4 + 0.1)))

You can see the curve of the weight function by running this TSQL: 
    DECLARE @interval_sec numeric(20, 2)
    DECLARE @current_interval_weight numeric (20, 2)
    SET @interval_sec = 1
    PRINT CONVERT (char(20), 'Interval (sec)') + 'Interval Weight'
    PRINT CONVERT (char(20), '==============') + '==============='
    WHILE (@interval_sec <= 6000)
    BEGIN
        SET @current_interval_weight = 1 - (@interval_sec / (POWER (@interval_sec, 1.4) + 0.1))
        PRINT CONVERT (char(20), @interval_sec) + CONVERT (char(20), @current_interval_weight) 
            + '  ' + REPLICATE ('-', @current_interval_weight * 100)
        SET @interval_sec = CASE WHEN @interval_sec = 1 THEN 5 ELSE @interval_sec + 5 END
    END

The wait time for the just-completed interval is combined with the previous weighted 
average in the following way, where P is the previous sample's weighted average wait 
time, C is the current interval's wait time, and W is @current_interval_weight (the % 
weight given to the current waittime, using the prior formula): 

    ((1-W) * P) + (W * C)

*/

DECLARE @current_interval_weight numeric (10,6);
-- SET @current_interval_weight = @interval_sec / 60;
SET @current_interval_weight = 1 - (@interval_sec / (POWER (@interval_sec, 1.4) + 0.1))
IF @current_interval_weight > 1 SET @current_interval_weight = 1;

WITH interval_waitstats AS 
(
    -- First get resource wait stats for this interval. 
    SELECT 
        #am_wait_types.category_name, 
        current_snapshot.wait_type, 
        -- All wait stats will be reset to zero by a service cycle, which will cause 
        -- (snapshot2waittime-snapshot1waittime) calculations to produce an incorrect 
        -- negative wait time for the interval.  Detect this and avoid calculating 
        -- negative wait time/wait count/signal time deltas. 
        CASE 
            WHEN (current_snapshot.waiting_tasks_count - previous_snapshot.waiting_tasks_count) < 0 THEN current_snapshot.waiting_tasks_count 
            ELSE (current_snapshot.waiting_tasks_count - previous_snapshot.waiting_tasks_count) 
        END AS interval_waiting_tasks_count, 
        -- Use [raw_wait_time_ms] for this check because [wait_time_ms] is a calculated 
        -- value and tiny differences in sample time for signal wait time vs. wait time 
        -- can cause its calculated value to be slightly negative even when stats weren't 
        -- reset. 
        CASE 
            WHEN (current_snapshot.raw_wait_time_ms - previous_snapshot.raw_wait_time_ms) < 0 THEN current_snapshot.wait_time_ms
            ELSE (current_snapshot.wait_time_ms - previous_snapshot.wait_time_ms)
        END AS interval_resource_wait_time, 
        CASE 
            WHEN (current_snapshot.signal_wait_time_ms - previous_snapshot.signal_wait_time_ms) < 0 THEN current_snapshot.signal_wait_time_ms 
            ELSE (current_snapshot.signal_wait_time_ms - previous_snapshot.signal_wait_time_ms) 
        END AS interval_resource_signal_time, 
        current_snapshot.wait_time_ms AS resource_wait_time_cumulative 
    FROM (
            SELECT * FROM #am_wait_stats_snapshots WHERE snapshot_id = @previous_snapshot_id
        ) AS previous_snapshot
    INNER JOIN (
            SELECT * FROM #am_wait_stats_snapshots WHERE snapshot_id = @current_snapshot_id
        ) AS current_snapshot
        ON previous_snapshot.wait_type = current_snapshot.wait_type 
    INNER JOIN #am_wait_types ON #am_wait_types.wait_type = current_snapshot.wait_type
    WHERE #am_wait_types.ignore != 1 AND #am_wait_types.category_name != 'Idle'
)
INSERT INTO #am_resource_mon_snap
SELECT 
    @previous_snapshot_id AS previous_snapshot_id, 
    @current_snapshot_id AS current_snapshot_id, 
    @previous_collection_time AS previous_collection_time, 
    @current_collection_time AS current_collection_time, 
    @interval_sec AS interval_sec, 
    interval_waitstats.category_name, 
    interval_waitstats.wait_type, 
    interval_waitstats.interval_waiting_tasks_count, 
    -- Tiny differences in sample time for signal wait time vs. wait time can cause our 
    -- calculated wait time to be slightly negative
    CASE 
        WHEN interval_waitstats.interval_resource_wait_time < 0 THEN 0
        ELSE interval_waitstats.interval_resource_wait_time
    END AS interval_resource_wait_time, 
    interval_waitstats.interval_resource_signal_time, 
    CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec) AS interval_wait_time_per_sec, 
    CONVERT (numeric (10, 2), interval_waitstats.interval_resource_wait_time / @interval_sec / 1000) AS interval_avg_waiter_count, 
    interval_waitstats.resource_wait_time_cumulative, 
    CASE 
        WHEN previous_interval_waitstats.weighted_average_wait_time_per_sec IS NULL 
        THEN CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec)
        ELSE
            -- weighted average formula -- see comment above for explanation
            ((1-@current_interval_weight) * previous_interval_waitstats.weighted_average_wait_time_per_sec) 
                + (@current_interval_weight * CONVERT (bigint, interval_waitstats.interval_resource_wait_time / @interval_sec))
    END AS weighted_average_wait_time_per_sec
FROM interval_waitstats 
LEFT OUTER JOIN (
    SELECT * FROM #am_resource_mon_snap 
    WHERE current_snapshot_id = @previous_snapshot_id
) AS previous_interval_waitstats ON interval_waitstats.wait_type = previous_interval_waitstats.wait_type;

-- We can delete the data for the previous snapshot now
DELETE FROM #am_resource_mon_snap WHERE current_snapshot_id < @current_snapshot_id;
END

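The weighted-average comment in that procedure is easier to follow with numbers plugged in. The query below is just an illustration I put together, not part of Activity Monitor: the values 200 and 1000 for P and C are made up, and the four interval lengths are arbitrary picks. It applies the same two formulas from the comment, W = 1 - (T / (T^1.4 + 0.1)) and ((1-W) * P) + (W * C).

```sql
-- Hypothetical sample values; not taken from a real server.
DECLARE @previous_avg float = 200;   -- P: previous weighted average wait time per second
DECLARE @current_rate float = 1000;  -- C: wait time per second in the just-completed interval

SELECT
    t.interval_sec,
    w.weight AS current_interval_weight,
    ((1 - w.weight) * @previous_avg) + (w.weight * @current_rate) AS weighted_average
FROM (VALUES (1.0), (5.0), (60.0), (600.0)) AS t (interval_sec)
CROSS APPLY (
    -- Same weight formula as in the procedure, computed in float
    SELECT 1 - (t.interval_sec / (POWER(CAST(t.interval_sec AS float), 1.4) + 0.1)) AS weight
) AS w
ORDER BY t.interval_sec;
```

At a 1-second interval the weight works out to roughly 0.09, so a sudden jump in the current sample barely moves the displayed value; at longer intervals the current sample dominates, which matches what the comment describes.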

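For file I/O specifically, it isn't quite as simple or tidy. With a trace you can discover these code snippets, but I haven't found what ties them together, or even where the #temp tables are defined. I suspect some of the math and other work is done in the application and/or in other queries that I didn't capture in my trace. Needless to say, it's more than a single simple SQL query.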
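
If you want to run the traced batches yourself, you need the two temp tables to exist first. Since the trace did not capture their definitions, the ones below are guesses, inferred only from the column order and variable types in the INSERT statements of the batches that follow. The data types in particular are assumptions and may not match what Activity Monitor really creates.

```sql
-- Hypothetical definitions, inferred from the INSERT statements in the traced batches.
CREATE TABLE #am_dbfileio (
    collection_time datetime,
    total_io_bytes  numeric(28, 1)   -- despite the name, the batch stores a value in MB here
);

CREATE TABLE #am_dbfilestats (
    collection_time            datetime,
    [Database]                 sysname,
    [File]                     nvarchar(260),
    [Total MB Read]            numeric(28, 1),
    [Total MB Written]         numeric(28, 1),
    [Total I/O Count]          bigint,
    [Total I/O Wait Time (ms)] bigint,
    [Size (MB)]                bigint
);
```

Create them in the same session as the batches, since #temp tables are scoped to the session. The first execution of each batch has no previous snapshot to compare against, so expect meaningful numbers only from the second execution onward.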

Batch #1:

SET NOCOUNT ON;

DECLARE @previous_collection_time datetime;
DECLARE @previous_total_io_mb numeric (28, 1);
DECLARE @current_collection_time datetime;
DECLARE @current_total_io_mb numeric (28, 1);
DECLARE @mb_per_sec numeric (20, 1);

-- Get the previous snapshot's total I/O
SELECT TOP 1 @previous_collection_time = collection_time, @previous_total_io_mb = total_io_bytes 
FROM #am_dbfileio
ORDER BY collection_time DESC;

-- Get the current total I/O.  
SET @current_collection_time = GETDATE();
SELECT @current_total_io_mb = SUM(num_of_bytes_read + num_of_bytes_written) / 1024.0 / 1024.0
FROM sys.dm_io_virtual_file_stats(default, default);

-- Calc the total I/O rate (MB/sec) for the just-completed time interval. 
-- Round values larger than 2MB/sec to the nearest MB.
SET @mb_per_sec = (@current_total_io_mb - @previous_total_io_mb) / DATEDIFF (millisecond, @previous_collection_time, @current_collection_time) * 1000;
IF @mb_per_sec > 2 
BEGIN
  SET @mb_per_sec = ROUND (@mb_per_sec, 0);
END; 

-- Save off current total I/O
INSERT INTO #am_dbfileio (collection_time, total_io_bytes) 
VALUES (@current_collection_time, @current_total_io_mb);

-- Return the I/O rate for the just-completed time interval. 
SELECT ISNULL (@mb_per_sec, 0) AS mb_per_sec;

-- Get rid of all but the most recent snapshot's data
DELETE FROM #am_dbfileio WHERE collection_time < @current_collection_time;

Batch #2:

DECLARE @current_collection_time datetime;
SET @current_collection_time = GETDATE();

-- Grab a snapshot
INSERT INTO #am_dbfilestats
SELECT 
    @current_collection_time AS collection_time, 
    d.name AS [Database], 
    f.physical_name AS [File], 
    (fs.num_of_bytes_read / 1024.0 / 1024.0) [Total MB Read], 
    (fs.num_of_bytes_written / 1024.0 / 1024.0) AS [Total MB Written], 
    (fs.num_of_reads + fs.num_of_writes) AS [Total I/O Count], 
    fs.io_stall AS [Total I/O Wait Time (ms)], 
    fs.size_on_disk_bytes / 1024 / 1024 AS [Size (MB)]
FROM sys.dm_io_virtual_file_stats(default, default) AS fs
INNER JOIN sys.master_files f ON fs.database_id = f.database_id AND fs.file_id = f.file_id
INNER JOIN sys.databases d ON d.database_id = fs.database_id; 

-- Get the timestamp of the previous collection time
DECLARE @previous_collection_time datetime;
SELECT TOP 1 @previous_collection_time = collection_time 
FROM #am_dbfilestats 
WHERE collection_time < @current_collection_time
ORDER BY collection_time DESC;

DECLARE @interval_ms int;
SET @interval_ms = DATEDIFF (millisecond, @previous_collection_time, @current_collection_time); 

-- Return the diff of this snapshot and last
SELECT 
    cur.[Database], 
    cur.[File] AS [File Name], 
    CONVERT (numeric(28,1), (cur.[Total MB Read] - prev.[Total MB Read]) * 1000 / @interval_ms) AS [MB/sec Read], 
    CONVERT (numeric(28,1), (cur.[Total MB Written] - prev.[Total MB Written]) * 1000 / @interval_ms) AS [MB/sec Written], 
    -- protect from div-by-zero
    CASE 
        WHEN (cur.[Total I/O Count] - prev.[Total I/O Count]) = 0 THEN 0
        ELSE
            (cur.[Total I/O Wait Time (ms)] - prev.[Total I/O Wait Time (ms)]) 
                / (cur.[Total I/O Count] - prev.[Total I/O Count])
    END AS [Response Time (ms)]
FROM #am_dbfilestats AS cur
INNER JOIN #am_dbfilestats AS prev ON prev.[Database] = cur.[Database] AND prev.[File] = cur.[File]
WHERE cur.collection_time = @current_collection_time 
    AND prev.collection_time = @previous_collection_time;

-- Delete the older snapshot
DELETE FROM #am_dbfilestats
WHERE collection_time != @current_collection_time;

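So, if you want to take it from there, that could be a good starting point. Personally, since I know there are already plenty of tools that do what I want, and my time is better spent elsewhere, I'm not looking to reinvent the wheel.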

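If you want to find spikes, you need to calculate the deltas yourself, for example by taking two snapshots close together and comparing them. That's how Activity Monitor works.
@Julien Yes, as I illustrated above, these numbers aren't calculated by some simple SQL query. You can spend as much time as you like [reinventing the wheel](http://thebertrandfamily.com/2012/04/20/re-blog-the-cost-of-reinventing-the-wheel/), but I'm fairly confident nobody here is going to do all of that work for you for a few Internet points; it would be a sizable investment. If you want to write your own queries to do this, I suggest reading all of the resources I provided and working through it on your own time.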

Asked:	10 years, 6 months ago
Viewed:	4,342 times
Last active:	10 years, 6 months ago