Ways to find when the system health file rolled over

Beg*_*DBA 4 sql-server extended-events sql-server-2016

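Is there a way to find out when the system health extended events files roll over, rather than monitoring them manually?

On my moderately loaded servers, the files stick around for up to 2-3 days. On the heavily loaded servers, though, they roll over every 15 minutes or so, with no fixed pattern or schedule. We know the cause and are working on filtering out unwanted events, or events that are being reported as problems.

I'm curious whether there is a way to query when the file rollover happened. I looked through the MS docs but couldn't find this information.

Please advise whether this is possible, and if so, how.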

Jos*_*ell 6

You can get all of the system_health event files, along with the oldest event in each, like this:

DECLARE @file_name AS nvarchar(max);
DECLARE @file_path AS nvarchar(max);

SELECT 
    @file_name = 
        CAST(st.target_data AS xml).value(
            N'(EventFileTarget/File/@name)[1]', N'nvarchar(max)')
FROM sys.dm_xe_sessions s
    INNER JOIN sys.dm_xe_session_targets st
        ON s.[address] = st.event_session_address
WHERE 
    st.target_name = 'event_file'
    AND s.[name] = 'system_health';


SELECT @file_path = LEFT(
    @file_name,
    LEN(@file_name) - CHARINDEX('\', REVERSE(@file_name)) + 1);

SELECT
    files.[file_name],
    MIN(CAST(files.event_data AS XML).value(N'(event/@timestamp)[1]', N'datetime')) AS oldest_event
FROM sys.fn_xe_file_target_read_file
(
    @file_path + 'system_health*',
    null, null, null
) files
GROUP BY files.[file_name]
OPTION(NO_PERFORMANCE_SPOOL, QUERYTRACEON 8649);

[Screenshot of the SSMS results]

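Note: on versions of SQL Server that don't support the NO_PERFORMANCE_SPOOL query hint (prior to SQL Server 2016), you can replace it with QUERYTRACEON 8690 (see "Spool operator and trace flag 8690" for details).

Hat tip to Erik Darling for suggesting the query hints, which sped things up significantly in my testing.

The date/times returned by that query are in UTC. You can use something like this to convert to the server's local time: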

SELECT
    files.[file_name],
    MIN(CAST(files.event_data AS XML).value(N'(event/@timestamp)[1]', N'datetime')) AS oldest_event_utc,
    SWITCHOFFSET
    (
        MIN(CAST(files.event_data AS XML).value(N'(event/@timestamp)[1]', N'datetimeoffset')), 
        DATENAME(TzOffset, SYSDATETIMEOFFSET())
    ) AS oldest_event
-- The FROM, GROUP BY, and OPTION clauses are the same as in the first query.

[Screenshot: output of the code above]
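To see the conversion in isolation, here is a minimal sketch that applies the same SWITCHOFFSET expression to a single literal timestamp. The variable @event_utc and its sample value are made up for illustration, and 'Pacific Standard Time' is only an example zone name. Be aware that DATENAME(TzOffset, SYSDATETIMEOFFSET()) returns the server's current offset, so events recorded on the other side of a daylight saving change get shifted by the wrong amount. AT TIME ZONE (SQL Server 2016 and later) applies the zone's rules for the date in question instead.

```sql
-- Hypothetical sample value; real values come from the event's @timestamp attribute.
DECLARE @event_utc datetimeoffset(3) = '2019-06-21T12:34:56.789Z';

SELECT
    @event_utc AS event_utc,
    -- Shift to the server's *current* UTC offset (same expression as in the query above).
    SWITCHOFFSET(@event_utc, DATENAME(TzOffset, SYSDATETIMEOFFSET())) AS event_server_local,
    -- DST-aware alternative; the zone name is an example, substitute your own.
    @event_utc AT TIME ZONE 'Pacific Standard Time' AS event_pacific;
```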


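So one way to accomplish your goal would be to run that query as a scheduled Agent job and log the results to a table. You would then be able to see when the "oldest event" for each file changes (which is when the file rolled over).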
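As a rough sketch of that idea, the batch below could serve as the Agent job step. The table dbo.system_health_file_log and its columns are hypothetical names chosen for this example, not anything SQL Server provides. Each run records any file it has not seen before, so the history survives after old files are deleted. The final query treats each file's oldest event as the approximate rollover time and uses LAG (SQL Server 2012 and later) to show the gap between consecutive files.

```sql
-- Hypothetical logging table; the name and columns are assumptions for this sketch.
IF OBJECT_ID(N'dbo.system_health_file_log', N'U') IS NULL
    CREATE TABLE dbo.system_health_file_log
    (
        [file_name]      nvarchar(260) NOT NULL PRIMARY KEY,
        oldest_event_utc datetime      NULL,
        first_seen_utc   datetime2(0)  NOT NULL
    );

DECLARE @file_name nvarchar(max), @file_path nvarchar(max);

-- Current system_health target file, as in the query above.
SELECT @file_name = CAST(st.target_data AS xml).value(
        N'(EventFileTarget/File/@name)[1]', N'nvarchar(max)')
FROM sys.dm_xe_sessions AS s
    INNER JOIN sys.dm_xe_session_targets AS st
        ON s.[address] = st.event_session_address
WHERE st.target_name = 'event_file'
    AND s.[name] = 'system_health';

-- Keep everything up to and including the last backslash.
SELECT @file_path = LEFT(
    @file_name,
    LEN(@file_name) - CHARINDEX('\', REVERSE(@file_name)) + 1);

-- Log only files that are not in the table yet.
INSERT INTO dbo.system_health_file_log ([file_name], oldest_event_utc, first_seen_utc)
SELECT f.[file_name], f.oldest_event_utc, SYSUTCDATETIME()
FROM
(
    SELECT
        files.[file_name],
        MIN(CAST(files.event_data AS xml).value(
            N'(event/@timestamp)[1]', N'datetime')) AS oldest_event_utc
    FROM sys.fn_xe_file_target_read_file(
        @file_path + 'system_health*', NULL, NULL, NULL) AS files
    GROUP BY files.[file_name]
) AS f
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.system_health_file_log AS l
    WHERE l.[file_name] = f.[file_name]
);

-- Rollover history: one row per file, with minutes since the previous file started.
SELECT
    [file_name],
    oldest_event_utc AS rolled_over_at_utc,
    DATEDIFF(MINUTE,
        LAG(oldest_event_utc) OVER (ORDER BY oldest_event_utc),
        oldest_event_utc) AS minutes_since_previous_file
FROM dbo.system_health_file_log
ORDER BY oldest_event_utc;
```

How often to schedule the job is a judgment call. It needs to run more often than the files are deleted, otherwise a file can come and go between two runs without being logged.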

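Keep in mind that files can roll over for a number of different reasons.

If performance is a concern and you're comfortable with PowerShell, you would probably be much better off using the approach Dan Guzman provides here.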


Dan*_*man 6

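Below is a PowerShell example that uses the new Microsoft SqlServer.XEvent PowerShell module to summarize events from the local system_health target files by name. I've found that processing large numbers of events with .NET/PowerShell is much faster than parsing XML in T-SQL. You can schedule it as a SQL Agent job to identify what is driving the event activity and take corrective action if needed.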

# Install the SqlServer.XEvent module from an admin PowerShell session before running this script:
# Install-Module -Name SqlServer.XEvent

# get list of system_health trace files
Function Get-XeFiles() {
    $connectionString = "Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI";
    $connection = New-Object System.Data.SqlClient.SqlConnection($connectionString);
    $connection.Open();
$query = @"
WITH
      --get full path to current system_health trace file
      CurrentSystemHealthTraceFile AS (
        SELECT CAST(target_data AS xml).value('(/EventFileTarget/File/@name)[1]', 'varchar(255)') AS FileName
        FROM sys.dm_xe_session_targets
        WHERE
            target_name = 'event_file'
            AND CAST(target_data AS xml).value('(/EventFileTarget/File/@name)[1]', 'varchar(255)') LIKE '%\system[_]health%'
    )
      --get system_health trace folder 
    , TraceDirectory AS (
        SELECT 
            REVERSE(SUBSTRING(REVERSE(FileName), CHARINDEX(N'\', REVERSE(FileName)), 255)) AS TraceDirectoryPath
        FROM CurrentSystemHealthTraceFile
        )
SELECT TraceDirectoryPath
FROM TraceDirectory;
"@

    $command = New-Object System.Data.SqlClient.SqlCommand($query, $connection)
    $traceFileDirectory = $command.ExecuteScalar()
    $connection.Close()

    $xe_files = Get-Item "$($traceFileDirectory)system_health_*.xel"
    return $xe_files

}

try {

    $xe_files = Get-XeFiles
    foreach($xe_file in $xe_files) {
        try{
            # summary of events by event_name for each file
            $events = Read-SqlXEvent -FileName $xe_file.FullName
            Write-Host "Summary for file $($xe_file.FullName)"
            $events | Group-Object -Property Name -NoElement | Format-Table -AutoSize
        }
        catch {
            if(($_.Exception.GetType().Name -eq "AggregateException") -and ($_.Exception.InnerException -ne $null) -and ($_.Exception.InnerException.GetType().Name -eq "IOException")) {
                # ignore error due to active trace file
                Write-Host "$($_.Exception.InnerException.Message)"
            }
            else {
                # rethrow other errors
                throw
            }
        }
    }

}
catch {
    throw
}

Sample output:

Summary for file D:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Log\system_health_0_132057213063750000.xel

Count Name                                                
----- ----                                                
  860 sp_server_diagnostics_component_result              
 1072 scheduler_monitor_system_health_ring_buffer_recorded
    2 connectivity_ring_buffer_recorded                   
    1 security_error_ring_buffer_recorded         

Summary for file D:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Log\system_health_0_132057856050380000.xel

Count Name                                                
----- ----                                                
 1312 sp_server_diagnostics_component_result              
 1644 scheduler_monitor_system_health_ring_buffer_recorded
   28 scheduler_monitor_non_yielding_ring_buffer_recorded 
    4 connectivity_ring_buffer_recorded                   
    2 error_reported                                      
    2 wait_info                                           
    6 security_error_ring_buffer_recorded     
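For comparison, a similar per-file summary can be sketched in T-SQL without shredding any XML, because sys.fn_xe_file_target_read_file exposes the event name in its object_name column. This is an illustrative alternative rather than part of the script above, and it has not been benchmarked against the PowerShell version. The aliases event_name and event_count are made up for the example.

```sql
DECLARE @file_name nvarchar(max), @file_path nvarchar(max);

-- Full path of the current system_health target file.
SELECT @file_name = CAST(st.target_data AS xml).value(
        N'(EventFileTarget/File/@name)[1]', N'nvarchar(max)')
FROM sys.dm_xe_sessions AS s
    INNER JOIN sys.dm_xe_session_targets AS st
        ON s.[address] = st.event_session_address
WHERE st.target_name = 'event_file'
    AND s.[name] = 'system_health';

-- Folder portion, including the trailing backslash.
SELECT @file_path = LEFT(
    @file_name,
    LEN(@file_name) - CHARINDEX('\', REVERSE(@file_name)) + 1);

-- Count events by name for each file; event_data is never parsed.
SELECT
    files.[file_name],
    files.[object_name] AS event_name,
    COUNT(*) AS event_count
FROM sys.fn_xe_file_target_read_file(
    @file_path + 'system_health*', NULL, NULL, NULL) AS files
GROUP BY files.[file_name], files.[object_name]
ORDER BY files.[file_name], event_count DESC;
```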

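EDIT:

This can also be done remotely, and against multiple servers, with a single script, as long as the trace folder is remotely accessible through a share. The example below runs against each server in a list and uses UNC paths to access the trace files. Because this version uses the drive-letter administrative shares, it must run under a Windows account that has Windows administrator permissions on the remote boxes. A less-privileged account can be used if you create a share on each server and use that share name instead.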

# get list of system_health trace files with admin share UNC path
Function Get-XeFiles($serverName) {
    $connectionString = "Data Source=$serverName;Initial Catalog=tempdb;Integrated Security=SSPI";
    $connection = New-Object System.Data.SqlClient.SqlConnection($connectionString);
    $connection.Open();
$query = @"
WITH
      --get full path to current system_health trace file
      CurrentSystemHealthTraceFile AS (
        SELECT CAST(target_data AS xml).value('(/EventFileTarget/File/@name)[1]', 'varchar(255)') AS FileName
        FROM sys.dm_xe_session_targets
        WHERE
            target_name = 'event_file'
            AND CAST(target_data AS xml).value('(/EventFileTarget/File/@name)[1]', 'varchar(255)') LIKE '%\system[_]health%'
    )
      --get system_health trace folder 
    , TraceDirectory AS (
        SELECT 
            REVERSE(SUBSTRING(REVERSE(FileName), CHARINDEX(N'\', REVERSE(FileName)), 255)) AS TraceDirectoryPath
        FROM CurrentSystemHealthTraceFile
        )
SELECT TraceDirectoryPath
FROM TraceDirectory;
"@

    $command = New-Object System.Data.SqlClient.SqlCommand($query, $connection)
    $traceFileDirectory = $command.ExecuteScalar()
    # change drive letter to admin share UNC path (e.g. "D:\" to "\\servername\d$")
    $traceFileDirectory = "\\$serverName\$($traceFileDirectory.Replace(":", "$"))"
    $connection.Close()

    $xe_files = Get-Item "$($traceFileDirectory)system_health_*.xel"
    return $xe_files

}

# specify list of servers here
$serverList = @(
     "YourServer1"
    ,"YourServer2"
    ,"YourServer3"
)

try {
    foreach($server in $serverList) {

        $xe_files = Get-XeFiles -serverName $server
        foreach($xe_file in $xe_files) {
            try{
                # summary of events by event_name for each file
                $events = Read-SqlXEvent -FileName $xe_file.FullName
                Write-Host "Summary for file $($xe_file.FullName)"
                $events | Group-Object -Property Name -NoElement | Format-Table -AutoSize
            }
            catch {
                if(($_.Exception.GetType().Name -eq "AggregateException") -and ($_.Exception.InnerException -ne $null) -and ($_.Exception.InnerException.GetType().Name -eq "IOException")) {
                    # ignore error due to active trace file
                    Write-Host "$($_.Exception.InnerException.Message)"
                }
                else {
                    # rethrow other errors
                    throw
                }
            }
        }
    }

}
catch {
    throw
}
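The drive-letter-to-admin-share step in the script is just a string replacement. If it is handier to have the server hand back the UNC path directly, the same transformation can be sketched in T-SQL. The variable @trace_dir and its sample value are hypothetical, and using SERVERPROPERTY('MachineName') as the host name is an assumption that may not suit every setup, since the script above uses the name from the server list instead.

```sql
-- Hypothetical local folder, in the form returned by the TraceDirectory CTE above.
DECLARE @trace_dir nvarchar(260) =
    N'D:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\Log\';

-- "D:\..." becomes "\\<machine name>\D$\..."
SELECT
    N'\\' + CAST(SERVERPROPERTY('MachineName') AS nvarchar(128))
          + N'\' + REPLACE(@trace_dir, N':', N'$') AS trace_dir_unc;
```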