How can I automatically stagger transaction log shipping to minimize bandwidth usage peaks?

Lar*_*man 6 sql-server-2008 sql-server backup

The following question concerns Microsoft SQL Server transaction log shipping (TLS).

We are using SQL Server 2008 R2 SP1, although the question probably applies to all recent versions.

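Background

I have a primary data center (A) and a secondary disaster recovery data center (B). Suppose I have 64 databases of varying sizes to log ship, none of them larger than 50 GB.

I'm currently using the default SQL Agent jobs that SQL Server creates automatically, so I have 64 SQL Agent jobs.

I'd like to log ship each database every 15 minutes. Assume that this would be feasible if I simply transferred all of the files back to back.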

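Problem

Assume the path from A to B goes over the public Internet, so I pay for the bandwidth used to transfer these logs. I'm metered at the 95th percentile. I'd like to smooth or otherwise optimize the bandwidth usage to minimize the chance of paying overage charges. I am using backup compression.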

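Possible solution

My current idea is to write a script that automatically customizes the start time of each backup job. The script could run as often as needed to keep the transfers optimally spread out. Running it periodically would also let the log shipping backup jobs for newly added databases be optimized without manual intervention.

The current SQL Agent jobs all start with the string "LSBackup", so I can list their schedules with: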

SELECT * FROM msdb..sysschedules WHERE name LIKE 'LSBackup%'

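I can fetch the list of all the SQL Agent job schedules, store it in a table variable, and then iterate over it in a WHILE loop, calling EXEC msdb.dbo.sp_update_schedule to update each schedule's @active_start_time so that the jobs are spaced out appropriately.

What value should I use for each job's start time?

To answer that, I need to know how large each log backup file is likely to be so that I can distribute the jobs optimally. Can you predict the size of a transaction log backup before running the backup? Alternatively, I could look at the log backups from the past few days on the local file system to determine a relative weight for each database. However, that doesn't really work for brand-new databases that have little or no transaction log backup history.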
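As an alternative to reading the file system, the backup history that SQL Server records in msdb can supply the same relative weights. The query below is a minimal sketch, assuming the history in msdb.dbo.backupset has not been purged and that a seven-day look-back (an arbitrary choice) is representative. It has the same limitation for brand-new databases, which simply return no rows.

```sql
-- Average and largest compressed log backup per database over the last 7 days.
-- type = 'L' marks transaction log backups; compressed_backup_size is the
-- number of bytes actually written when backup compression is in use.
SELECT
    bs.database_name,
    COUNT(*)                       AS log_backups,
    AVG(bs.compressed_backup_size) AS avg_compressed_bytes,
    MAX(bs.compressed_backup_size) AS max_compressed_bytes
FROM msdb.dbo.backupset AS bs
WHERE bs.type = 'L'
  AND bs.backup_finish_date >= DATEADD(DAY, -7, GETDATE())
GROUP BY bs.database_name
ORDER BY avg_compressed_bytes DESC;
```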

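Assuming the relative size of each backup can be determined, what is the best algorithm for spacing them out to minimize the bandwidth impact when metered at the 95th percentile?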
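One simple heuristic, offered as a sketch rather than as the optimal answer, is to give each database a start offset proportional to the combined size of the databases scheduled before it, so that the bytes are spread roughly evenly across the window. This assumes transfer time is roughly proportional to backup size and that the transfer starts shortly after the backup does. The table variable @weights, its sample rows, and the alphabetical ordering are all hypothetical.

```sql
-- Hypothetical weights: average compressed log backup size per database, in bytes.
DECLARE @weights TABLE (database_name sysname PRIMARY KEY, avg_bytes bigint NOT NULL);
INSERT INTO @weights (database_name, avg_bytes)
VALUES ('db_a', 400), ('db_b', 300), ('db_c', 200), ('db_d', 100);

DECLARE @windowSeconds int = 900;   -- 15-minute log shipping interval
DECLARE @totalBytes bigint;
SELECT @totalBytes = SUM(avg_bytes) FROM @weights;

-- Each database starts after the share of the window "used" by the databases
-- ordered before it. A correlated subquery is used because SQL Server 2008 R2
-- does not support SUM() OVER (ORDER BY ...). NULLIF guards against a zero total.
SELECT
    w.database_name,
    w.avg_bytes,
    @windowSeconds
        * ISNULL((SELECT SUM(p.avg_bytes)
                  FROM @weights AS p
                  WHERE p.database_name < w.database_name), 0)
        / NULLIF(@totalBytes, 0) AS start_offset_seconds
FROM @weights AS w
ORDER BY w.database_name;
```

With the sample rows, the offsets come out as 0, 360, 630, and 810 seconds: the largest database gets the longest gap before the next job starts.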

Lar*_*man 2

I ended up writing a stored procedure that runs nightly via SQL Agent.

IF EXISTS (SELECT name FROM sysobjects WHERE name = 'dba_StaggerLogShippingJobs' AND type = 'P') DROP PROCEDURE dba_StaggerLogShippingJobs
GO

CREATE PROCEDURE dbo.dba_StaggerLogShippingJobs
AS
-- This job is intended to be run nightly.
-- It queries the msdb..sysschedules table for jobs that start with 'LSBackupSchedule%'
-- It determines how to space the jobs evenly within a 15 minute window, then calls msdb..sp_update_schedule to set the new @active_start_time. 

SET NOCOUNT ON

declare @logShipEvery int = 900 -- Log ship every 900 seconds or 15 minutes
declare @staggerSeconds int -- number of seconds between jobs
declare @new_active_start_time int -- calculated new start time for a given job
declare @current_active_start_time int -- existing start time for a given job.

-- Some simple variables for use in the loop.
declare @i int = 0
declare @maxId int
declare @schedule_id int

-- table variable in which we store all the current log shipping jobs and their current active_start_time
declare @sqlAgentSchedules table (id int identity (0,1) primary key, schedule_id int not null, current_active_start_time int, new_active_start_time int)

-- Fetch all the LSBackupSchedule jobs into a table variable.
-- Order the query by schedule_id, which monotonically increases as new jobs are added, so that we 
-- can do some tricks to make sure we're not unnecessarily updating schedules nightly
-- when the new value would be equal to the old value.
insert into @sqlAgentSchedules (schedule_id, current_active_start_time)
select schedule_id, active_start_time from msdb..sysschedules 
where name like 'LSBackupSchedule%'
order by schedule_id

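select @maxId = (select MAX(id) from @sqlAgentSchedules)

-- Nothing to do if no log shipping backup schedules were found.
if @maxId is null return

-- ids start at 0, so the number of schedules is @maxId + 1. Dividing by the count
-- (rather than by @maxId) avoids a divide-by-zero when there is only one schedule
-- and keeps the last offset strictly inside the window.
select @staggerSeconds = @logShipEvery / (@maxId + 1)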

-- Calculate a new staggered active_start_time for each job
-- and store it back in our table variable.
-- Formatting the active_start_time is a little tricky because MS is using an integer in a weird
-- way to represent a time value.  For example, the integer value 235959 is actually 23:59:59, or a second before midnight.
-- Took some shortcuts here with the assumption that our time values will always lie between 0 and 15 minutes.
-- In other words, I only bothered to zero-pad the seconds value, and not the minutes value, and I completely
-- ignored the possibility of needing to ever set the hours value.
update @sqlAgentSchedules set
 new_active_start_time = convert(int,convert(varchar(2),@staggerSeconds * id / 60) + right('0'+convert(varchar(2),@staggerSeconds * id % 60),2))

-- Loop over each row of our table variable and update the active_start_time 
-- for any log shipping job that needs to be updated.
while @i <= @maxId
begin
    -- Get the values from this row.
    select 
        @schedule_id = schedule_id, 
        @current_active_start_time = current_active_start_time,
        @new_active_start_time = new_active_start_time
    from @sqlAgentSchedules
    where id = @i

    -- Only update the job schedule if we'd be making a change to its existing active_start_time.
    if @new_active_start_time <> @current_active_start_time
    begin
        exec msdb..sp_update_schedule @schedule_id = @schedule_id, @active_start_time = @new_active_start_time
        print 'Updating schedule for job ' + convert(varchar(8),@schedule_id) + '. Old active_start_time: ' + convert(varchar(8),@current_active_start_time) + '. New active_start_time: ' + convert(varchar(8),@new_active_start_time)
    end

    select @i = @i + 1
end

SET NOCOUNT OFF
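The procedure builds active_start_time by string concatenation and, as its comments note, only handles offsets of less than an hour. If the window ever grew beyond that, the HHMMSS integer could instead be computed arithmetically. A minimal sketch, with @seconds as a hypothetical input:

```sql
-- Encode a number of seconds after midnight as the HHMMSS integer used by
-- msdb..sysschedules.active_start_time (for example, 235959 means 23:59:59).
DECLARE @seconds int = 3725;   -- hypothetical offset: 1 h 2 min 5 s

SELECT (@seconds / 3600) * 10000          -- hours
     + ((@seconds % 3600) / 60) * 100     -- minutes
     + (@seconds % 60)                    -- seconds
       AS active_start_time;              -- 10205, i.e. 01:02:05
```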