Transactional replication latency issue

Pat*_*Pat 7 sql-server sql-server-2008-r2 transactional-replication

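I inherited this database system as is. Currently, the publisher database runs in SQL Server 2005 compatibility mode on a Windows Server 2008 R2 machine with SQL Server 2008 R2 SP2. The distributor is on the same machine as the publisher. The subscriber is 2008 R2 SP2 and its database is in SQL Server 2008 compatibility mode. We are using transactional replication. The isolation level is Read Committed. When I right-click the publication, the subscription is shown as a pull subscription, but I think that doesn't matter because the distributor resides on the publisher itself. Correct me if I'm wrong. The storage system is IBM Flex, shared by five servers including the publisher and the subscriber.

For the past few days I have been seeing several hours of latency. It catches up in the morning and starts again in the afternoon. I followed https://www.mssqltips.com/sqlservertip/3598/troubleshooting-transactional-replication-latency-issues-in-sql-server/ to see what was going on, and ran the following query.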

USE distribution
GO
EXEC sp_browsereplcmds
     @xact_seqno_start = '<seq#>'  -- seq# is the same for start and end
    ,@xact_seqno_end = '<seq#>'
    ,@publisher_database_id = <publisher database id>  -- this is different from database_id
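To pick a sequence number worth passing to sp_browsereplcmds, one option is to rank the transactions held in the distribution database by command count. This is only a sketch: it assumes the distribution database has the default name `distribution`, and the `TOP (10)` cutoff is arbitrary. `MSrepl_commands` and `MSpublisher_databases` are the standard distribution tables.

```sql
USE distribution;
GO
-- Transactions with the most commands currently stored in the distribution database.
-- NOLOCK is used so the check itself does not block the replication agents.
SELECT TOP (10)
       c.publisher_database_id,
       c.xact_seqno,
       COUNT(*) AS command_count
FROM dbo.MSrepl_commands AS c WITH (NOLOCK)
GROUP BY c.publisher_database_id, c.xact_seqno
ORDER BY command_count DESC;

-- Map publisher_database_id to a database name.
-- This id is local to the distribution database and is not sys.databases.database_id.
SELECT id AS publisher_database_id, publisher_db
FROM dbo.MSpublisher_databases;
```

The `xact_seqno` value can be copied as displayed (for example `0x...`) into both `@xact_seqno_start` and `@xact_seqno_end`.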

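I saw what appeared to be massive updates on a few of the replicated tables, while the log reader was just scanning the transaction log and could not replicate anything until the transaction completed. Interestingly, I don't see any blocking on the publisher or the subscriber. Will changing the isolation level to Read Committed Snapshot Isolation (RCSI) help? Will changing the polling interval to 1 and ReadBatchSize to 1000 or 5000 help? What is the command to change those settings?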
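For anyone wanting to confirm the same symptom, the two built-in commands below can be run on the publisher. This is a hedged sketch, not part of the original troubleshooting; `PublisherDb` is a hypothetical database name.

```sql
USE [PublisherDb];   -- hypothetical publication database name
GO
-- Per-database log reader statistics: replicated transactions still waiting
-- in the log, replication rate (trans/sec) and latency (sec).
EXEC sp_replcounters;
GO
-- Reports the oldest active transaction and the oldest
-- non-distributed replicated transaction in the current database.
DBCC OPENTRAN;
```

A latency figure that keeps climbing while no blocking is visible suggests the log reader is busy reading the log rather than waiting on a lock.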

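I changed the Log Reader Agent default profile as follows: PollingInterval from 5 to 1, and ReadBatchSize to 5000. This brought the latency from 13 hours down to zero almost immediately. But then I saw it climb back to 13 hours.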
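As for the command asked about above, agent profiles can also be inspected and changed with T-SQL at the distributor. The following is a sketch under assumptions: `@profile_id = 2` is a placeholder that must be replaced with an id returned by the first query, and the parameter names should be copied exactly as that query returns them (shown here with a leading dash, which is an assumption about how they are stored).

```sql
-- Run at the Distributor.
-- List Log Reader Agent profiles (agent_type = 2) and their current parameters.
SELECT p.profile_id,
       p.profile_name,
       p.def_profile,
       a.parameter_name,
       a.value
FROM msdb.dbo.MSagent_profiles AS p
JOIN msdb.dbo.MSagent_parameters AS a
    ON a.profile_id = p.profile_id
WHERE p.agent_type = 2
ORDER BY p.profile_id, a.parameter_name;

-- Change the two settings on the chosen profile.
-- @profile_id = 2 is a placeholder; use a profile_id from the query above.
EXEC sp_change_agent_parameter
     @profile_id      = 2,
     @parameter_name  = N'-PollingInterval',
     @parameter_value = N'1';

EXEC sp_change_agent_parameter
     @profile_id      = 2,
     @parameter_name  = N'-ReadBatchSize',
     @parameter_value = N'5000';
```

The Log Reader Agent reads its profile at startup, so the agent job has to be restarted before new values take effect.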

Replication is in sync now, but I have no clue what the actual root cause of the latency was, now that it has gone away.

Pat*_*Pat 7

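I eventually had to call Microsoft Support, and a simple command, DBCC LOGINFO, run on the publisher alone revealed the probable root cause. I saw more than 8,600 VLFs! That was the cause of the latency. In addition, our log file was pre-allocated to 538 GB.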
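DBCC LOGINFO returns one row per VLF, so the row count is the VLF count. A minimal sketch for counting them on SQL Server 2008 R2 follows; the temp table layout is an assumption that matches the 2008 R2 output (SQL Server 2012 and later add a leading RecoveryUnitId column), and `PublisherDb` is a hypothetical database name.

```sql
USE [PublisherDb];   -- hypothetical publication database name
GO
IF OBJECT_ID('tempdb..#vlf') IS NOT NULL
    DROP TABLE #vlf;

-- Column layout of DBCC LOGINFO output on SQL Server 2008 R2.
CREATE TABLE #vlf (
    FileId      int,
    FileSize    bigint,
    StartOffset bigint,
    FSeqNo      bigint,
    Status      int,
    Parity      tinyint,
    CreateLSN   numeric(25, 0)
);

INSERT INTO #vlf
EXEC ('DBCC LOGINFO WITH NO_INFOMSGS');

-- Status = 2 marks VLFs that are still active and cannot be removed by a shrink.
SELECT COUNT(*) AS total_vlfs,
       SUM(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS active_vlfs
FROM #vlf;

DROP TABLE #vlf;
```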

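I opened the case with Microsoft at 4:00 PM. By the time I received the follow-up call from Microsoft the next day, replication had been out of sync for almost 19 hours. The steps to take were quite simple. Back up the publisher database log a few times and try to shrink the log file. Set the growth increment of the log file to 8 GB or 12 GB rather than a percentage or 500 MB. That way, the next time the log file grows, it creates 16 VLFs per 8 GB or 12 GB increment.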

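After backing up the log, I was able to shrink the log file to 350 GB and bring the total number of VLFs down to about 5,300. Still high. But the latency did not come down; it went up to 22 hours. I began to doubt whether the VLF count was the only cause. However, at around 11:30 PM the latency dropped to about 7:30, and at that point I released more space and got the VLF count down to 2,001. By 2:00 AM, replication was in sync. I quickly backed up the log twice, shrank the log file to 10 GB, and then grew it back to about 248 GB. As of now the total number of VLFs is 184, and replication has been in sync ever since. Whew! The log file is almost empty.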
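The backup, shrink, and regrow cycle described above can be sketched as follows. The database name, logical log file name, and backup path are all hypothetical, and the sizes are simply the ones mentioned in this answer; adjust them to your environment.

```sql
USE [PublisherDb];   -- hypothetical publication database name
GO
-- 1. Back up the log so that inactive VLFs become reusable.
BACKUP LOG [PublisherDb]
TO DISK = N'X:\Backup\PublisherDb_log.trn';   -- hypothetical path
GO
-- 2. Shrink the log file. The target size is in MB (10240 MB = 10 GB).
--    A shrink only removes inactive VLFs at the end of the file, so steps
--    1 and 2 may need to be repeated a few times.
DBCC SHRINKFILE (N'PublisherDb_log', 10240);   -- hypothetical logical file name
GO
-- 3. Grow the file back in fixed steps. SIZE is the new total size, so each
--    statement here adds 8 GB; repeat with larger totals up to the size needed.
ALTER DATABASE [PublisherDb]
MODIFY FILE (NAME = N'PublisherDb_log', SIZE = 18GB);
ALTER DATABASE [PublisherDb]
MODIFY FILE (NAME = N'PublisherDb_log', SIZE = 26GB);
GO
-- 4. Use a fixed growth increment instead of a percentage.
ALTER DATABASE [PublisherDb]
MODIFY FILE (NAME = N'PublisherDb_log', FILEGROWTH = 8GB);
```

Re-running DBCC LOGINFO after each step shows whether the VLF count is actually falling.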

Let me know if you have any questions about this. I'd be glad to help. Hopefully others won't have to call Microsoft for this issue.


Kin*_*hah 6

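Will changing the isolation level to Read Committed Snapshot Isolation (RCSI) help?

That is not a straightforward change, and it comes with an extra tempdb penalty. I would not recommend changing the isolation level to RCSI without testing it properly and seeing a benefit in your environment. Trust me, it is a sledgehammer approach.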
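Before testing anything, it can help to record the current state. The read-only sketch below shows whether RCSI is already on and how much of tempdb the version store is using, which is where the extra tempdb cost shows up. `PublisherDb` is a hypothetical database name.

```sql
-- Current row-versioning settings for the database in question.
SELECT name,
       is_read_committed_snapshot_on,
       snapshot_isolation_state_desc
FROM sys.databases
WHERE name = N'PublisherDb';   -- hypothetical database name

-- Space currently reserved by the version store in tempdb, in MB
-- (pages are 8 KB each).
SELECT SUM(version_store_reserved_page_count) * 8 / 1024 AS version_store_mb
FROM tempdb.sys.dm_db_file_space_usage;
```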

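We recently ran into the same problem:

massive updates on a few of the replicated tables, with the log reader just scanning the transaction log

Here is how I resolved it: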

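  • Made the articles replicate as BATCHED (this change is dynamic and does not require reinitialization)

    • You can check this by right-clicking the publication --> Generate Scripts --> look at the @status value. Any value less than 16 means the article is set to use T-SQL ==> NOT batched replication!
    • Even if a single article is not set to BATCH, none of the articles will be BATCHED when the changes are applied to the subscriber.

    • Use the T-SQL below: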

          EXEC sp_changearticle @publication = N'<pub name>'  -- your publication Name
                              , @article = N'<article name>'  -- your article Name
                              , @property = 'status'
                              ,  @value = 'parameters'
      
  • Created a nonclustered index on the distribution database:

    USE [distribution]
    GO
    CREATE NONCLUSTERED INDEX [nc_MSrepl_commands_DBA]
    ON [dbo].[MSrepl_commands] ([publisher_database_id],[article_id],[xact_seqno])
    INCLUDE ([type],[originator_id])
    GO
    
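As an alternative to scripting out the publication, the @status check from the first step can be run as a query in the publication database. This is a sketch, assuming the standard `sysarticles` and `syspublications` tables; `PublisherDb` is a hypothetical database name and `<pub name>` is the same placeholder used with sp_changearticle.

```sql
USE [PublisherDb];   -- hypothetical publication database name
GO
-- Bit 16 of sysarticles.status means "use parameterized statements",
-- which is what allows commands to be batched at the subscriber.
SELECT a.name AS article_name,
       a.status,
       CASE WHEN a.status & 16 = 16
            THEN 'parameters (batched)'
            ELSE 'string literals (NOT batched)'
       END AS command_format
FROM dbo.sysarticles AS a
JOIN dbo.syspublications AS p
    ON p.pubid = a.pubid
WHERE p.name = N'<pub name>'   -- your publication name
ORDER BY a.name;
```

Any article reported as not batched is a candidate for the sp_changearticle call shown in the first step.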

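For more advanced tuning, you can refer to Enhancing Transactional Replication Performance, especially the Distribution Agent and Log Reader Agent parameters.

Below is the script I use to find the status of T-Rep: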

USE [distribution]
-- Ref: http://www.sqlservercentral.com/blogs/basits-sql-server-tips/2012/07/25/t-sql-script-to-monitor-transactional-replication-status/

IF OBJECT_ID('Tempdb.dbo.#ReplStats') IS NOT NULL  
    DROP TABLE #ReplStats 

CREATE TABLE [dbo].[#ReplStats](
    [DistributionAgentName] [nvarchar](100) NOT NULL,
    [DistributionAgentStartTime] [datetime] NOT NULL,
    [DistributionAgentRunningDurationInSeconds] [int] NOT NULL,
    [IsAgentRunning] [bit] NULL,
    [ReplicationStatus] [varchar](14) NULL,
    [LastSynchronized] [datetime] NOT NULL,
    [Comments] [nvarchar](max) NOT NULL,
    [Publisher] [sysname] NOT NULL,
    [PublicationName] [sysname] NOT NULL,
    [PublisherDB] [sysname] NOT NULL,
    [Subscriber] [nvarchar](128) NULL,
    [SubscriberDB] [sysname] NULL,
    [SubscriptionType] [varchar](64) NULL,
    [DistributionDB] [sysname] NULL,
    [Article] [sysname] NOT NULL,
    [UndelivCmdsInDistDB] [int] NULL,
    [DelivCmdsInDistDB] [int] NULL,
    [CurrentSessionDeliveryRate] [float] NOT NULL,
    [CurrentSessionDeliveryLatency] [int] NOT NULL,
    [TotalTransactionsDeliveredInCurrentSession] [int] NOT NULL,
    [TotalCommandsDeliveredInCurrentSession] [int] NOT NULL,
    [AverageCommandsDeliveredInCurrentSession] [int] NOT NULL,
    [DeliveryRate] [float] NOT NULL,
    [DeliveryLatency] [int] NOT NULL,
    [TotalCommandsDeliveredSinceSubscriptionSetup] [int] NOT NULL,
    [SequenceNumber] [varbinary](16) NULL,
    [LastDistributerSync] [datetime] NULL,
    [Retention] [int] NULL,
    [WorstLatency] [int] NULL,
    [BestLatency] [int] NULL,
    [AverageLatency] [int] NULL,
    [CurrentLatency] [int] NULL
) ON [PRIMARY]


INSERT INTO #ReplStats 
SELECT da.[name] AS [DistributionAgentName]
      ,dh.[start_time] AS [DistributionAgentStartTime]
      ,dh.[duration] AS [DistributionAgentRunningDurationInSeconds]
      ,md.[isagentrunningnow] AS [IsAgentRunning]
      ,CASE md.[status]
        WHEN 1 THEN '1 - Started'
        WHEN 2 THEN '2 - Succeeded'
        WHEN 3 THEN '3 - InProgress'
        WHEN 4 THEN '4 - Idle'
        WHEN 5 THEN '5 - Retrying'
        WHEN 6 THEN '6 - Failed'
       END AS [ReplicationStatus]
      ,dh.[time] AS [LastSynchronized]
      ,dh.[comments] AS [Comments]
      ,md.[publisher] AS [Publisher]
      ,da.[publication] AS [PublicationName]
      ,da.[publisher_db] AS [PublisherDB]
      ,CASE 
         WHEN da.[anonymous_subid] IS NOT NULL 
            THEN UPPER(da.[subscriber_name])
       ELSE UPPER (s.[name]) END AS [Subscriber]
      ,da.[subscriber_db] AS [SubscriberDB]
      ,CASE da.[subscription_type]
        WHEN '0' THEN 'Push'  
        WHEN '1' THEN 'Pull'  
        WHEN '2' THEN 'Anonymous'  
       ELSE CAST(da.[subscription_type] AS [varchar](64)) END AS [SubscriptionType]
      ,md.[distdb] AS [DistributionDB]
      ,ma.[article]    AS [Article]
      ,ds.[UndelivCmdsInDistDB] 
      ,ds.[DelivCmdsInDistDB]
      ,dh.[current_delivery_rate] AS [CurrentSessionDeliveryRate]
      ,dh.[current_delivery_latency] AS [CurrentSessionDeliveryLatency]
      ,dh.[delivered_transactions] AS [TotalTransactionsDeliveredInCurrentSession]
      ,dh.[delivered_commands] AS [TotalCommandsDeliveredInCurrentSession]
      ,dh.[average_commands] AS [AverageCommandsDeliveredInCurrentSession]
      ,dh.[delivery_rate] AS [DeliveryRate]
      ,dh.[delivery_latency] AS [DeliveryLatency]
      ,dh.[total_delivered_commands] AS [TotalCommandsDeliveredSinceSubscriptionSetup]
      ,dh.[xact_seqno] AS [SequenceNumber]
      ,md.[last_distsync] AS [LastDistributerSync]
      ,md.[retention] AS [Retention]
      ,md.[worst_latency] AS [WorstLatency]
      ,md.[best_latency] AS [BestLatency]
      ,md.[avg_latency] AS [AverageLatency]
      ,md.[cur_latency] AS [CurrentLatency]
FROM [distribution]..[MSdistribution_status] ds 
INNER JOIN [distribution]..[MSdistribution_agents] da
    ON da.[id] = ds.[agent_id]                          
INNER JOIN [distribution]..[MSArticles] ma 
    ON ma.publisher_id = da.publisher_id 
        AND ma.[article_id] = ds.[article_id]
INNER JOIN [distribution]..[MSreplication_monitordata] md
    ON [md].[job_id] = da.[job_id]
INNER JOIN [distribution]..[MSdistribution_history] dh
    ON [dh].[agent_id] = md.[agent_id] 
        AND md.[agent_type] = 3
INNER JOIN [master].[sys].[servers]  s
    ON s.[server_id] = da.[subscriber_id] 
-- Rows with subscriber_db = 'virtual' are created when your publication has the
-- immediate_sync property set to true. This property dictates whether a snapshot
-- is available all the time for new subscriptions to be initialized.
-- It affects the cleanup behavior of transactional replication: if it is set to true,
-- transactions are retained for the max retention period instead of being cleaned up
-- as soon as all subscriptions have received the change.
WHERE da.[subscriber_db] <> 'virtual' 
    AND da.[anonymous_subid] IS NULL
    AND dh.[start_time] = (SELECT TOP 1 start_time
                            FROM [distribution]..[MSdistribution_history] a
                            JOIN [distribution]..[MSdistribution_agents] b
                            ON a.[agent_id] = b.[id] AND b.[subscriber_db] <> 'virtual'
                            WHERE [runstatus] <> 1
                            ORDER BY [start_time] DESC)
    AND dh.[runstatus] <> 1

SELECT 'Transactional Replication Summary' AS [Comments];
SELECT [DistributionAgentName]
      ,[DistributionAgentStartTime]
      ,[DistributionAgentRunningDurationInSeconds]
      ,[IsAgentRunning]
      ,[ReplicationStatus]
      ,[LastSynchronized]
      ,[Comments]
      ,[Publisher]
      ,[PublicationName]
      ,[PublisherDB]
      ,[Subscriber]
      ,[SubscriberDB]
      ,[SubscriptionType]
      ,[DistributionDB]
      ,SUM([UndelivCmdsInDistDB]) AS [UndelivCmdsInDistDB]
      ,SUM([DelivCmdsInDistDB]) AS [DelivCmdsInDistDB]
      ,[CurrentSessionDeliveryRate]
      ,[CurrentSessionDeliveryLatency]
      ,[TotalTransactionsDeliveredInCurrentSession]
      ,[TotalCommandsDeliveredInCurrentSession]
      ,[AverageCommandsDeliveredInCurrentSession]
      ,[DeliveryRate]
      ,[DeliveryLatency]
      ,[TotalCommandsDeliveredSinceSubscriptionSetup]
      ,[SequenceNumber]
      ,[LastDistributerSync]
      ,[Retention]
      ,[WorstLatency]
      ,[BestLatency]
      ,[AverageLatency]
      ,[CurrentLatency]
FROM #ReplStats
GROUP BY [DistributionAgentName]
        ,[DistributionAgentStartTime]
        ,[DistributionAgentRunningDurationInSeconds]
        ,[IsAgentRunning]
        ,[ReplicationStatus]
        ,[LastSynchronized]
        ,[Comments]
        ,[Publisher]
        ,[PublicationName]
        ,[PublisherDB]
        ,[Subscriber]
        ,[SubscriberDB]
        ,[SubscriptionType]
        ,[DistributionDB]
        ,[CurrentSessionDeliveryRate]
        ,[CurrentSessionDeliveryLatency]
        ,[TotalTransactionsDeliveredInCurrentSession]
        ,[TotalCommandsDeliveredInCurrentSession]
        ,[AverageCommandsDeliveredInCurrentSession]
        ,[DeliveryRate]
        ,[DeliveryLatency]
        ,[TotalCommandsDeliveredSinceSubscriptionSetup]
        ,[SequenceNumber]
        ,[LastDistributerSync]
        ,[Retention]
        ,[WorstLatency]
        ,[BestLatency]
        ,[AverageLatency]
        ,[CurrentLatency]

SELECT 'Transactional Replication Summary Details' AS [Comments];
SELECT [Publisher]
      ,[PublicationName]
      ,[PublisherDB]
      ,[Article]
      ,[Subscriber]
      ,[SubscriberDB]
      ,[SubscriptionType]
      ,[DistributionDB]
      ,SUM([UndelivCmdsInDistDB]) AS [UndelivCmdsInDistDB]
      ,SUM([DelivCmdsInDistDB]) AS [DelivCmdsInDistDB]
FROM #ReplStats
GROUP BY [Publisher]
        ,[PublicationName]
        ,[PublisherDB]
        ,[Article]
        ,[Subscriber]
        ,[SubscriberDB]
        ,[SubscriptionType]
        ,[DistributionDB]