Creating indexes after data load vs. before data load on a large table

Aji*_*ari 4 index sql-server ssis

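I have a huge SQL Server table, 430 GB in size, on server A.

I need to copy the table as-is to a new server B, so I created an empty table on server B and used an SSIS package/job to move all the data from A to B. This took a little over 30 hours.

Now that the data copy is complete, the biggest remaining task is building the indexes.

Was this the right approach, or should I have loaded the data into a table on server B that already had its indexes in place?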

--------------------TABLE---------------------------

CREATE TABLE [CCTSwapForward].[SWAP_DATA](
    [SwapData_ID] [int] NOT NULL,
    [Contract_ID] [varchar](20) NOT NULL,
    [DataSourceName] [varchar](10) NULL,
    [Direction] [varchar](5) NULL,
    [StatusDescription] [varchar](50) NULL,
    [TargetCurrencyCode] [varchar](5) NULL,
    [SettlementCurrencyCode] [varchar](5) NULL,
    [TargetAmount] [float] NULL,
    [SettlementAmount] [float] NULL,
    [ConfirmationNo] [varchar](20) NULL,
    [ItemNo] [int] NULL,
    [ExpiryDate] [datetime] NULL,
    [Original_TargetAmount] [float] NULL,
    [Original_SettlementAmount] [float] NULL,
    [PartyID] [varchar](20) NULL,
    [PrimaryAssetClass] [varchar](50) NULL,
    [SecondaryAssetClass] [varchar](5) NULL,
    [EffectiveDate] [datetime] NULL,
    [PriceNotationType] [varchar](5) NULL,
    [PriceNotationValue] [varchar](5) NULL,
    [Addn_PriceNotationType] [varchar](5) NULL,
    [Addn_PriceNotationValue] [varchar](5) NULL,
    [ProductIDPrefix] [varchar](5) NULL,
    [ProductIDValue] [varchar](50) NULL,
    [AllocationIndicator] [varchar](20) NULL,
    [ExecutionTS] [datetime] NULL,
    [VerificationType] [varchar](50) NULL,
    [ExecutionVenuePrefix] [varchar](10) NULL,
    [ExecutionVenue] [varchar](50) NULL,
    [ClearingDCOValue] [varchar](50) NULL,
    [Collateralized] [varchar](25) NULL,
    [LastReporteddt] [datetime] NULL,
    [Confirmationdt] [datetime] NULL,
    [ConfirmationType] [varchar](25) NULL,
    [Valuationdt] [datetime] NULL,
    [MTMValue] [float] NULL,
    [MTMCurrency] [varchar](3) NULL,
    [ReportingJurisdiction] [varchar](15) NULL,
    [ValueDate] [datetime] NULL,
    [ExchangeRate] [float] NULL,
    [TradePartyRole] [varchar](50) NULL,
    [TradePartyPrefix] [varchar](50) NULL,
    [TradePartyValue] [varchar](20) NULL,
    [ReportingObligation] [varchar](15) NULL,
    [USPersonIndicator] [bit] NULL,
    [FinEntityIndicator] [bit] NULL,
    [ClientOrder_ID] [int] NULL,
    [OrderDetail_ID] [int] NULL,
    [Batch_ID] [int] NOT NULL,
    [Status_ID] [int] NULL,
    [initdt] [datetime] NULL,
    [initid] [int] NULL,
    [upddt] [datetime] NULL,
    [updid] [int] NULL,
    [ReportingPartyLEI] [varchar](20) NULL,
    [ProcessCenter] [varchar](50) NULL,
    [TargetAmount_NDec] [int] NULL,
    [SettlementAmount_NDec] [int] NULL,
    [ExchangeRate_NDec] [int] NULL,
    [MTMRate] [float] NULL,
    [LastUpdatedTS] [datetime] NULL,
    [Client_ID] [int] NULL,
    [Office_ID] [int] NULL,
    [IsInterafiliate] [bit] NULL,
    [TreasuryToBranchRate] [float] NULL,
    [ActualValueDate] [datetime] NULL,
    [SpreadRevenueInSettlementCurrency] [float] NULL,
    [SettlementToUSDRate] [float] NULL,
    [ReportingCurrencyToUSDRate] [float] NULL,
    [TradeCurrencyToUSDRate] [float] NULL,
    [BranchCurrency] [varchar](3) NULL,
    [FirstTradeDate] [datetime] NULL,
    [Action] [varchar](10) NULL,
    [TransactionType] [varchar](25) NULL,
    [LifeCycleEvent] [varchar](25) NULL,
    [TPDomicile] [varchar](200) NULL,
    [TP1Branch] [varchar](50) NULL,
    [TP2FinancialJurisdiction] [varchar](10) NULL,
    [TP2NonFinancialJurisdiction] [varchar](10) NULL,
    [MasterAgreementDate] [datetime] NULL,
    [ReportingDelegation_ID] [int] NULL,
    [RelatedClientOrder_ID] [int] NULL,
    [RelatedOrderDetail_ID] [int] NULL,
    [ReportingDelegationModel] [nvarchar](100) NULL,
    [ExtractDatetimeUTC] [datetime] NOT NULL,
    [IsNDF] [bit] NULL,
    [FixingDate] [datetime] NULL,
    [SettlementExchangeBasis] [varchar](7) NULL
) ON [PRIMARY]

--------------------INDEX---------------------------



USE [ODS]
GO

CREATE UNIQUE NONCLUSTERED INDEX [idx_Batch_ID_SwapData_ID] ON [CCTSwapForward].[SWAP_DATA]
(
    [Batch_ID] ASC,
    [SwapData_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO


USE [ODS]
GO

CREATE UNIQUE NONCLUSTERED INDEX [idx_Contract_ID_SwapData_ID] ON [CCTSwapForward].[SWAP_DATA]
(
    [Contract_ID] ASC,
    [SwapData_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO


USE [ODS]
GO

CREATE UNIQUE CLUSTERED INDEX [idx_SWAPDATA_ID] ON [CCTSwapForward].[SWAP_DATA]
(
    [SwapData_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Geo*_*ios 6

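Given that your table is wide and the indexes are narrow, creating the nonclustered indexes after the load should be preferred.

In this instance I would: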

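  1. Create the new table with its clustered index already in place. Converting a heap into a clustered index afterwards is computationally expensive.
  2. Load the data into the table in the order of the clustered index key, SwapData_ID.
  3. Use BULK INSERT for the load, and make sure the operation is minimally logged.
  4. Create the nonclustered indexes.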

Given your situation, the approach above should be optimal.
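The four steps could be sketched in T-SQL roughly as follows. This is only an illustration under stated assumptions: the table from the question has already been created empty on server B, the data has been exported from server A to a native-format file sorted by SwapData_ID, and the file path is hypothetical. The question used SSIS rather than a data file, so treat this as one possible shape of the load, not the only one.

```sql
USE [ODS];
GO

-- Step 1: create the clustered index while the table is still empty,
-- so the loaded data never has to be rebuilt from a heap.
CREATE UNIQUE CLUSTERED INDEX [idx_SWAPDATA_ID]
    ON [CCTSwapForward].[SWAP_DATA] ([SwapData_ID] ASC);
GO

-- Steps 2 and 3: bulk load in clustered key order.
-- TABLOCK is one of the prerequisites for minimal logging; the database
-- must also be in the SIMPLE or BULK_LOGGED recovery model.
-- ORDER tells SQL Server the file is already sorted by SwapData_ID.
-- The file path and the native file format are assumptions.
BULK INSERT [CCTSwapForward].[SWAP_DATA]
FROM 'D:\staging\swap_data.dat'
WITH (
    DATAFILETYPE = 'native',
    TABLOCK,
    ORDER ([SwapData_ID] ASC)
);
GO

-- Step 4: build the narrow nonclustered indexes once the data is in.
CREATE UNIQUE NONCLUSTERED INDEX [idx_Batch_ID_SwapData_ID]
    ON [CCTSwapForward].[SWAP_DATA] ([Batch_ID] ASC, [SwapData_ID] ASC);
GO

CREATE UNIQUE NONCLUSTERED INDEX [idx_Contract_ID_SwapData_ID]
    ON [CCTSwapForward].[SWAP_DATA] ([Contract_ID] ASC, [SwapData_ID] ASC);
GO
```

If the file is not actually sorted by SwapData_ID, the ORDER hint should be removed, since it only helps when the data really arrives in that order.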

Of course, there are other concerns:

Data drift (will the source data change while your load is running? Do you need to carry those changes across?)

DR (is log shipping enabled? If so, you may need to change the recovery model to bulk-logged)
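As a rough sketch of the recovery model switch, assuming the database is the [ODS] database from the question and normally runs in the FULL recovery model:

```sql
-- Switch to BULK_LOGGED for the duration of the load so that bulk
-- operations can be minimally logged while the log chain stays intact.
ALTER DATABASE [ODS] SET RECOVERY BULK_LOGGED;
GO

-- ... run the bulk load and the index builds here ...

-- Switch back once the load has finished.
ALTER DATABASE [ODS] SET RECOVERY FULL;
GO
-- Point-in-time restore is not possible within a log backup that contains
-- minimally logged operations, so let the regular log backup job run
-- soon after switching back.
```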

Log file size (you need to make sure the log file is large enough to accommodate the nonclustered index creation)

Pre-sizing the database (make sure it does not autogrow during the load)
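Pre-sizing the data and log files might look like the sketch below. The logical file names `ODS_data` and `ODS_log` and the target sizes are hypothetical; the first query shows how to look up the real names and current sizes.

```sql
USE [ODS];
GO

-- Inspect the current files: size is stored in 8 KB pages.
SELECT name AS logical_name,
       type_desc,
       CAST(size AS bigint) * 8 / 1024 AS size_mb,
       growth,
       is_percent_growth
FROM sys.database_files;
GO

-- Grow the files up front so the load does not stall on autogrowth.
-- File names and sizes are illustrative; the new SIZE must be larger
-- than the file's current size.
ALTER DATABASE [ODS] MODIFY FILE (NAME = N'ODS_data', SIZE = 600GB);
ALTER DATABASE [ODS] MODIFY FILE (NAME = N'ODS_log',  SIZE = 100GB);
GO
```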

But all of these seem to be outside the scope of what you asked.