我有一个SQL Server数据库,其中包含一些显示第三方数据库(OpenEdge)更改的审计记录.我无法控制审计数据的结构,也无法控制第三方数据库审计数据更改的方式.所以我留下了,例如,以下数据......
如果您按照前五行进行操作,则可以看到它们都属于TransId 1532102(表示数据库事务),其中TransSeq表示单个事务中的数据库操作.
在列前缀中New,审核更改是可见的.如果值为NULL,则不会对该字段进行任何更改.
查看数据,您可以看到TransId = 1532102,其中PrimaryIdentifier从2更改为-2(第1行),然后从-2更改为3(第3行),然后从3更改为4(第4行),最后从4更改为4到5(第5行).您可能还注意到,当PrimaryIdentifier从3更改为4时,SecondaryIdentifier从'abcd'更改为'efgh'(第4行).因此,这些多个更改实际上只发生在单个源记录上.因此,考虑到这一点,第1,3,4和5行都可以压缩成一行(见下文)
最终TransId 1532102只有两项记录变化.
我需要将这些更改转换为目标数据库上的单个UPDATE语句.为了做到这一点,我需要确保我有一条记录显示前后值.
因此,鉴于此处提供的源数据,我需要生成以下数据集.
我可以使用哪些查询结构来实现此目的?我在考虑递归CTE或者使用Hierarchical结构?最终我需要这个以尽可能好的表现,所以我想在这里提出问题,以防我没有考虑所有可能的方法.
欢迎思考,这是一个示例数据的脚本
DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT @TestTable
SELECT 128, 1532102, 0, 2, 'abcd', -2, NULL, NULL, 'test data'
UNION SELECT 128, 1532102, 1, 3, 'abcd', 2, NULL, NULL, NULL
UNION SELECT 128, 1532102, 2, -2, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532102, 3, 3, 'abcd', 4, 'efgh', NULL, NULL
UNION SELECT 128, 1532102, 4, 4, 'efgh', 5, NULL, 2, NULL
UNION SELECT 128, 1532102, 5, 5, 'efgh', NULL, 'ghfi', NULL, NULL
UNION SELECT 128, 1532106, 0, 3, 'abcd', -3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 1, 4, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 2, -3, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 0, 4, 'abcd', -4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 1, 5, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 2, -4, 'abcd', 5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 0, 5, 'abcd', -5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 1, 4, 'abcd', 5, NULL, 1, NULL
UNION SELECT 128, 1532114, 2, -5, 'abcd', 4, NULL, NULL, 'some more test data'
SELECT *
FROM @TestTable
Run Code Online (Sandbox Code Playgroud)
编辑: 我实际上无法编写任何成功跟踪标识符更改的查询.任何人都可以提供帮助 - 我需要一个跟踪PrimaryIdentifier值变化的查询,并最终为每个跟踪提供单个记录,包括起始值和结束值.
编辑2: 有一个删除的答案表明,在压缩时无法更新密钥标识符,而是我应该逐步完成更改.我认为将我的评论添加到问题的进一步信息是有价值的..
由于生成审计记录的数量,我需要压缩数据集; 由于源DBMS进行更改的方式,其中大多数是不必要的.我需要减少数据集,我需要跟踪关键标识符更改.在更新语句期间,可以在不更改ID更改的情况下进行更新 - 请参阅此示例.
I assume that
1) (PrimaryIdentifier, SecondaryIdentifier) is a PK of the target table,
2) Every transacton in the audit table leaves target table in a consistent state.
So the update of the PK in a single statement for every transaction using case will run OK:
declare @t table (id int primary key, old int);
insert @t(id, old) values (4,4),(5,5);
update @t set id = case id
when 4 then 5
when 5 then 4 end;
select * from @t;
Run Code Online (Sandbox Code Playgroud)
The plan is 1. Condense transactions 2. Generate update sql into temp table. Then you can run all or selected items from the temp table. Every item is of the form
UPDATE myTable SET
PrimaryIdentifier = CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 5
WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN 2 END,
SecondaryIdentifier = CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 'efgh'
WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN 'abcd' END ,
Level= CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 2
WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN Level END ,
Value= CASE WHEN PrimaryIdentifier=2 AND SecondaryIdentifier='abcd' THEN 'test data'
WHEN PrimaryIdentifier=3 AND SecondaryIdentifier='abcd' THEN Value END
WHERE 1=2 OR (PrimaryIdentifier=2 AND SecondaryIdentifier='abcd')
OR (PrimaryIdentifier=3 AND SecondaryIdentifier='abcd')
Run Code Online (Sandbox Code Playgroud)
The query
DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT @TestTable
SELECT 128, 1532102, 0, 2, 'abcd', -2, NULL, NULL, 'test data'
UNION SELECT 128, 1532102, 1, 3, 'abcd', 2, NULL, NULL, NULL
UNION SELECT 128, 1532102, 2, -2, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532102, 3, 3, 'abcd', 4, 'efgh', NULL, NULL
UNION SELECT 128, 1532102, 4, 4, 'efgh', 5, NULL, 2, NULL
UNION SELECT 128, 1532106, 0, 3, 'abcd', -3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 1, 4, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 2, -3, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 0, 4, 'abcd', -4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 1, 5, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 2, -4, 'abcd', 5, NULL, NULL, NULL
UNION SELECT 128, 1532110, 3, 5, 'abcd', 6, NULL, NULL, NULL
UNION SELECT 128, 1532110, 4, 6, 'abcd', 5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 0, 5, 'abcd', -5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 1, 4, 'abcd', 5, NULL, 1, NULL
UNION SELECT 128, 1532114, 2, -5, 'abcd', 4, NULL, NULL, 'some more test data'
;
WITH root AS (
-- Top parent updates within transactions
SELECT SyncId, TransId, TransSeq, PrimaryIdentifier AS rPrimaryIdentifier, SecondaryIdentifier AS rSecondaryIdentifier,
NewPrimaryIdentifier,
coalesce(NewSecondaryIdentifier, SecondaryIdentifier) AS NewSecondaryIdentifier,
newLevel, NewValue
FROM @TestTable t
WHERE NOT EXISTS (SELECT 1
FROM @TestTable t2
WHERE t2.SyncId=t.SyncId AND t2.TransId = t.TransId
AND t2.TransSeq < t.TransSeq
AND t.PrimaryIdentifier = t2.NewPrimaryIdentifier
AND t.SecondaryIdentifier = coalesce(t2.NewSecondaryIdentifier, t2.SecondaryIdentifier)
)
-- recursion to track the chain of updates
UNION ALL
SELECT root.SyncId, root.TransId, t.TransSeq, rPrimaryIdentifier, rSecondaryIdentifier,
t.NewPrimaryIdentifier,
coalesce(t.NewSecondaryIdentifier, root.NewSecondaryIdentifier),
coalesce(root.NewLevel, t.NewLevel), coalesce(root.NewValue, t.NewValue)
FROM root
JOIN @TestTable t ON root.SyncId=t.SyncId AND root.TransId = t.TransId
AND root.TransSeq < t.TransSeq
AND t.PrimaryIdentifier = root.NewPrimaryIdentifier
AND t.SecondaryIdentifier = root.NewSecondaryIdentifier
)
,condensed as (
-- last update in the chain
SELECT TOP(1) WITH TIES *
FROM root
ORDER BY row_number() over (partition by SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier
order by TransSeq desc)
)
-- generate sql
SELECT SyncId, TransId, sql = 'UPDATE myTable SET PrimaryIdentifier = CASE'
+ (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20))
+' AND SecondaryIdentifier=''' + rSecondaryIdentifier
+''' THEN ' + CAST(NewPrimaryIdentifier as varchar(20))
FROM condensed c2
WHERE c1.SyncId = c2.SyncId AND c1.TransId= c2.TransId
FOR XML PATH('') )
+ ' END, SecondaryIdentifier = CASE'
+ (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20))
+' AND SecondaryIdentifier=''' + rSecondaryIdentifier
+''' THEN ''' + NewSecondaryIdentifier + ''''
FROM condensed c2
WHERE c1.SyncId = c2.SyncId AND c1.TransId= c2.TransId
FOR XML PATH('') )
+ ' END , Level= CASE'
+ (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20))
+' AND SecondaryIdentifier=''' + rSecondaryIdentifier
+''' THEN '
+ CASE WHEN NewLevel IS NULL THEN ' Level ' ELSE CAST(NewLevel as varchar(20)) END
FROM condensed c2
WHERE c1.SyncId = c2.SyncId AND c1.TransId= c2.TransId
FOR XML PATH('') )
+ ' END , Value= CASE'
+ (SELECT ' WHEN PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20))
+' AND SecondaryIdentifier=''' + rSecondaryIdentifier
+''' THEN '
+ CASE WHEN NewValue IS NULL THEN ' Value ' ELSE '''' + NewValue + '''' END
FROM condensed c2
WHERE c1.SyncId = c2.SyncId AND c1.TransId= c2.TransId
FOR XML PATH('') )
+ ' END'
+ ' WHERE 1=2'
+ (SELECT ' OR (PrimaryIdentifier='+ CAST(rPrimaryIdentifier as varchar(20))
+' AND SecondaryIdentifier=''' + rSecondaryIdentifier +''')'
FROM condensed c2
WHERE c1.SyncId = c2.SyncId AND c1.TransId= c2.TransId
FOR XML PATH('') )
INTO #UpdSql
FROM condensed c1
GROUP BY SyncId, TransId
SELECT *
FROM #UpdSql
ORDER BY SyncId, TransId
Run Code Online (Sandbox Code Playgroud)
EDIT
Taking into account NewPrimaryIdentifier can be NULL too. See added row at @TestTable. Sql generation skipped.
DECLARE @TestTable TABLE (SyncId INT, TransId INT, TransSeq INT, PrimaryIdentifier INT, SecondaryIdentifier NCHAR(4), NewPrimaryIdentifier INT, NewSecondaryIdentifier NCHAR(4), NewLevel INT, NewValue NVARCHAR(20))
INSERT @TestTable
SELECT 128, 1532102, 0, 2, 'abcd', -2, NULL, NULL, 'test data'
UNION SELECT 128, 1532102, 1, 3, 'abcd', 2, NULL, NULL, NULL
UNION SELECT 128, 1532102, 2, -2, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532102, 3, 3, 'abcd', 4, 'efgh', NULL, NULL
UNION SELECT 128, 1532102, 4, 4, 'efgh', 5, NULL, 2, NULL
UNION SELECT 128, 1532102, 5, 5, 'efgh', null, 'ghfi', null, NULL -- added
UNION SELECT 128, 1532106, 0, 3, 'abcd', -3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 1, 4, 'abcd', 3, NULL, NULL, NULL
UNION SELECT 128, 1532106, 2, -3, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 0, 4, 'abcd', -4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 1, 5, 'abcd', 4, NULL, NULL, NULL
UNION SELECT 128, 1532110, 2, -4, 'abcd', 5, NULL, NULL, NULL
UNION SELECT 128, 1532110, 3, 5, 'abcd', 6, NULL, NULL, NULL
UNION SELECT 128, 1532110, 4, 6, 'abcd', 5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 0, 5, 'abcd', -5, NULL, NULL, NULL
UNION SELECT 128, 1532114, 1, 4, 'abcd', 5, NULL, 1, NULL
UNION SELECT 128, 1532114, 2, -5, 'abcd', 4, NULL, NULL, 'some more test data'
;
WITH root AS (
-- Top parent updates within transactions
SELECT SyncId, TransId, TransSeq, PrimaryIdentifier AS rPrimaryIdentifier, SecondaryIdentifier AS rSecondaryIdentifier,
coalesce(NewPrimaryIdentifier, PrimaryIdentifier) AS NewPrimaryIdentifier,
coalesce(NewSecondaryIdentifier, SecondaryIdentifier) AS NewSecondaryIdentifier,
newLevel, NewValue
FROM @TestTable t
WHERE NOT EXISTS (SELECT 1
FROM @TestTable t2
WHERE t2.SyncId=t.SyncId AND t2.TransId = t.TransId
AND t2.TransSeq < t.TransSeq
AND t.PrimaryIdentifier = coalesce(t2.NewPrimaryIdentifier, t2.PrimaryIdentifier)
AND t.SecondaryIdentifier = coalesce(t2.NewSecondaryIdentifier, t2.SecondaryIdentifier)
)
-- recursion to track the chain of updates
UNION ALL
SELECT root.SyncId, root.TransId, t.TransSeq, rPrimaryIdentifier, rSecondaryIdentifier,
coalesce(t.NewPrimaryIdentifier, root.NewPrimaryIdentifier),
coalesce(t.NewSecondaryIdentifier, root.NewSecondaryIdentifier),
coalesce(t.NewLevel, root.NewLevel), coalesce(t.NewValue, root.NewValue)
FROM root
JOIN @TestTable t ON root.SyncId=t.SyncId AND root.TransId = t.TransId
AND root.TransSeq < t.TransSeq
AND t.PrimaryIdentifier = root.NewPrimaryIdentifier
AND t.SecondaryIdentifier = root.NewSecondaryIdentifier
)
,condensed as (
-- last update in the chain
SELECT TOP(1) WITH TIES *
FROM root
ORDER BY row_number() over (partition by SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier
order by TransSeq desc)
)
SELECT *
FROM condensed
ORDER BY SyncId, TransId, rPrimaryIdentifier, rSecondaryIdentifier
Run Code Online (Sandbox Code Playgroud)