SQL-based data diff: longest common subsequence

Rem*_*anu 12 sql algorithm diff

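I am looking for research papers or other writings that apply the longest common subsequence (LCS) algorithm to SQL tables in order to obtain a diff view of the data. Other approaches to the table diff problem are welcome too. The challenge is that SQL tables have a nasty habit of being rather big, and applying naive algorithms designed for text processing may result in a program that never finishes...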

Given a table Original:

Key  Content
1    This row is unchanged
2    This row is outdated
3    This row is wrong
4    This row is fine as it is

and a table New:

Key Content
1   This row was added
2   This row is unchanged
3   This row is right
4   This row is fine as it is
5   This row contains important additions

I need to find the Diff:

+++ 1 This row was added
--- 2 This row is outdated
--- 3 This row is wrong
+++ 3 This row is right
+++ 5 This row contains important additions
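For reference, the expected Diff above is what a textbook LCS diff produces when both tables are read in key order and rows are compared by Content only. The following is a minimal sketch under that assumption; the function name `lcs_diff` and the in-memory `(key, content)` tuples are illustrative and not part of the question. It builds the full quadratic table, which is exactly the cost that makes the naive approach impractical for large tables.

```python
def lcs_diff(original, new):
    """Diff two lists of (key, content) rows, both ordered by key.

    Rows are matched on content only (an assumption of this sketch).
    Returns a list of (marker, key, content) tuples.
    """
    a = [content for _, content in original]
    b = [content for _, content in new]
    n, m = len(a), len(b)

    # lcs[i][j] = length of the LCS of a[i:] and b[j:]; O(n*m) time and memory.
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the start, emitting removals and additions.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append(("---", original[i][0], a[i]))
            i += 1
        else:
            out.append(("+++", new[j][0], b[j]))
            j += 1
    out.extend(("---", key, content) for key, content in original[i:])
    out.extend(("+++", key, content) for key, content in new[j:])
    return out


original = [
    (1, "This row is unchanged"),
    (2, "This row is outdated"),
    (3, "This row is wrong"),
    (4, "This row is fine as it is"),
]
new = [
    (1, "This row was added"),
    (2, "This row is unchanged"),
    (3, "This row is right"),
    (4, "This row is fine as it is"),
    (5, "This row contains important additions"),
]

for marker, key, content in lcs_diff(original, new):
    print(marker, key, content)
```

Removed rows carry their key from Original and added rows carry their key from New, which is how the example Diff numbers them.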

小智 1

If you export the tables to csv files, you can use http://sourceforge.net/projects/csvdiff/

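Quote: csvdiff is a Perl script to diff/compare two csv files, with the option to select the separator. Differences are shown as: "Column XYZ in record 999" is different. After this, the actual and the expected result for this column are shown.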
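The kind of record-by-record, column-by-column comparison described in that quote can be sketched in a few lines of Python. This is not csvdiff's implementation or its output format, only an illustration of the idea. The function name `compare_csv` is hypothetical, and the sketch assumes both files share a header row and list their records in the same order.

```python
import csv
import io


def compare_csv(expected_text, actual_text, delimiter=","):
    """Compare two CSV documents positionally.

    Assumes the first row is a shared header. Records beyond the
    shorter file are ignored by this sketch.
    Returns (record_number, column_name, expected, actual) tuples.
    """
    expected = list(csv.reader(io.StringIO(expected_text), delimiter=delimiter))
    actual = list(csv.reader(io.StringIO(actual_text), delimiter=delimiter))
    header = expected[0]
    differences = []
    rows = zip(expected[1:], actual[1:])
    for record, (exp_row, act_row) in enumerate(rows, start=1):
        for column, exp_val, act_val in zip(header, exp_row, act_row):
            if exp_val != act_val:
                differences.append((record, column, exp_val, act_val))
    return differences


expected_csv = "Key,Content\n1,This row is unchanged\n2,This row is wrong\n"
actual_csv = "Key,Content\n1,This row is unchanged\n2,This row is right\n"

for record, column, exp_val, act_val in compare_csv(expected_csv, actual_csv):
    print(f"Column {column} in record {record} differs: "
          f"expected {exp_val!r}, actual {act_val!r}")
```

Because the comparison is positional, an inserted or deleted row shifts every following record, so this approach suits tables whose rows line up by key rather than the insert/delete case in the question.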