Del*_*ang 22 sql postgresql change-tracking
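I'm looking to design a database that keeps track of every set of changes so that I can refer back to them in the future. For example: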
Database A
+====+======+==========+
| ID | Name | Property |
+====+======+==========+
| 1  | Kyle | 30       |
If I change the row's 'property' field to 50, it should update the row to:
| 1  | Kyle | 50       |
But it should preserve the fact that the row's property was 30 at some point in time. Then, if the row is updated again to 70:
| 1  | Kyle | 70       |
Both facts, that the row's property was once 30 and once 50, should be preserved, so that some query could retrieve:
| 1  | Kyle | 30       |
| 1  | Kyle | 50       |
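It should recognize that these are the "same entry" at different points in time.
Edit: This history will need to be presented to the user at some point, so ideally there should be an understanding of which rows belong to the same "revision cluster".
What is the best way to approach the design of this database?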
Cha*_*ana 20
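One approach is to create a MyTableNameHistory table for every table in your database, with a schema identical to that of MyTableName except that the History table's primary key has one additional DateTime column named effectiveUtc. For example, if you have a table named Employee,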
-- Types are T-SQL style; PostgreSQL would use timestamp for smallDateTime and DateTime.
Create Table Employee
(
employeeId integer Primary Key Not Null,
firstName varChar(20) null,
lastName varChar(30) Not null,
HireDate smallDateTime null,
DepartmentId integer null
);
then the history table would be
Create Table EmployeeHistory
(
employeeId integer Not Null,
effectiveUtc DateTime Not Null,
firstName varChar(20) null,
lastName varChar(30) Not null,
HireDate smallDateTime null,
DepartmentId integer null,
Primary Key (employeeId, effectiveUtc)
);
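Then you can put a trigger on the Employee table so that every time anything is inserted, updated, or deleted in the Employee table, a new record is inserted into the EmployeeHistory table with exactly the same values for all the regular fields and the current UTC datetime in the effectiveUtc column.
Then, to find the values at any point in the past, select the record from the history table whose effectiveUtc value is the highest one before the as-of datetime you are interested in.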
Select * from EmployeeHistory h
Where EmployeeId = @EmployeeId
And effectiveUtc =
(Select Max(effectiveUtc)
From EmployeeHistory
Where EmployeeId = h.EmployeeId
And effectiveUtc < @AsOfUtcDate)
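The trigger and the as-of query can be tried end to end with a small script. What follows is only a sketch under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real database (trigger syntax differs in PostgreSQL and SQL Server), it trims Employee to three columns, it records inserts and updates but not deletes, and it reads the time from a hypothetical one-row Clock table so the result is reproducible. A real trigger would read the database clock instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    employeeId   INTEGER PRIMARY KEY,
    lastName     TEXT NOT NULL,
    departmentId INTEGER
);
CREATE TABLE EmployeeHistory (
    employeeId   INTEGER NOT NULL,
    effectiveUtc TEXT    NOT NULL,
    lastName     TEXT    NOT NULL,
    departmentId INTEGER,
    PRIMARY KEY (employeeId, effectiveUtc)
);
-- Hypothetical one-row clock, used only to make the sketch reproducible.
CREATE TABLE Clock (nowUtc TEXT NOT NULL);
INSERT INTO Clock VALUES ('2024-01-01 00:00:00');

-- Copy the new row into the history table on every insert and update.
CREATE TRIGGER Employee_ai AFTER INSERT ON Employee
BEGIN
    INSERT INTO EmployeeHistory
    SELECT NEW.employeeId, nowUtc, NEW.lastName, NEW.departmentId FROM Clock;
END;
CREATE TRIGGER Employee_au AFTER UPDATE ON Employee
BEGIN
    INSERT INTO EmployeeHistory
    SELECT NEW.employeeId, nowUtc, NEW.lastName, NEW.departmentId FROM Clock;
END;
""")


def set_clock(when):
    """Move the fake clock forward (test helper, not part of the design)."""
    conn.execute("UPDATE Clock SET nowUtc = ?", (when,))


def as_of(employee_id, as_of_utc):
    """Return (lastName, departmentId) in effect just before as_of_utc, or None."""
    return conn.execute(
        """
        SELECT lastName, departmentId FROM EmployeeHistory h
        WHERE employeeId = ?
          AND effectiveUtc = (SELECT MAX(effectiveUtc)
                              FROM EmployeeHistory
                              WHERE employeeId = h.employeeId
                                AND effectiveUtc < ?)
        """,
        (employee_id, as_of_utc),
    ).fetchone()


conn.execute("INSERT INTO Employee VALUES (1, 'Kyle', 30)")
set_clock("2024-02-01 00:00:00")
conn.execute("UPDATE Employee SET departmentId = 50 WHERE employeeId = 1")
set_clock("2024-03-01 00:00:00")
conn.execute("UPDATE Employee SET departmentId = 70 WHERE employeeId = 1")

print(as_of(1, "2024-02-15 00:00:00"))
```

The timestamps are stored as ISO-8601 text, which sorts in chronological order, so MAX and < behave as they would on a real datetime column.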
The best approach depends on what you're doing. You'll want to look deeper into slowly changing dimensions:
https://en.wikipedia.org/wiki/Slowly_changing_dimension
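Also, don't miss the tsrange type introduced in Postgres 9.2. It lets you merge start_date and end_date into a single column, index it with a GiST index, and add an exclusion constraint to prevent overlapping date ranges.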
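Edit:
"there should be an understanding of which rows belong to the same 'revision cluster'"
In that case you want date ranges in your tables one way or another, rather than a revision number or a live flag; otherwise you'll end up duplicating related data all over the place.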
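To make the date-range idea concrete, here is a minimal sketch, again using Python's sqlite3 module as a stand-in. The PersonVersion table, its validFrom/validTo columns, and the helper functions are all hypothetical names chosen for illustration. SQLite has no tsrange or exclusion constraints, so the range is kept as two plain columns and nothing here enforces non-overlap; for simplicity the sketch also keeps live and historical rows in one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PersonVersion (
        id        INTEGER NOT NULL,
        name      TEXT    NOT NULL,
        property  INTEGER NOT NULL,
        validFrom TEXT    NOT NULL,
        validTo   TEXT,               -- NULL means "still current"
        PRIMARY KEY (id, validFrom)
    )
""")


def save(person_id, name, prop, when):
    """Close the current version (if any) and open a new one at `when`."""
    conn.execute(
        "UPDATE PersonVersion SET validTo = ? WHERE id = ? AND validTo IS NULL",
        (when, person_id),
    )
    conn.execute(
        "INSERT INTO PersonVersion VALUES (?, ?, ?, ?, NULL)",
        (person_id, name, prop, when),
    )


def revisions(person_id):
    """All versions of one entry, oldest first: its 'revision cluster'."""
    return conn.execute(
        "SELECT property, validFrom, validTo FROM PersonVersion "
        "WHERE id = ? ORDER BY validFrom",
        (person_id,),
    ).fetchall()


def as_of(person_id, when):
    """Property value whose half-open range [validFrom, validTo) contains `when`."""
    row = conn.execute(
        "SELECT property FROM PersonVersion "
        "WHERE id = ? AND validFrom <= ? AND (validTo IS NULL OR ? < validTo)",
        (person_id, when, when),
    ).fetchone()
    return row[0] if row else None


save(1, "Kyle", 30, "2024-01-01")
save(1, "Kyle", 50, "2024-02-01")
save(1, "Kyle", 70, "2024-03-01")

for revision in revisions(1):
    print(revision)
```

Because every version carries the same id and its own range, all rows of one entry can be fetched together and any point in time resolves to at most one row.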
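On a separate note, consider keeping the audit tables separate from the live data rather than storing everything in the same table. It's harder to implement and manage, but it makes queries on live data far more efficient.
See also this related post: Temporal database design, with a twist (live vs. draft rows)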
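To add to Charles's answer, I would use an Entity-Attribute-Value model instead of creating a different history table for every other table in your database.
Basically, you would create one History table like so: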
Create Table History
(
tableId varChar(64) Not Null,
recordId varChar(64) Not Null,
changedAttribute varChar(64) Not Null,
newValue varChar(64) Not Null,
effectiveUtc DateTime Not Null,
Primary Key (tableId, recordId, changedAttribute, effectiveUtc)
);
Then you would create a History record every time you create or modify data in any of your tables.
To follow your example, when you add 'Kyle' to your Employee table, you would create two records (one for each non-id attribute), and then you would create a new record every time a property changes:
History
+==========+==========+==================+==========+==============+
| tableId | recordId | changedAttribute | newValue | effectiveUtc |
| Employee | 1 | Name | Kyle | N |
| Employee | 1 | Property | 30 | N |
| Employee | 1 | Property | 50 | N+1 |
| Employee | 1 | Property | 70 | N+2 |
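Reading a record back from such a table means replaying its changes in time order. The sketch below shows one way this might look, using Python's sqlite3 module as a stand-in database; the concrete dates replace the placeholders N, N+1, and N+2, and state_as_of is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE History (
        tableId          TEXT NOT NULL,
        recordId         TEXT NOT NULL,
        changedAttribute TEXT NOT NULL,
        newValue         TEXT NOT NULL,
        effectiveUtc     TEXT NOT NULL,
        PRIMARY KEY (tableId, recordId, changedAttribute, effectiveUtc)
    )
""")
conn.executemany(
    "INSERT INTO History VALUES (?, ?, ?, ?, ?)",
    [
        ("Employee", "1", "Name", "Kyle", "2024-01-01"),
        ("Employee", "1", "Property", "30", "2024-01-01"),
        ("Employee", "1", "Property", "50", "2024-02-01"),
        ("Employee", "1", "Property", "70", "2024-03-01"),
    ],
)


def state_as_of(table_id, record_id, as_of_utc):
    """Replay changes up to and including as_of_utc; later values win."""
    rows = conn.execute(
        "SELECT changedAttribute, newValue FROM History "
        "WHERE tableId = ? AND recordId = ? AND effectiveUtc <= ? "
        "ORDER BY effectiveUtc",
        (table_id, record_id, as_of_utc),
    )
    state = {}
    for attribute, value in rows:
        state[attribute] = value  # overwrite with the most recent change
    return state


print(state_as_of("Employee", "1", "2024-02-15"))
```

Note that every value comes back as text, because newValue is a single varChar column; the caller has to know the original type of each attribute.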
Alternatively, as a_horse_with_no_name suggested, if you don't want to store a new History record for every field change, you can store grouped changes (such as changing Name to 'Kyle' and Property to 30 in the same update) as a single record. In this case, you would need to express the collection of changes in JSON or some other blob format. This would merge the changedAttribute and newValue fields into one (changedValues). For example:
History
+==========+==========+================================+==============+
| tableId | recordId | changedValues | effectiveUtc |
| Employee | 1 | { Name: 'Kyle', Property: 30 } | N |
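With grouped changes, reconstructing a record amounts to merging the JSON change sets in time order. Here is a rough sketch in plain Python; the in-memory history list stands in for rows read from the History table, and the dates and the state_as_of helper are assumptions made for illustration.

```python
import json

# Hypothetical history rows: (tableId, recordId, changedValues, effectiveUtc).
history = [
    ("Employee", "1", json.dumps({"Name": "Kyle", "Property": 30}), "2024-01-01"),
    ("Employee", "1", json.dumps({"Property": 50}), "2024-02-01"),
    ("Employee", "1", json.dumps({"Property": 70}), "2024-03-01"),
]


def state_as_of(table_id, record_id, as_of_utc):
    """Merge every change set up to and including as_of_utc, oldest first."""
    state = {}
    for t, r, changed_values, effective_utc in sorted(history, key=lambda row: row[3]):
        if (t, r) == (table_id, record_id) and effective_utc <= as_of_utc:
            state.update(json.loads(changed_values))
    return state


print(state_as_of("Employee", "1", "2024-02-15"))
```

Unlike the one-row-per-attribute layout, JSON keeps the distinction between numbers and strings, so Property comes back as an integer.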
This is perhaps more difficult than creating a History table for every other table in your database, but it has multiple benefits:
One architectural benefit of this design is that you are decoupling the concerns of your app and your history/audit capabilities. This design would work just as well as a microservice using a relational or even NoSQL database that is separate from your application database.