What is Relational Integrity?

dzh*_*zhu 3 data-integrity relational-database relational-model

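In the question How to Set Up Primary Keys in a Relation, PerformanceDBA talked about Relational Integrity and pointed out that it is different from Referential Integrity.

I have heard of Referential Integrity, which is related to Foreign Keys. But Relational Integrity seems strange to me.

In this question Are relational integrity and referential integrity the same thing?, Chris said the two are the same thing.

The terms and definitions in the database world confuse me.
Could someone explain the Relational Integrity?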


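The relational database theory has been established for decades, ever since Edgar F. Codd proposed the Relational Model in his paper in 1970. However, there are no consistent definitions of Normal Forms, Integrity, etc. in books or on the web. As a learner, I am confused.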

Per*_*DBA 6

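Confusion

I will answer your questions in a sequence that makes overall sense, and that reduces the length.

The relational database theory has been established for decades, ever since Edgar F. Codd proposed the Relational Model in his paper in 1970. However, there are no consistent definitions of Normal Forms, Integrity, etc. in books or on the web.

As a learner, I am confused.

The terms and definitions in the database world confuse me.

I would say that this is the biggest problem in the database industry. There are two camps.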

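The Codd Camp

  1. We hold to Dr E F Codd's definitions.
  2. The definitions are not confusing.
  3. The definitions are fully integrated with each other.
  4. One does not need to be a theoretician or a "theoretician" to understand them. They are written in plain technical English, and backed by mathematical definitions.
  5. Codd produced the first Relational Algebra.

In other words, any qualified person can understand them and use them (one does not need to be able to read and understand the mathematical definitions).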

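The high-end RDBMS vendors are in this camp. For decades, together with their customers, they have led the SQL committee in implementing the features and facilities of the Relational Model and of the SQL data sublanguage (note that SQL is not a language, it is a sublanguage for accessing a database).

One result is the natural progression of the Relational Model. Features that the detractors suggested were "incomplete" have been "completed" and implemented. Anyone who faithfully implements the Relational Model views this as a natural progression, not as an "extension".

All the small but important advancements and progressions have been implemented by genuine theoreticians and scientists who work for the high-end vendors. (That is, neither I, nor the customers, nor the vendors who put forward these progressions, suggest that the Relational Model is "incomplete", or that we have "completed" it. We gladly accept that such progressions are not "outside" the Relational Model.)

Non-SQLs and pretend-SQLs are not in this category.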

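The "Theoreticians"

The days of genuine theoreticians and scientists are long gone. What we have now are people who are totally isolated from the industry that they claim to serve; who deny other sciences (eg. the laws of physics, of logic, of common sense); who write academic papers that they cite for each other, and who are thereby elevated. The academic papers and mathematical proofs rest purely on the fact that, within that denial and isolation, they prove what they propose to prove. As it turns out, the proofs are pure rubbish in the real world: they can be "true" only if the related sciences are denied, in isolation from them.

That denial of reality is schizophrenic. Unfortunately, they teach it at universities these days.

Thus you think a paper is credible purely because it has many citations (from other schizophrenics), not on the basis of whether it is science.

It is easy for someone who has not been "educated" into schizophrenia to destroy such papers. I have demonstrated that in other answers: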

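  • Response to Hidders (in which he suggests a "problem" in a "relational database", which he proposes to "solve", and the "problem" disappears in the simple act of placing the data in a Relational context).

  • Response to Köhler (in which he suggests a "problem" in a "relational database", which he proposes to "solve" by creating yet another new "normal form", and the "problem" disappears in the simple act of placing the data in a Relational context).

The bottom line, separate to the proof that the papers produced by the "theoreticians" are built on the Straw Man method and are false, is that this proves they are clueless about the Relational Model that they allege to be writing about.

It is fraud on a massive scale, because it is built into the new "education" system.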

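The Post-Codd Gulag

This camp was originally made up of the "theoreticians". In the forty-five years since the Relational Model was published, they have not produced a single advancement or progression.

Now it is made up of all their fans and followers. Because they have books, which are now used as textbooks in university courses, it includes all the "professors" who teach this rubbish, parrot-fashion, instead of verifying the theory for themselves.

However, they have produced a mass of fragments, which they claim to be "relational". They break Codd's definitions up into small pieces and deal with each one in isolation, either to attack Codd's definition or to support their own fragments.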

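  • Eg. whereas Codd has clear and straightforward definitions for 1NF, 2NF, 3NF (the last includes the definition of Functional Dependence), which any capable person can apply, these creatures have:

    • Six fragments of NFs ("1NF", "2NF", "3NF", "BCNF" (aka "3.5NF"), "4NF", "5NF")

    • Which, taken together, do not even cover what Codd's three NFs cover.

    • Which is why I have said in other answers that if one understands and applies Codd's three NFs, in spirit and in letter, that covers the six NF fragments above, as well as any NF fragment that has not been written yet (by the "theoreticians").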

  • DKNF is the ultimate NF, and clearly the goal of the Relational Model (to anyone who is genuinely trying to understand it), but not defined as such. This is one of the natural progressions (above), and obvious to a faithful implementer. While I wouldn't say it is easy, it is entirely feasible: all my databases since 2007 are DKNF.

    • But the "theoreticians" have defined a tiny fraction of DKNF, and published it as "DKNF", and they flatly state that it is impossible.

Of course, their fragments are incomplete, they are forever finding "holes" in their definitions, and "filling" them. There is no end to the "holes", and no end to the "fillers". Codd's definitions have no holes, and do not need fillers.

And of course, there is no integration of their numerous fragments. It is the very definition of dis-integration.

They produced a number of "relational algebras" (I can't keep count) in competition with Codd's Relational Algebra. Of course, none of them comes even close, and Codd's is strongly established as the Relational Algebra.

They claim that the Relational Model is somehow "incomplete", and that their inventions "complete" it. They don't. Their inventions stand as evidence that they do not understand the Relational Model that they propose to "complete". It is part of the fraud, to elevate their otherwise incredible inventions.

In doing all of this, of course they have introduced confusion. That is their goal, as detractors of the Relational Model. Their 42 or so "relational models" and "relational algebras" can only exist in a state of confusion, where practitioners and seekers are confused about what the Relational Model is.

This answer is long, only because, in order to answer your question, I have to first remove the confusion, what the Relational Model is not, and second, confirm what the Relational Model is. If these creatures had not introduced all this fraud, this confusion, if the Relational Model was clear, and the definitions were not sabotaged, the answer to the question is simple and straight-forward:

  • Relational integrity is the form of data integrity that is afforded by the Relational Model, that is not afforded by pre-relational Record Filing Systems.

But for the confusion, that is not enough, I have to provide more detail, and proof, to destroy the confusion.

And notice, they have established the notion that everything is a matter of opinion; of argument; it is subjective. Of course, the truth is objective, not open to opinion; discussion; or argument. It is easy to prove: just read the Relational Model and see for yourself if something is defined, and what that definition is.

The Difference

The main difference between the Codd Camp and the "Theoreticians" Gulag, is this:

  1. The post-Codd authors do not understand the Relational Model. Over the four decades, separate to the fact that they have not added one iota to the Relational Model, they have established a private "relational model" (actually several), with their private definitions and private terms. The evidence for this is four-fold:

    • First, they understand only a tiny fraction of the Relational Model. They are unaware of (eg) Relational Keys; Relational Integrity.

    • Second, they propagate various inventions that are specifically prohibited in the Relational Model: (eg) surrogates; Access Path Dependence; circular references. Again, that is only possible if they are ignorant of the Relational Model.

    • Third, that they have private definitions for terms, that do not match the definitions in the Relational Model. That alone guarantees that they are divorced from, and do not serve, the industry that they allege to serve. Further, they cannot converse with any Relational Model practitioner.

    • Fourth, that they have terms (and private definitions) for fabrications that are not in the Relational Model. But they fraudulently declare such inventions to be part of the "relational model".

In sum, it means that what they practice and preach as "relational" or "relational model", is far from Relational, and in fact, by virtue of the evidence, it is Anti-relational.

  2. Since they do not understand the Relational Model, what is it that they do understand, that they propagate ?

    Well, from the mountain of evidence (ie. their own writings: books, textbooks, academic papers, etc), they only understand and propagate the technology we had before the advent of the Relational Model and RDBMS: pre-1970's Record Filing Systems.

    • Which have none of the Integrity (we will get to that in detail), or power, or speed, of Relational databases (ie. one that complies with the Relational Model).

    • If one were asked to describe the difference between pre-relational DBMS and Relational DBMS, in one sentence, it would be that pre-relational systems related records by Record ID, with no control of the content of records, whereas Relational systems relate rows by Relational Key, with full control (integrity) of the row content.

The Visible Difference

The difference identified above is theoretical, understandable to people who have a grounding in theory (and denied by the "theoreticians"). However the difference is quite visible even to novices, in two items (there are many, I am exposing the two really obvious ones), and here I can provide specific evidence:

  1. The Relational Model demands a Primary Key, which is then used as a Foreign Key, to establish relationships. The detractors implement Record IDs as "primary key", which:

    • fail the definition of Key in the Relational Model. (The Relational Model definition of Key is it must be made up from the data. Record IDs, GUIDs, etc, are manufactured, they are not data.)

    • implements Access Path Dependence, which is specifically prohibited in the Relational Model. Access Path Dependence was the characteristic limitation of pre-relational Record Filing Systems.

    • (This leads to having to navigate every file between two distal files, whereas in the Relational Model, two distal tables can be JOINed directly.)

    • (Which by the way, is proof that the Record Filing Systems in fact, require more, not less, JOINs, contrary to the mythology.)

  2. Thus they elevate surrogates, Record IDs, to the status of "key".

    • But it is pure fraud. There is no such thing as a "surrogate key" in the Relational world, because the two words contradict each other, either it is a surrogate (a non-key) xor it is a Key (a non-surrogate).

    • Further, by using the term "surrogate key", one naturally expects at least some, if not all, the qualities of a Key, and a surrogate has none of the qualities of a Key.

    • That is "normal" in the world of the "theoreticians", which is divorced from the Relational world, because they do not have Keys, they have surrogates as "keys".

  3. They have their invention, along with their private definition, the "candidate key".

    • It is an invention to hide the fact that they are not using Primary Key as defined in the Relational Model, which in and of itself, is a breach of the Relational Model.

    • Again, it is pure fraud, because there is no such thing defined in the Relational Model. So, whatever it is, it is outside the Relational Model.

    • What is it ? It is a fragment of a Key. A surrogate alone does not provide row uniqueness, which is demanded in the Relational Model. And they need row uniqueness to establish that small fraction of integrity in their Record Filing System.

    • Now, note, at this point, to be complete, therefore a surrogate is always an additional column, ie. additional to the data columns. It is never an either/or proposition, as many novices (such as Fowler and Ambler, such as those who propose it to be an "opinion", that hasn't reached "consensus") propose it to be.

    • It is a fragment of a Relational Key because the "candidate key" is not implemented as a Key in the Record Filing System, across files. It is implemented, only in the single File that it is defined in, and therefore not used Relationally. Whereas in the Relational Model, such a Key would be the Primary Key, and it would be a Foreign Key in every child of that table, ie. it would be used Relationally, across tables.

    • Thus it remains a fragment of a Relational Key, existing only in the file it is resident in.

  4. Of course "candidate keys" do not work. So they have come up with yet another invention, to make it work. The "superkey". It doesn't work either, and requires massive duplication. It doesn't provide any of the qualities of a Relational Key. It only provides a fragment, a step, above the failed "candidate key", and thus the whole thing remains a failure.

  5. Take Functional Dependence. Because of the introduced confusion and sabotage, we have to call it Full Functional Dependence. Codd's 3NF definition gives the definition of Functional Dependence as well. This is in plain technical English, it is easily understood, and implemented.

    The most important thing to understand is, since we are talking about the Relational Model, when Codd uses the term Key, he means a Relational Key. Therefore the Key has to be determined and available first, before Functional Dependence of attributes on the Key can be tested (it is a test), second.

    • But the saboteurs, the subverters, have invented fragments of the 3NF definition. I won't go into detail (unless asked), however, if you examine them, you will see that they serve one purpose: to elevate their non-relational "candidate key" fraudulently to the status of a Key.

    • Further, their entire set of definitions of their fragments (seven or so) that relate to Codd's 3NF definition, is complex, ordinary implementers cannot understand it, let alone implement it.
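
The claim in item 3 above, that a surrogate alone does not provide row uniqueness, can be sketched as follows. This is a minimal illustration only, not the author's example: the table `CountrySurrogateOnly` and its rows are hypothetical, and the dialect is assumed to be one that supports `IDENTITY` and multi-row `VALUES`. With a Record ID as the only declared "key", and no additional uniqueness constraint on the data columns, the same data can be stored twice:

```sql
-- Hypothetical file: the Record ID is the only declared "key".
CREATE TABLE CountrySurrogateOnly (
    CountryId    INT       NOT NULL  IDENTITY PRIMARY KEY,
    CountryCode  CHAR(2)   NOT NULL,
    Name         CHAR(30)  NOT NULL
    );

-- Both rows are accepted: they differ only in the generated CountryId.
INSERT CountrySurrogateOnly ( CountryCode, Name ) VALUES
    ( 'AU', 'Australia' ),
    ( 'AU', 'Australia' );

-- Lists every data row that has been stored more than once.
SELECT  CountryCode, Name, COUNT(*) AS Copies
    FROM  CountrySurrogateOnly
    GROUP BY CountryCode, Name
    HAVING COUNT(*) > 1;
```

Uniqueness of the data therefore has to come from a constraint on the data columns themselves, which is why the surrogate is always an additional column rather than a replacement for them.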

Summary. If anyone uses the terms { "candidate key", "superkey", "partial/transitive dependence" [rather than Full Functional Dependence] }, they are identifying themselves, categorically, as being (a) ignorant of the Relational Model, and (b) Anti-relational.

The Result

The result, and this is the real accomplishment of the "theoreticians", is that 95% of the implementers out there implement Record Filing Systems, that have none of the Integrity, power, or speed of Relational databases, but they think that their RFS is "relational". It is a great pity that the "theoreticians" cannot acknowledge, and enjoy their one and only accomplishment.

Declarations of "Theoreticians"

This is what we have to understand, in order to penetrate the confusion and to identify falsities as such. If you understand that the "theoreticians" are heavily invested in their 42 or so "relational models" and "relational algebras", that they are quite different to the Relational Model, you will understand that actually, they are quite clueless about the Relational Model.

But this does not stop them from making declarations about the Relational Model, what it does, what it can't do, etc. Therefore do not believe any pronouncement that they make, it is like a pygmy making a pronouncement about airplane flight (refer to The Gods Must Be Crazy).

The Question

The relational database theory has been established for decades

True. But only for the high end of the industry. Guys like me. Guys who have a sound grounding in theory and science, and who reject the non-science of the post-Codd era. The majority, 95%, are schooled in the anti-relational system.

In the question How to Set Up Primary Keys in a Relation, PerformanceDBA talked about Relational Integrity and pointed out that it is different from Referential Integrity.

Yes.

Could someone explain the Relational Integrity?

Yes. But only a genuine Relational practitioner can do that.

Be warned, due to the state of the Relational database industry, the introduced confusion, the massive fraud being perpetrated, as detailed above, the saboteurs and subverters will say, either there is no such thing, or there is, but it is impossible to implement, or that it is the same as Referential Integrity.

  • That will, yet again, prove two separate things (a) that they are ignorant of the Relational Model, and (b) that they are Anti-relational.

I have heard of Referential Integrity, which is related to Foreign Keys. But Relational Integrity seems strange to me.

In this question Are relational integrity and referential integrity the same thing?, Chris said the two are the same thing.

No, they are not.

As defined above, that statement proves he is clueless about the Relational Model, and that he is Anti-relational. As such, he cannot know what the Relational Model is, what Relational Integrity is. They do not know what they are missing, so they cannot describe or define it.

I can.

Let's start with Referential integrity, so that we know (a) what that is, and (b) how that is different to Relational Integrity.

We need a decent example to work with. Note that frauds and thieves use simple examples, because anything can be proved (or disproved) using simple, trite examples. Deep understanding requires full examples, that are "complex" enough to demonstrate the issue.

Let's use an example that I have given the "theoreticians" on comp.databases.theory to solve. They couldn't solve it. I gave pointers and hints. They still couldn't solve it.

  • That stands, in and of itself, as evidence that the "theoreticians" cannot Normalise Anything. Despite the fact that they have 17 mathematical definitions for their abnormal, fragmented, "normal forms".

  • We really should be shouting that from the rooftops. They are in no position to be telling practitioners anything.

Here is a typical implementation by a developer who has been reading the books of the detractors, and following them carefully. As is typical, he thinks this is "relational". But it isn't Relational at all, it is a Record Filing System, with none of the Integrity, power, or speed of a Relational Database.

The content should be familiar to everyone, therefore I will skip the description. Note that there are ISO and ANSI/FIPS Standard Codes for the first three levels, eg. ISO-3166-1, 3166-2, and FIPS.

  • Typical Record Filing System Implementation declared as "relational"

    • The developer has learned about row uniqueness, and implemented Alternate Keys to provide that. These are more advanced than commonly implemented, but hey, I am not attempting a Straw Man argument, these are the best "candidate keys" that anyone has come up with. Eg. in the State File, he correctly asserts that Name and StateCode must be unique within a Country.
  • But his "primary keys" are not Primary Keys, they are Record IDs.

  • The developer has declared this set of files as "satisfies 5NF". The "theoreticians" have passed this as such.

As far as Codd and I are concerned, (a) it fails 3NF and (b) it fails Relational. But we won't be dealing with that here, we just need a good example to use for our purpose, Relational Integrity.

Let's look at the DDL for the Country File.

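    CREATE TABLE Country (
        CountryId     INT       NOT NULL  IDENTITY PRIMARY KEY,  -- Record ID, not data
        CountryCode   CHAR(2)   NOT NULL,
        Name          CHAR(30)  NOT NULL,
        -- Alternate Keys: constraint names are prefixed with the table name
        -- so that they do not clash with those of the other files
        CONSTRAINT Country_AK1 UNIQUE ( CountryCode ),
        CONSTRAINT Country_AK2 UNIQUE ( Name )
        )
    INSERT Country ( CountryCode, Name ) VALUES
        ( 'US', 'United States of America' ),
        ( 'CA', 'Canada' ),
        ( 'AU', 'Australia' )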

So far so good. Let's look at the State File.

    CREATE TABLE State (
        StateId    INT       NOT NULL  IDENTITY PRIMARY KEY,  -- Record ID, not data
        CountryId  INT       NOT NULL,                        -- parent Record ID
        StateCode  CHAR(2)   NOT NULL,
        Name       CHAR(30)  NOT NULL,
        -- Alternate Keys: StateCode and Name are unique within a Country
        CONSTRAINT State_AK1 UNIQUE ( CountryId, StateCode ),
        CONSTRAINT State_AK2 UNIQUE ( CountryId, Name ),
        CONSTRAINT Country_ConsistsOf_State_fk
            FOREIGN KEY        ( CountryId )
            REFERENCES Country ( CountryId )
        )
    INSERT State ( CountryId, StateCode, Name ) VALUES
        ( 1, 'AL',  'Alabama' ),
        ( 1, 'GA',  'Georgia' ),
        ( 1, 'NY',  'New York' ),
        ( 2, 'NT',  'Northwest Territories' ),
        ( 3, 'NT',  'Northern Territory' )

Notice that (eg) both Canada and Australia have a StateCode "NT", and the Alternate Keys allow that. But also note that when inserting States, we are forced to use a Record ID, instead of data, to identify the parent Country of the State that is being inserted. That should ring alarm bells.
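
The point about being forced to use a Record ID can be sketched as follows. This is a minimal illustration, assuming the Country and State files exactly as declared above and the same SQL dialect; the row 'TX', 'Texas' is an invented sample and is not part of the original data. The caller knows the parent Country by its data (the code 'US'), but the State file accepts only the Record ID, so the ID has to be looked up first:

```sql
-- Assumes the Country and State files declared above already exist.
-- 'TX' / 'Texas' is a hypothetical sample row, used only for illustration.

-- The parent is known by its data ('US'), but State.CountryId holds a
-- Record ID, so that ID must be fetched from Country before the insert.
INSERT State ( CountryId, StateCode, Name )
    SELECT  CountryId, 'TX', 'Texas'
        FROM  Country
        WHERE CountryCode = 'US';
```

Hard-coding the values 1, 2, 3 as in the INSERT above avoids the lookup, but it is correct only if the IDENTITY values for the three Countries were in fact generated as 1, 2 and 3.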

So far so ordinary. Let's look at the County File.

    CREATE TABLE County (
        CountyId    INT       NOT NULL  IDENTITY PRIMARY KEY,  -- Record ID, not data
        StateId     INT       NOT NULL,                        -- parent Record ID
        CountyCode  CHAR(2)   NOT NULL,
        Name        CHAR(30)  NOT NULL,
        -- Alternate Keys: CountyCode and Name are unique within a State
        CONSTRAINT County_UK1 UNIQUE ( StateId, CountyCode ),
        CONSTRAINT County_UK2 UNIQUE ( StateId, Name ),
        CONSTRAINT State_IsMadeUpOf_County_fk
            FOREIGN KEY      ( StateId )
            REFERENCES State ( StateId )
        )
    INSERT County ( StateId, CountyCode, Name ) VALUES
        ( 1, 'LE', 'Lee' ),
        ( 2, 'LE', 'Lee' ),
        ( 3, 'LE', 'Lee' )

When inserting Counties, we are forced to use a Record ID, instead of data, to identify the parent State
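
The same point, and the earlier remark about navigating every file between two distal files, can be sketched as follows. Part 1 assumes the three files exactly as declared above. Part 2 is a hypothetical sketch and is not the author's model: the `_R` tables, their constraint names and their sample rows are assumptions, showing only one possible way of making the keys up from the data (using the same codes and column lengths as above), so that the parent's key is carried into the child.

```sql
-- Part 1: the files as declared above (Record IDs).
-- County holds only StateId, so relating a County to its Country
-- has to navigate through the State file.
SELECT  Country.Name AS CountryName, County.Name AS CountyName
    FROM  County
    JOIN  State   ON State.StateId     = County.StateId
    JOIN  Country ON Country.CountryId = State.CountryId;

-- Part 2: hypothetical sketch, keys made up from the data.
CREATE TABLE Country_R (
    CountryCode  CHAR(2)   NOT NULL,
    Name         CHAR(30)  NOT NULL,
    CONSTRAINT Country_R_PK  PRIMARY KEY ( CountryCode ),
    CONSTRAINT Country_R_AK1 UNIQUE ( Name )
    );
CREATE TABLE State_R (
    CountryCode  CHAR(2)   NOT NULL,
    StateCode    CHAR(2)   NOT NULL,
    Name         CHAR(30)  NOT NULL,
    CONSTRAINT State_R_PK  PRIMARY KEY ( CountryCode, StateCode ),
    CONSTRAINT State_R_AK1 UNIQUE ( CountryCode, Name ),
    CONSTRAINT Country_R_ConsistsOf_State_R_fk
        FOREIGN KEY ( CountryCode )
        REFERENCES Country_R ( CountryCode )
    );
CREATE TABLE County_R (
    CountryCode  CHAR(2)   NOT NULL,
    StateCode    CHAR(2)   NOT NULL,
    CountyCode   CHAR(2)   NOT NULL,
    Name         CHAR(30)  NOT NULL,
    CONSTRAINT County_R_PK  PRIMARY KEY ( CountryCode, StateCode, CountyCode ),
    CONSTRAINT County_R_AK1 UNIQUE ( CountryCode, StateCode, Name ),
    CONSTRAINT State_R_IsMadeUpOf_County_R_fk
        FOREIGN KEY ( CountryCode, StateCode )
        REFERENCES State_R ( CountryCode, StateCode )
    );

-- The parent is identified by data, with no Record ID to look up.
INSERT Country_R ( CountryCode, Name )
    VALUES ( 'US', 'United States of America' );
INSERT State_R ( CountryCode, StateCode, Name )
    VALUES ( 'US', 'AL', 'Alabama' );
INSERT County_R ( CountryCode, StateCode, CountyCode, Name )
    VALUES ( 'US', 'AL', 'LE', 'Lee' );

-- County_R carries CountryCode, so the two distal tables relate directly.
SELECT  Country_R.Name AS CountryName, County_R.Name AS CountyName
    FROM  County_R
    JOIN  Country_R ON Country_R.CountryCode = County_R.CountryCode;
```

In the second form the County row itself states which Country and State it belongs to, and the Foreign Key on ( CountryCode, StateCode ) rejects a County whose State does not exist in that Country.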

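  • If a question requires an answer *this* long, perhaps consider voting to close it as too broad? (6 upvotes)
  • @PerformanceDBA I make a habit of reading your answers when I have time. My understanding is that Referential Integrity is what the DBMS provides through Foreign Key constraints. By Relational Integrity, you mean the integrity that is provided by Relational Keys (made up from the data; usually compound keys), and implemented by Foreign Key constraints. Using a surrogate as a "key" (a) does not guarantee row uniqueness, (b) leads to more JOINs, and (c) treats each entity as independent. (2 upvotes)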