tim*_*imz 3 postgresql database-design
我有3张桌子。1 表(表 A)有某种疾病(例如心脏病)的患者,另一个表是有各种疾病的患者(表 B)。第三个表是结果表。
我想要做的是将表 A 中的心脏病患者(表 A 仅是心脏病患者)与表 B 中的肾脏疾病患者进行比较。如果表 B 中存在命中(返回结果),则更新结果表该 ID 列肾脏疾病下的“x”。
结果表目前采用以下格式:
ID|肾脏|CHD|糖尿病……还有另外10种疾病
总结:我们用指定的疾病名称检查表 A 与表 B,如果有返回结果patientid,则结果表更新为该特定疾病名称下的“x” patientID。然后我将重新运行查询更改疾病名称(例如从肾脏到癌症)以查明心脏病患者是否患有癌症?
病人编号 1 2 3 4 5 ....一路到812(812心脏病患者)
患者 ID|年龄|疾病 1|疾病 2|疾病 3...一直到疾病 10 1 50 肾肺 2 35 肾心糖尿病 3 94 癌性肺炎 CHD
想要发生的是,对于PatientID表 A 中的每个,我们将表 B 中疾病 1-10 的特定疾病(例如肾脏)与表 B 进行比较,如果有结果,我们然后用'更新结果表x' 在结果表中该特定疾病列下。因此,在我们的示例中,对于患者 1 和 2,肾脏下方应该有一个“x”。
使用我们的样本数据,当我们运行肾脏时,结果表应该如下所示:
ID|肾脏|冠心病|糖尿病 1 次 2 x
谁能指出我为此编写查询的方向?该查询patientID从表 A 中检查所有(其中 812 个)表 B 中的特定疾病,如果它返回结果,则更新该疾病和结果表中带有“x”的患者。
如果有更好的方法来做到这一点,请告诉我如何去做!
I propose implementing a different and more versatile database structure, because the addition of a (base) table that retains persons or patients that are affected by only one particular disease introduces update anomalies and is unnecessary.
I consider that the following business rules formulations are of prime importance:
A Person is affected by zero-one-or-many DiseasesA Disease affects one-to-many PersonsTherefore, you want to implement a common many-to-many (M:N) relationship.
Then, from such formulations, I have derived an IDEF1X1 logical model that is shown in Figure 1:

As demonstrated, it is very important to define the entity types that are involved in the business domain under consideration, their attributes and how they are interrelated (including the cardinality of the relevant relationships).
In this case Person and Disease are independent entity types, each with their own attributes, and they are interrelated by way of an associative entity type, that I called PersonDisease, which holds the attributes that pertain to the relationship that takes place between the two mentioned entity types.
Each PersonDisease occurrence (or row once it is stored in a SQL database) is uniquely identified by PersonId along with DiseaseNumber, so I fixed this combination of attributes as the PRIMARY KEY of this associative entity type.
Derivable data
This model indicates that the Table B included in your answer is derivable data (obtained via DML operations that combine or compute columns from multiple tables), and not an entity type, neither independent nor associative; therefore you should not set it up as a base table.
Consequently, I created the following DDL structure in terms of the logical model discussed above:
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- Also, you should carry out some testing sessions to define
-- the most convenient INDEXing strategies.
-- As one would expect, you are free to make use of
-- your preferred (or required) naming conventions.
CREATE TABLE Person
(
PersonId INT NOT NULL,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL,
GenderCode CHAR(3) NOT NULL,
BirthDate DATE NOT NULL,
CreatedDateTime TIMESTAMP NOT NULL,
CONSTRAINT PK_Person PRIMARY KEY (PersonId),
CONSTRAINT AK_Person_FirstName_LastName_Gender_and_BirthDate UNIQUE -- Composite ALTERNATE KEY.
(
FirstName,
LastName,
GenderCode,
BirthDate
)
);
CREATE TABLE Disease
(
DiseaseNumber INT NOT NULL,
Name CHAR(30) NOT NULL,
ParticularColumn CHAR(30) NOT NULL,
SpecificColumn CHAR(30) NOT NULL,
Etcetera CHAR(30) NOT NULL,
CreatedDateTime TIMESTAMP NOT NULL,
CONSTRAINT PK_Disease PRIMARY KEY (DiseaseNumber),
CONSTRAINT AK_Disease_Name UNIQUE (Name)-- ALTERNATE KEY.
);
CREATE TABLE PersonDisease -- Associative table.
(
PersonId INT NOT NULL,
DiseaseNumber INT NOT NULL,
DiagnosedDate DATE NOT NULL,
CreatedDateTime TIMESTAMP NOT NULL,
CONSTRAINT PK_PersonDisease PRIMARY KEY (PersonId, DiseaseNumber), -- Composite PRIMARY KEY.
CONSTRAINT FK_from_PersonDisease_to_Person FOREIGN KEY (PersonId)
REFERENCES Person (PersonId),
CONSTRAINT FK_from_PersonDisease_to_Disease FOREIGN KEY (DiseaseNumber)
REFERENCES Disease (DiseaseNumber)
);
Run Code Online (Sandbox Code Playgroud)
This structure permits
Disease instances (via simple INSERT operations) andPerson instances with n Disease instances.Disease data independently of that which pertains to Person.And avoids
Sample data
Let us suposse that the relevant database holds the data that follows:
-- 'Person' sample data.
INSERT INTO
person (PersonId, FirstName, LastName, GenderCode, BirthDate, CreatedDateTime)
VALUES
(1, 'David', 'Smith', 'M', CURRENT_DATE, '20161031 13:14:02');
INSERT INTO
Person (PersonId, FirstName, LastName, GenderCode, BirthDate, CreatedDateTime)
VALUES
(2, 'Nicole', 'Johnson', 'F', CURRENT_DATE, '20161031 13:14:02');
-- 'Disease' sample data.
INSERT INTO
Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES
(1, 'Disease A', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');
INSERT INTO
Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES
(2, 'Disease B', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');
INSERT INTO
Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES
(3, 'Disease C', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');
INSERT INTO
Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES
(4, 'Disease D', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');
-- 'PersonDisease' sample data.
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(1, 1, '20161031', '20161031 13:14:02');
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(1, 2, '20161031', '20161031 13:14:02');
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(2, 1, '20161031', '20161031 13:14:02');
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(2, 2, '20161031', '20161031 13:14:02');
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(2, 3, '20161031', '20161031 13:14:02');
INSERT INTO
PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES
(2, 4, '20161031', '20161031 13:14:02');
Run Code Online (Sandbox Code Playgroud)
So you might like to retrieve all the diseases by which all the persons are affected via, e.g., the statement displayed bellow:
SELECT Person.FirstName AS "Patient first name",
Person.LastName AS "Patient last name",
Disease.Name AS "Disease name",
PersonDisease.DiagnosedDate AS "Diagnosed date"
FROM PersonDisease
JOIN Disease
ON Disease.DiseaseNumber = PersonDisease.DiseaseNumber
JOIN Person
ON Person.PersonId = PersonDisease.PersonId;
Run Code Online (Sandbox Code Playgroud)
As is also exemplified in this SQL Fiddle.
Inspecting all the other diseases suffered by persons that have heart disease
Let us assume that heart disease is primarily identified by disease number 3, so you can construct a SELECT statement like, e.g., the following one to see what other diseases are suffered by people with heart disease:
SELECT PersonDisease.PersonId,
Person.FirstName,
Person.LastName,
PersonDisease.DiseaseNumber,
Disease.Name
FROM PersonDisease
JOIN Person
ON Person.PersonId = PersonDisease.PersonId
JOIN Disease
ON Disease.DiseaseNumber = PersonDisease.DiseaseNumber
WHERE Person.PersonId IN (SELECT PersonId
FROM PersonDisease
WHERE DiseaseNumber = 3)
AND PersonDisease.DiseaseNumber <> 3;
Run Code Online (Sandbox Code Playgroud)
You can review a comparable method in action in this new SQL Fiddle.
Of course, this and the query discussed in the preceding section can be defined as VIEWs so that you can SELECT data directly FROM them, if necessary.
There might well be other approaches to solve this requirement like, e.g., comparing the data stored in PersonDisease with a VIEW that includes only the persons who have heart disease, or a CTE as suggested by @Joishi Bodio in their answer. Some of them could be served by faster execution plans than the ones that correspond to the other approaches, so you should naturally make some tests and implement a reliable INDEXing strategy.
Using the CROSSTAB() Function
It appears that you want to construct some kind of PIVOT query, so you may find of help this Stack Overflow answer in which @Erwin Brandsteter details several usages of the CROSSTAB() function that, according to the PostgreSQL documentation, is included in the tablefunc module.
Answer focus
The objective should be, as discussed above, to avoid appending the superfluous disease-specific base table.
1. 对信息建模集成定义(IDEF1X)是被确立为一个非常可取的数据建模技术标准由1993年12月美国标准与技术研究所(NIST)。它坚实基础的(一)早期的理论工作撰写由始发的的关系模型,即EF科德博士; 关于 (b)实体关系视图,由PP Chen 博士开发;以及 (c) 逻辑数据库设计技术,由 Robert G. Brown 创建。
| 归档时间: |
|
| 查看次数: |
189 次 |
| 最近记录: |