比较 ID，如果为真则更新另一个表

Question

比较 ID，如果为真则更新另一个表

我有3张桌子。1 表（表 A）有某种疾病（例如心脏病）的患者，另一个表是有各种疾病的患者（表 B）。第三个表是结果表。

我想要做的是将表 A 中的心脏病患者（表 A 仅是心脏病患者）与表 B 中的肾脏疾病患者进行比较。如果表 B 中存在命中（返回结果），则更新结果表该 ID 列肾脏疾病下的“x”。

结果表目前采用以下格式：

ID|肾脏|CHD|糖尿病……还有另外10种疾病

总结：我们用指定的疾病名称检查表 A 与表 B，如果有返回结果patientid，则结果表更新为该特定疾病名称下的“x” patientID。然后我将重新运行查询更改疾病名称（例如从肾脏到癌症）以查明心脏病患者是否患有癌症？

样本数据

表A

病人编号
1
2
3
4
5
....一路到812（812心脏病患者）

表B

患者 ID|年龄|疾病 1|疾病 2|疾病 3...一直到疾病 10
1 50 肾肺
2 35 肾心糖尿病
3 94 癌性肺炎 CHD

想要发生的是，对于PatientID表 A 中的每个，我们将表 B 中疾病 1-10 的特定疾病（例如肾脏）与表 B 进行比较，如果有结果，我们然后用'更新结果表x' 在结果表中该特定疾病列下。因此，在我们的示例中，对于患者 1 和 2，肾脏下方应该有一个“x”。

使用我们的样本数据，当我们运行肾脏时，结果表应该如下所示：

ID|肾脏|冠心病|糖尿病
 1 次
 2 x

谁能指出我为此编写查询的方向？该查询patientID从表 A 中检查所有（其中 812 个）表 B 中的特定疾病，如果它返回结果，则更新该疾病和结果表中带有“x”的患者。

如果有更好的方法来做到这一点，请告诉我如何去做！

Answer 1

MDC*_*CCL 6

I propose implementing a different and more versatile database structure, because the addition of a (base) table that retains persons or patients that are affected by only one particular disease introduces update anomalies and is unnecessary.

Business rules

I consider that the following business rules formulations are of prime importance:

A Person is affected by zero-one-or-many Diseases
A Disease affects one-to-many Persons

Therefore, you want to implement a common many-to-many (M:N) relationship.

Illustrative logical model

Then, from such formulations, I have derived an IDEF1X¹ logical model that is shown in Figure 1:

图 1 - 人和疾病的逻辑模型

As demonstrated, it is very important to define the entity types that are involved in the business domain under consideration, their attributes and how they are interrelated (including the cardinality of the relevant relationships).

In this case Person and Disease are independent entity types, each with their own attributes, and they are interrelated by way of an associative entity type, that I called PersonDisease, which holds the attributes that pertain to the relationship that takes place between the two mentioned entity types.

Each PersonDisease occurrence (or row once it is stored in a SQL database) is uniquely identified by PersonId along with DiseaseNumber, so I fixed this combination of attributes as the PRIMARY KEY of this associative entity type.

Derivable data

This model indicates that the Table B included in your answer is derivable data (obtained via DML operations that combine or compute columns from multiple tables), and not an entity type, neither independent nor associative; therefore you should not set it up as a base table.

Expository implementation

Consequently, I created the following DDL structure in terms of the logical model discussed above:

-- You should determine which are the most fitting 
-- data types and sizes for all your table columns 
-- depending on your business context characteristics.

-- Also, you should carry out some testing sessions to define 
-- the most convenient INDEXing strategies.

-- As one would expect, you are free to make use of 
-- your preferred (or required) naming conventions.

CREATE TABLE Person
(
    PersonId        INT       NOT NULL,
    FirstName       CHAR(30)  NOT NULL,
    LastName        CHAR(30)  NOT NULL,
    GenderCode      CHAR(3)   NOT NULL,
    BirthDate       DATE      NOT NULL,
    CreatedDateTime TIMESTAMP NOT NULL,
    CONSTRAINT PK_Person PRIMARY KEY (PersonId),
    CONSTRAINT AK_Person_FirstName_LastName_Gender_and_BirthDate UNIQUE -- Composite ALTERNATE KEY.
    (
        FirstName,
        LastName,
        GenderCode,
        BirthDate
    )
);

CREATE TABLE Disease
(
    DiseaseNumber    INT       NOT NULL,
    Name             CHAR(30)  NOT NULL,
    ParticularColumn CHAR(30)  NOT NULL,
    SpecificColumn   CHAR(30)  NOT NULL,
    Etcetera         CHAR(30)  NOT NULL,
    CreatedDateTime  TIMESTAMP NOT NULL,
    CONSTRAINT PK_Disease      PRIMARY KEY (DiseaseNumber),
    CONSTRAINT AK_Disease_Name UNIQUE      (Name)-- ALTERNATE KEY.
);

CREATE TABLE PersonDisease -- Associative table.
(
    PersonId        INT       NOT NULL,
    DiseaseNumber   INT       NOT NULL,
    DiagnosedDate   DATE      NOT NULL,
    CreatedDateTime TIMESTAMP NOT NULL,
    CONSTRAINT PK_PersonDisease                 PRIMARY KEY (PersonId, DiseaseNumber), -- Composite PRIMARY KEY.
    CONSTRAINT FK_from_PersonDisease_to_Person  FOREIGN KEY (PersonId)
        REFERENCES Person  (PersonId),
    CONSTRAINT FK_from_PersonDisease_to_Disease FOREIGN KEY (DiseaseNumber)
        REFERENCES Disease (DiseaseNumber)
);

Run Code Online (Sandbox Code Playgroud)

This structure permits

storing n Disease instances (via simple INSERT operations) and
storing the relationship between n Person instances with n Disease instances.
managing Disease data independently of that which pertains to Person.

And avoids

employing ad hoc procedures that have to be carried out due to the update anomalies introduced by an additional disease-specific table.

Sample data

Let us suposse that the relevant database holds the data that follows:

-- 'Person' sample data.   
INSERT INTO 
    person (PersonId, FirstName, LastName, GenderCode, BirthDate, CreatedDateTime)
VALUES 
    (1, 'David', 'Smith', 'M', CURRENT_DATE, '20161031 13:14:02');

INSERT INTO 
    Person (PersonId, FirstName, LastName, GenderCode, BirthDate, CreatedDateTime)
VALUES 
    (2, 'Nicole', 'Johnson', 'F', CURRENT_DATE, '20161031 13:14:02');


-- 'Disease' sample data.
INSERT INTO 
    Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES 
   (1, 'Disease A', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');

INSERT INTO 
    Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES 
    (2, 'Disease B', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');

INSERT INTO 
    Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES 
    (3, 'Disease C', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');

INSERT INTO 
    Disease (DiseaseNumber, Name, ParticularColumn, SpecificColumn, Etcetera, CreatedDateTime)
VALUES 
    (4, 'Disease D', 'Particular test', 'Specific test', 'Etcetera test', '20161031 13:14:02');

-- 'PersonDisease' sample data.

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
   (1, 1, '20161031', '20161031 13:14:02');

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
    (1, 2, '20161031', '20161031 13:14:02');

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
    (2, 1, '20161031', '20161031 13:14:02');

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
   (2, 2, '20161031', '20161031 13:14:02');

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
    (2, 3, '20161031', '20161031 13:14:02');

INSERT INTO 
    PersonDisease (PersonId, DiseaseNumber, DiagnosedDate, CreatedDateTime)
VALUES 
(2, 4, '20161031', '20161031 13:14:02');

Run Code Online (Sandbox Code Playgroud)

So you might like to retrieve all the diseases by which all the persons are affected via, e.g., the statement displayed bellow:

SELECT Person.FirstName            AS "Patient first name",
       Person.LastName             AS "Patient last name",
       Disease.Name                AS "Disease name",
       PersonDisease.DiagnosedDate AS "Diagnosed date"
  FROM PersonDisease
  JOIN Disease
    ON Disease.DiseaseNumber = PersonDisease.DiseaseNumber
  JOIN Person
    ON Person.PersonId       = PersonDisease.PersonId;

Run Code Online (Sandbox Code Playgroud)

As is also exemplified in this SQL Fiddle.

Inspecting all the other diseases suffered by persons that have heart disease

Let us assume that heart disease is primarily identified by disease number 3, so you can construct a SELECT statement like, e.g., the following one to see what other diseases are suffered by people with heart disease:

SELECT PersonDisease.PersonId,
       Person.FirstName,
       Person.LastName,
       PersonDisease.DiseaseNumber,
       Disease.Name
  FROM PersonDisease
  JOIN Person
    ON Person.PersonId              = PersonDisease.PersonId
  JOIN Disease
    ON Disease.DiseaseNumber        = PersonDisease.DiseaseNumber
 WHERE Person.PersonId              IN (SELECT PersonId
                                          FROM PersonDisease
                                         WHERE DiseaseNumber = 3)
   AND PersonDisease.DiseaseNumber <> 3;

Run Code Online (Sandbox Code Playgroud)

You can review a comparable method in action in this new SQL Fiddle.

Of course, this and the query discussed in the preceding section can be defined as VIEWs so that you can SELECT data directly FROM them, if necessary.

There might well be other approaches to solve this requirement like, e.g., comparing the data stored in PersonDisease with a VIEW that includes only the persons who have heart disease, or a CTE as suggested by @Joishi Bodio in their answer. Some of them could be served by faster execution plans than the ones that correspond to the other approaches, so you should naturally make some tests and implement a reliable INDEXing strategy.

Using the CROSSTAB() Function

It appears that you want to construct some kind of PIVOT query, so you may find of help this Stack Overflow answer in which @Erwin Brandsteter details several usages of the CROSSTAB() function that, according to the PostgreSQL documentation, is included in the tablefunc module.

Answer focus

The objective should be, as discussed above, to avoid appending the superfluous disease-specific base table.

Endnote

_{1. 对信息建模集成定义（IDEF1X）是被确立为一个非常可取的数据建模技术标准由1993年12月美国标准与技术研究所（NIST）。它坚实基础的（一）早期的理论工作撰写由始发的的关系模型，即EF科德博士; 关于 (b)实体关系视图，由PP Chen 博士开发；以及 (c) 逻辑数据库设计技术，由 Robert G. Brown 创建。}

归档时间：	9 年前
查看次数：	189 次
最近记录：	9 年前