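I need to find potentially matching customer records within the same table. The logic is shown below. However, this appears to execute in O(N²) time. Is there any way to improve the performance here? I have tried adding indexes, hashing the columns and comparing the hashes, and so on, but performance on large datasets is still very poor. I have also added the query plan below.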
SELECT
    C1.CustomerId AS Customer1,
    C2.CustomerId AS Customer2
FROM Customer C1
INNER JOIN Customer C2
    ON C1.CustomerId != C2.CustomerId
    AND (C1.FirstName = C2.FirstName OR C1.BirthDate = C2.BirthDate)
    AND (
        C1.EmailAddress = C2.EmailAddress
        OR C1.MobilePhoneNumber = C2.MobilePhoneNumber
        OR (
            C1.HomeAddressLine1 = C2.HomeAddressLine1
            AND (
                C1.HomePostCode = C2.HomePostCode
                OR C1.HomeSuburb = C2.HomeSuburb
            )
        )
    )
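
For comparison, here is a sketch of how the same matching logic could be expressed as a union of equi-joins, one branch per alternative in the second OR group. This is only an illustration, not a measured fix. It assumes `CustomerId` is unique, so that `UNION` removing duplicate pairs does not change the result, and it assumes the optimizer can use a hash join or an index seek on each branch's equality key. Whether it actually runs faster depends on the data, for example on how many rows share the same email, phone number, or address.

```sql
-- Sketch only: each branch joins on a plain equality key, and the
-- (FirstName OR BirthDate) check is applied as a residual filter.
-- Assumes CustomerId is unique in Customer.

-- Branch 1: same email address
SELECT C1.CustomerId AS Customer1, C2.CustomerId AS Customer2
FROM Customer C1
INNER JOIN Customer C2
    ON C1.EmailAddress = C2.EmailAddress
WHERE C1.CustomerId <> C2.CustomerId
  AND (C1.FirstName = C2.FirstName OR C1.BirthDate = C2.BirthDate)

UNION

-- Branch 2: same mobile phone number
SELECT C1.CustomerId, C2.CustomerId
FROM Customer C1
INNER JOIN Customer C2
    ON C1.MobilePhoneNumber = C2.MobilePhoneNumber
WHERE C1.CustomerId <> C2.CustomerId
  AND (C1.FirstName = C2.FirstName OR C1.BirthDate = C2.BirthDate)

UNION

-- Branch 3: same address line and same post code
SELECT C1.CustomerId, C2.CustomerId
FROM Customer C1
INNER JOIN Customer C2
    ON C1.HomeAddressLine1 = C2.HomeAddressLine1
   AND C1.HomePostCode = C2.HomePostCode
WHERE C1.CustomerId <> C2.CustomerId
  AND (C1.FirstName = C2.FirstName OR C1.BirthDate = C2.BirthDate)

UNION

-- Branch 4: same address line and same suburb
SELECT C1.CustomerId, C2.CustomerId
FROM Customer C1
INNER JOIN Customer C2
    ON C1.HomeAddressLine1 = C2.HomeAddressLine1
   AND C1.HomeSuburb = C2.HomeSuburb
WHERE C1.CustomerId <> C2.CustomerId
  AND (C1.FirstName = C2.FirstName OR C1.BirthDate = C2.BirthDate);
```

`UNION` (rather than `UNION ALL`) is used so that a pair matching on more than one key, say both email and phone, is returned once, as in the original query.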