Why does my SQL 'NOT IN' clause produce different results from 'NOT EXISTS'?

Ste*_*eet 3 sql sql-server-2005

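I have two SQL queries that produce different results when I expect them to produce the same result. I am trying to find the number of events that have no corresponding location. Every location has an event, but events can also be linked to non-location records.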

The following query produces a count of 16244, which is the correct value.

SELECT COUNT(DISTINCT e.event_id)   
FROM   events AS e   
WHERE  NOT EXISTS   
  (SELECT * FROM locations AS l WHERE l.event_id = e.event_id)    

The following query produces a count of 0.

SELECT COUNT(DISTINCT e.event_id) 
FROM   events AS e
WHERE  e.event_id NOT IN (SELECT  l.event_id FROM locations AS l)

The following SQL summarizes the data set:

SELECT  'Event Count', 
        COUNT(DISTINCT event_id) 
        FROM events

UNION ALL

SELECT  'Locations Count', 
        COUNT(DISTINCT event_id) 
        FROM locations

UNION ALL

SELECT  'Event+Location Count', 
        COUNT(DISTINCT l.event_id) 
        FROM locations AS l  JOIN events AS e ON l.event_Id = e.event_id

and returns the following results:

Event Count             139599
Locations Count         123355
Event+Location Count    123355

Can anyone explain why the two initial queries do not produce the same number?

Mar*_*ith 6

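The subquery SELECT l.event_id FROM locations AS l returns at least one NULL, so NOT IN can never evaluate to true. For every event it evaluates to either false (a match was found) or unknown, and the query therefore counts 0 rows.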

SELECT COUNT(DISTINCT e.event_id) 
FROM   events AS e
WHERE  e.event_id NOT IN (SELECT  l.event_id FROM locations AS l)

The reason for this behavior can be seen from the example below.

'x' NOT IN (NULL, 'a', 'b')

≡ 'x' <> NULL AND 'x' <> 'a' AND 'x' <> 'b'

≡ Unknown AND True AND True

≡ Unknown
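
The effect is easy to reproduce on a tiny data set. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption; SQLite applies the same three-valued logic to NOT IN), and the three events and two location rows are made up for the demonstration.

```python
import sqlite3

# In-memory database with made-up rows; one locations.event_id is NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER);
    CREATE TABLE locations (event_id INTEGER);
    INSERT INTO events VALUES (1), (2), (3);
    INSERT INTO locations VALUES (1), (NULL);
""")

# Same shape as the NOT EXISTS query from the question.
not_exists_count = conn.execute("""
    SELECT COUNT(DISTINCT e.event_id)
    FROM   events AS e
    WHERE  NOT EXISTS
      (SELECT * FROM locations AS l WHERE l.event_id = e.event_id)
""").fetchone()[0]

# Same shape as the NOT IN query from the question.
not_in_count = conn.execute("""
    SELECT COUNT(DISTINCT e.event_id)
    FROM   events AS e
    WHERE  e.event_id NOT IN (SELECT l.event_id FROM locations AS l)
""").fetchone()[0]

# The bare predicate from the example; SQLite reports "unknown" as NULL,
# which Python receives as None.
predicate = conn.execute("SELECT 'x' NOT IN (NULL, 'a', 'b')").fetchone()[0]

print(not_exists_count, not_in_count, predicate)
```

With this data, NOT EXISTS counts the two events without a location, while NOT IN counts none, because the predicate is never true for any row.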


Mit*_*eat 5

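NOT IN behaves differently when the list contains NULL values. A single NULL is enough to keep the predicate from ever evaluating to true, so no rows are returned.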

So at least one event_id in locations is NULL.
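
One way to confirm this, and to see what NOT IN returns once the NULLs are excluded, is sketched below. This is not part of the original answer: filtering the subquery with IS NOT NULL is a common workaround, and the example assumes Python's built-in sqlite3 module as a stand-in for SQL Server, with made-up rows.

```python
import sqlite3

# In-memory database with made-up rows; one locations.event_id is NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER);
    CREATE TABLE locations (event_id INTEGER);
    INSERT INTO events VALUES (1), (2), (3);
    INSERT INTO locations VALUES (1), (NULL);
""")

# Check: how many location rows have no event_id at all?
null_rows = conn.execute(
    "SELECT COUNT(*) FROM locations WHERE event_id IS NULL"
).fetchone()[0]

# Workaround: keep NULLs out of the NOT IN list.
filtered_count = conn.execute("""
    SELECT COUNT(DISTINCT e.event_id)
    FROM   events AS e
    WHERE  e.event_id NOT IN
      (SELECT l.event_id FROM locations AS l WHERE l.event_id IS NOT NULL)
""").fetchone()[0]

print(null_rows, filtered_count)
```

Note that COUNT(DISTINCT event_id) skips NULLs, so a summary count over locations will not reveal them; the explicit IS NULL check does.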

Also, your query might be better written as a join:

SELECT 
    COUNT(DISTINCT e.event_id)    
FROM
    events AS e  
    LEFT JOIN locations AS l ON e.event_id = l.event_id
WHERE
    l.event_id IS NULL

[Update: apparently, the NOT EXISTS version is faster.]