Performance of DELETE with NOT IN (SELECT ...)

Seb*_*bas 5 sql postgresql postgresql-performance sql-delete

I have these two tables and want to delete all authors from ms_author that are not in author.

author (1.6 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI | true  |
| name  | text        | YES  |     |       |
+-------+-------------+------+-----+-------+

ms_author (120 million rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI |       |
| name  | text        | YES  |     | true  |
+-------+-------------+------+-----+-------+

Here is my query:

DELETE
FROM ms_author AS m
WHERE m.name NOT IN
      (SELECT a.name
       FROM author AS a);

I tried to estimate the query duration: ~130 hours.
Is there a faster way to achieve this?

Edit:

EXPLAIN VERBOSE output:

Delete on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)
  ->  Seq Scan on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)
        Output: m.ctid
        Filter: (NOT (SubPlan 1))
        SubPlan 1
          ->  Materialize  (cost=0.00..44334.43 rows=1660295 width=15)
                Output: a.name
                ->  Seq Scan on public.author a  (cost=0.00..27925.95 rows=1660295 width=15)
                      Output: a.name

Index on author(name):

create index author_name on author(name);

Index on ms_author(name):

create index ms_author_name on ms_author(name);

Ham*_*one 5

I'm a big fan of the "anti-join". It works efficiently for both large and small data sets:

delete from ms_author ma
where not exists (
  select null
  from author a
  where ma.name = a.name
)
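
Before running the delete against 120 million rows, it may be worth checking which plan the planner picks for the statement above. A minimal sketch: plain EXPLAIN (without ANALYZE) only plans the statement and does not delete anything. Whether the plan shows an anti-join node on your system is an assumption to verify, not something taken from the question.

```sql
-- Plan only: EXPLAIN without ANALYZE does not execute the DELETE.
EXPLAIN
DELETE FROM ms_author ma
WHERE NOT EXISTS (
  SELECT 1
  FROM author a
  WHERE a.name = ma.name
);
```

If the top of the plan still shows a Seq Scan filtered by a plain SubPlan, as in the EXPLAIN VERBOSE output from the question, the rewrite has not helped and the estimate should be revisited before running it for real.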

  • This is the way to go. `NOT IN (SELECT ...)` is a tricky clause. There are typically [better alternatives](http://stackoverflow.com/a/19364694/939860). (2 upvotes)
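
One reason NOT IN is tricky: if the subquery returns even a single NULL, `x NOT IN (...)` can never evaluate to true, so no row qualifies. Since author.name is nullable in the schema above, that could apply here. A small sketch using throwaway temp tables (t_author and t_ms_author are made-up names for illustration, not the real tables):

```sql
-- Hypothetical scratch tables, not the real ones from the question.
CREATE TEMP TABLE t_author (name text);
CREATE TEMP TABLE t_ms_author (name text);

INSERT INTO t_author VALUES ('alice'), (NULL);
INSERT INTO t_ms_author VALUES ('alice'), ('bob');

-- NOT IN: 'bob' = NULL evaluates to NULL, so the predicate is never true.
SELECT count(*)
FROM t_ms_author m
WHERE m.name NOT IN (SELECT a.name FROM t_author a);   -- counts 0 rows

-- NOT EXISTS: the NULL row simply fails to match, so 'bob' qualifies.
SELECT count(*)
FROM t_ms_author m
WHERE NOT EXISTS (
  SELECT 1 FROM t_author a WHERE a.name = m.name
);                                                     -- counts 1 row
```

The two forms also differ for rows in ms_author whose own name is NULL: NOT EXISTS treats them as unmatched (so they would be deleted), while NOT IN against a non-empty list does not select them. It is worth deciding which behavior is wanted before switching.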