How do I join only one row from a joined table in Postgres?

Ben*_*ier 35 sql postgresql join

I have the following schema:

CREATE TABLE author (
    id   integer
  , name varchar(255)
);
CREATE TABLE book (
    id        integer
  , author_id integer
  , title     varchar(255)
  , rating    integer
);

I want each author together with their last book:

SELECT book.id, author.id, author.name, book.title as last_book
FROM author
JOIN book book ON book.author_id = author.id

GROUP BY author.id
ORDER BY book.id ASC

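Apparently you can do this in MySQL: Join two tables in MySQL, returning just one row from the second table.

But Postgres gives this error:

ERROR: column "book.id" must appear in the GROUP BY clause or be used in an aggregate function: SELECT book.id, author.id, author.name, book.title as last_book FROM author JOIN book book ON book.author_id = author.id GROUP BY author.id ORDER BY book.id ASC

This is because:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

How can I tell Postgres: "give me only the last row of the joined table, when ordered by joined_table.id"?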


EDIT: With this data:

INSERT INTO author (id, name) VALUES
  (1, 'Bob')
, (2, 'David')
, (3, 'John');

INSERT INTO book (id, author_id, title, rating) VALUES
  (1, 1, '1st book from bob', 5)
, (2, 1, '2nd book from bob', 6)
, (3, 1, '3rd book from bob', 7)
, (4, 2, '1st book from David', 6)
, (5, 2, '2nd book from David', 6);

I should see:

book_id author_id name    last_book
3       1         "Bob"   "3rd book from bob"
5       2         "David" "2nd book from David"

Clo*_*eto 47

select distinct on (author.id)
    book.id, author.id, author.name, book.title as last_book
from
    author
    inner join
    book on book.author_id = author.id
order by author.id, book.id desc

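Check distinct on

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

With distinct on, the "distinct" columns must come first in the order by. If that is not the order you want, you need to wrap the query and reorder the result: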

select 
    *
from (
    select distinct on (author.id)
        book.id, author.id, author.name, book.title as last_book
    from
        author
        inner join
        book on book.author_id = author.id
    order by author.id, book.id desc
) authors_with_first_book
order by authors_with_first_book.name

Another solution is to use a window function, as in Lennart's answer. Yet another, very generic one is this:

select 
    book.id, author.id, author.name, book.title as last_book
from
    book
    inner join
    (
        select author.id as author_id, max(book.id) as book_id
        from
            author
            inner join
            book on author.id = book.author_id
        group by author.id
    ) s
    on s.book_id = book.id
    inner join
    author on book.author_id = author.id

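  • Does the job. `distinct on` is Postgres-specific. If there is another way, I'd be happy to know it. (2 upvotes)
  • `distinct on` is a nice feature, but keep in mind that it causes a sort, which is fine as long as it can be done in memory. Once the data set in the subquery gets large, the sort involves disk operations (temporary files are written to disk for sorting). (2 upvotes)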
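
On the sorting cost raised in the last comment: a composite index on book may let the planner read each author's rows in the required order instead of sorting them, though whether it is actually used depends on table sizes and statistics. A minimal sketch against the question's schema; the index name is made up for illustration.

```sql
-- Hypothetical index name; the columns are those of the question's book table.
-- Entries are ordered by author, newest book id first within each author.
CREATE INDEX book_author_id_id_desc_idx ON book (author_id, id DESC);

-- Equivalent to the DISTINCT ON query above, but written against
-- book.author_id so that the ORDER BY matches the index columns.
SELECT DISTINCT ON (book.author_id)
       book.id AS book_id, author.id AS author_id, author.name, book.title AS last_book
FROM book
JOIN author ON author.id = book.author_id
ORDER BY book.author_id, book.id DESC;
```

Running the query under EXPLAIN ANALYZE shows whether a Sort node is still present in the plan.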

小智 10

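I did something similar for a chat system, where room holds the metadata and list contains the messages. I ended up using the PostgreSQL LATERAL JOIN, which worked like a charm.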

SELECT MR.id AS room_id, MR.created_at AS room_created, 
    lastmess.content as lastmessage_content, lastmess.datetime as lastmessage_when
FROM message.room MR
    LEFT JOIN LATERAL (
        SELECT content, datetime
        FROM message.list
        WHERE room_id = MR.id
        ORDER BY datetime DESC 
        LIMIT 1) lastmess ON true
ORDER BY lastmessage_when DESC NULLS LAST, MR.created_at DESC

For more information, see https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral
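
Applied to the author/book schema from the question, the same idea might look like the sketch below. It assumes PostgreSQL 9.3 or later (where LATERAL is available) and that "last" means the highest book.id, as in the question.

```sql
-- For each author, the lateral subquery picks that author's newest book.
-- CROSS JOIN LATERAL drops authors without any book (John in the sample data);
-- use LEFT JOIN LATERAL (...) b ON true to keep them with NULL book columns.
SELECT b.id AS book_id, a.id AS author_id, a.name, b.title AS last_book
FROM author a
CROSS JOIN LATERAL (
    SELECT book.id, book.title
    FROM book
    WHERE book.author_id = a.id
    ORDER BY book.id DESC
    LIMIT 1
) b
ORDER BY a.id;
```

Because the subquery is limited to one row per author, no GROUP BY or DISTINCT is needed.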


Tao*_*hok 9

You can add a condition to the join so that it matches only one row. This works for me.

Like this:

SELECT 
    book.id, 
    auth1.id, 
    auth1.name, 
    book.title as last_book
FROM author auth1
JOIN book book ON (book.author_id = auth1.id AND book.id = (select max(b.id) from book b where b.author_id = auth1.id))
ORDER BY book.id ASC

This way you get the data from the book with the highest ID. You could add a "date" column and do the same with max(date).
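
A sketch of the date variant just mentioned. The question's book table has no date column, so published_on below is a hypothetical column added purely for illustration.

```sql
-- Assumption: book has been extended with a published_on date column,
-- e.g. ALTER TABLE book ADD COLUMN published_on date;
SELECT bk.id AS book_id, au.id AS author_id, au.name, bk.title AS last_book
FROM author au
JOIN book bk
  ON bk.author_id = au.id
 AND bk.published_on = (SELECT max(b.published_on)
                        FROM book b
                        WHERE b.author_id = au.id)
ORDER BY bk.id;
```

Unlike max(id), a date is not necessarily unique: if an author has two books with the same latest published_on, both rows are returned.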


wil*_*ser 6

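This may look archaic and overly simple, but it does not depend on window functions, CTEs, or aggregating subqueries. In most cases it is also the fastest.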

SELECT bk.id, au.id, au.name, bk.title as last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
WHERE NOT EXISTS (
    SELECT *
    FROM book nx
    WHERE nx.author_id = bk.author_id
    AND nx.id > bk.id
    )
ORDER BY bk.id ASC
    ;

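  • The anti-join approach was faster for me than the other solutions suggested here. Always check with `EXPLAIN ANALYZE`. This surprised me too, @Ajax. (4 upvotes)
  • `EXISTS()` is older than sql92-style joins. Before outer joins existed, we had to construct them with `select ... from a where not exists(select ... from b where ...) union all select ... from a,b where ...`, which is why developers put a lot of effort into implementing them. Indexes, if available, are used to implement the anti-join. [on most platforms] (2 upvotes)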
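
To compare the candidates on your own data, as the first comment suggests, prefix each query with EXPLAIN ANALYZE. A sketch for the anti-join, using the column names from the question's schema; note that EXPLAIN ANALYZE actually executes the statement.

```sql
-- Anti-join: keep a book only if the same author has no book with a higher id.
EXPLAIN ANALYZE
SELECT bk.id, au.id, au.name, bk.title AS last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
WHERE NOT EXISTS (
    SELECT 1
    FROM book nx
    WHERE nx.author_id = bk.author_id
      AND nx.id > bk.id
);
```

The output lists the chosen plan with actual row counts and timings; repeating this for the DISTINCT ON and window-function versions gives a like-for-like comparison.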

Len*_*art 5

Here is one way:

SELECT book_id, author_id, author_name, last_book
FROM (
    SELECT b.id as book_id
         , a.id as author_id
         , a.name as author_name
         , b.title as last_book
         , row_number() over (partition by a.id
                              order by b.id desc) as rn
    FROM author a
    JOIN book b 
        ON b.author_id = a.id
) last_books
WHERE rn = 1;