Joining large tables in ClickHouse: out of memory or slow

aim*_*eld 0 join materialized-views clickhouse

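I have 3 large tables (>100 GB, millions of rows each): events, page_views, and sessions. The tables are linked by 1-n relationships; see the table setup below. I am trying to create a denormalized events_wide table that holds one row per event, with the columns of the corresponding page_views and sessions rows joined in. The idea is to eliminate the joins that complex analytical queries would otherwise need, because those joins are slow.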

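I created a materialized view, events_mv, that joins the page_views and sessions tables to the events table. Whenever a new event is inserted into events, the materialized view should insert a row into events_wide, automatically joining in the page view and the session. However, when I insert a new event, the query either never completes or is terminated with an out-of-memory error.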

Even running this simple join query from events to page_views results in an out-of-memory error: Memory limit (for user) exceeded: would use 99.21 GiB. I am using a ClickHouse Cloud production instance with 24+ GB RAM:

SELECT
    -- Select columns from events and page_views
FROM events AS e
LEFT JOIN page_views AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
LIMIT 3;

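I tried different primary key orderings for the 3 tables ((property_id, created_at, id) and (property_id, id, created_at)), different join algorithms (partial_merge, auto, grace_hash), and ANY LEFT JOIN, all without success. Using UUIDs instead of numeric IDs may be part of the problem, but unfortunately I cannot change the UUIDs.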

Here is my table setup with the (property_id, id, created_at) primary key:

CREATE TABLE events
(
    id UUID,
    created_at DateTime('UTC'),
    property_id Int,
    page_view_id Nullable(UUID),
    session_id Nullable(UUID),
    ...
) ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);

CREATE TABLE page_views
(
    id UUID,
    created_at DateTime('UTC'),
    modified_at DateTime('UTC'),
    property_id Int,  -- referenced by the PRIMARY KEY and by the joins below
    session_id Nullable(UUID),
    ...
) ENGINE = ReplacingMergeTree(modified_at)
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);

CREATE TABLE sessions
(
    id UUID,
    created_at DateTime('UTC'),
    modified_at DateTime('UTC'),
    property_id Int,
    ...
) ENGINE = ReplacingMergeTree(modified_at)
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);


CREATE TABLE events_wide
(
    id UUID,
    created_at DateTime('UTC'),
    property_id Int,
    page_view_id Nullable(UUID),
    session_id Nullable(UUID),
    ...
    -- page_views columns
    p_created_at DateTime('UTC'),
    p_modified_at DateTime('UTC'),
    ...
    -- sessions columns
    s_created_at DateTime('UTC'),
    s_modified_at DateTime('UTC'),
    ...
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, created_at)
ORDER BY (property_id, created_at, id);


CREATE MATERIALIZED VIEW events_mv TO events_wide AS
SELECT
    e.id AS id,
    e.created_at AS created_at,
    e.session_id AS session_id,
    e.property_id AS property_id,
    e.page_view_id AS page_view_id,
    ...
    -- page_views columns
    p.created_at AS p_created_at,
    p.modified_at AS p_modified_at,
    ...
    -- sessions columns
    s.created_at AS s_created_at,
    s.modified_at AS s_modified_at,
    ...
FROM events AS e
LEFT JOIN page_views AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
LEFT JOIN sessions AS s ON s.property_id = e.property_id AND s.id = e.session_id
SETTINGS join_algorithm = 'partial_merge';

dec*_*cay 7

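ClickHouse does not have a proper optimizer for this, so you need to filter the right-hand table of the join yourself before the join is performed. Otherwise the full right table is pushed into memory to perform the join, which causes the problem you are seeing.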

Using the example you provided:

WITH events_block AS (
    SELECT * FROM events LIMIT 3
)
SELECT e.*, p.* FROM events_block AS e
LEFT JOIN (
    SELECT * FROM page_views
    WHERE (property_id, id) IN (
        SELECT property_id, page_view_id FROM events_block
    )
) AS p ON p.property_id = e.property_id AND p.id = e.page_view_id;

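This may look odd if you think of it as a single join operation, but materialized views are processed block by block: inside the view, events stands for the block of rows being inserted, so the IN subquery only contains the keys from that block. This prevents the full right table from being moved into memory on every insert.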
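One way to check the effect is to compare the peak memory of the unfiltered and the filtered query in system.query_log. The query below is only a sketch: the one-hour window and the ILIKE pattern are assumptions chosen for illustration, and on a multi-replica service the log is kept per replica, so you may have to query each replica (or use clusterAllReplicas) to find your queries.

```sql
-- Sketch: list recent queries that touched page_views, with their peak memory.
-- The time window and the text filter are illustrative assumptions.
SELECT
    event_time,
    type,                                          -- finished vs. failed queries
    query_duration_ms,
    formatReadableSize(memory_usage) AS peak_memory,
    substring(query, 1, 80) AS query_start
FROM system.query_log
WHERE type IN ('QueryFinish', 'ExceptionWhileProcessing')
  AND query ILIKE '%page_views%'
  AND event_time > now() - INTERVAL 1 HOUR
ORDER BY event_time DESC
LIMIT 10;
```

If the filtering works as intended, the filtered query should show a peak_memory far below the limit that the plain LEFT JOIN ran into.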

So rewriting the materialized view as follows should solve the problem:

CREATE MATERIALIZED VIEW events_mv TO events_wide AS
SELECT
    e.id AS id,
    e.created_at AS created_at,
    e.session_id AS session_id,
    e.property_id AS property_id,
    e.page_view_id AS page_view_id,
    ...
    -- page_views columns
    p.created_at AS p_created_at,
    p.modified_at AS p_modified_at,
    ...
    -- sessions columns
    s.created_at AS s_created_at,
    s.modified_at AS s_modified_at,
    ...
FROM events AS e
LEFT JOIN (
    SELECT * FROM page_views
    WHERE (property_id, id) IN (
        SELECT property_id, page_view_id FROM events
    )
) AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
LEFT JOIN (
    SELECT * FROM sessions
    WHERE (property_id, id) IN (
        SELECT property_id, session_id FROM events
    )
) AS s ON s.property_id = e.property_id AND s.id = e.session_id
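A quick way to try the rewritten view is to insert a single event and read it back from events_wide. This is a minimal sketch, assuming the elided columns of events have defaults; the property_id value and the two UUID literals are placeholders, so replace them with a page view and a session that actually exist for that property.

```sql
-- Hypothetical smoke test: property_id 1 and both UUID literals are placeholders.
INSERT INTO events (id, created_at, property_id, page_view_id, session_id)
SELECT
    generateUUIDv4(),
    now(),
    1,
    toUUID('00000000-0000-0000-0000-000000000001'),  -- an existing page_views.id
    toUUID('00000000-0000-0000-0000-000000000002');  -- an existing sessions.id

-- The view should have written one joined row for the event above.
SELECT id, page_view_id, p_created_at, session_id, s_created_at
FROM events_wide
WHERE property_id = 1
ORDER BY created_at DESC
LIMIT 1;
```

Note that the view only fires for rows inserted after it was created; rows already in events are not copied into events_wide by it.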