ClickHouse join with a condition

ogb*_*jnr 0 sql clickhouse

I found something strange. The query:

SELECT *
FROM progress as pp
ALL LEFT JOIN links as ll USING (viewId)
WHERE viewId = 'a776a2f2-16ad-448a-858d-891e68bec9a8' 

Result: 0 rows in set. Elapsed: 5.267 sec. Processed 8.62 million rows, 484.94 MB (1.64 million rows/s., 92.08 MB/s.)

Here is the modified query:

SELECT *
FROM
  (SELECT *
   FROM progress
   WHERE viewId = 'a776a2f2-16ad-448a-858d-891e68bec9a8') AS p ALL
LEFT JOIN
  (SELECT *
   FROM links
   WHERE viewId = toUUID('a776a2f2-16ad-448a-858d-891e68bec9a8')) AS l ON p.viewId = l.viewId;

Result: 0 rows in set. Elapsed: 0.076 sec. Processed 4.48 million rows, 161.35 MB (58.69 million rows/s., 2.12 GB/s.)

But it looks messy.

Shouldn't the query optimizer take the WHERE condition into account?

What is the correct way to write the query here, and how should it be written when it has a WHERE clause?

Then I tried adding another join:

SELECT *
FROM
  (SELECT videoUuid AS contentUuid,
          viewId
   FROM
     (SELECT *
      FROM progress
      WHERE viewId = 'a776a2f2-16ad-448a-858d-891e68bec9a8') p ALL
   LEFT JOIN
     (SELECT *
      FROM links
      WHERE viewId = toUUID('a776a2f2-16ad-448a-858d-891e68bec9a8')) USING `viewId`) ALL
LEFT JOIN `metaInfo` USING `viewId`,
                           `contentUuid`;

Considering that I only want to join 3 tables with a condition that selects a single row, the result is again very slow:

0 rows in set. Elapsed: 1.747 sec. Processed 9.13 million rows, 726.55 MB (5.22 million rows/s., 415.85 MB/s.)

vla*_*mir 6

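Currently, CH does not handle multi-join queries (DB star schema) well, and its query optimizer is not yet good enough to rely on completely.

So you need to state explicitly how to "execute" the query, using subqueries instead of joins.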
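
Before rewriting by hand, it can help to check what the optimizer does with the WHERE condition. One option is `EXPLAIN SYNTAX`, which prints the query after ClickHouse's syntax-level rewrites. The following is a minimal sketch using the tables from the question, assuming a ClickHouse version recent enough to support the statement:

```sql
-- Show the query as ClickHouse rewrites it before execution.
-- Table and column names are taken from the question.
EXPLAIN SYNTAX
SELECT *
FROM progress AS pp
ALL LEFT JOIN links AS ll USING (viewId)
WHERE viewId = 'a776a2f2-16ad-448a-858d-891e68bec9a8';
```

If the filter on viewId shows up only in the outer WHERE and not inside either side of the join, the condition was not pushed down, and filtering each table in its own subquery is likely to help.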

Consider the test query:

SELECT table_01.number AS r
FROM numbers(87654321) AS table_01
  INNER JOIN numbers(7654321) AS table_02 ON (table_01.number = table_02.number)
  INNER JOIN numbers(654321) AS table_03 ON (table_02.number = table_03.number)
  INNER JOIN numbers(54321) AS table_04 ON (table_03.number = table_04.number)
WHERE r = 54320
/*
┌─────r─┐
│ 54320 │
└───────┘

1 rows in set. Elapsed: 6.261 sec. Processed 96.06 million rows, 768.52 MB (15.34 million rows/s., 122.74 MB/s.)
*/

Let's rewrite it using subqueries to speed it up significantly.

SELECT number AS r
FROM numbers(87654321)
WHERE r = 54320 AND number IN (
  SELECT number AS r
  FROM numbers(7654321)
  WHERE r = 54320 AND number IN (
    SELECT number AS r
    FROM numbers(654321)
    WHERE r = 54320 AND number IN (
      SELECT number AS r
      FROM numbers(54321)
      WHERE r = 54320
    )
  )
)
/*
┌─────r─┐
│ 54320 │
└───────┘

1 rows in set. Elapsed: 0.481 sec. Processed 96.06 million rows, 768.52 MB (199.69 million rows/s., 1.60 GB/s.)
*/
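
The same pattern can be sketched for the tables in the question. This is only an illustration under stated assumptions: it assumes that viewId has a comparable type in both progress and links (the question filters one with a string literal and the other with toUUID), and that only the columns of progress are needed.

```sql
-- Filter each table on viewId first, then combine with IN instead of JOIN.
-- Assumption: progress.viewId and links.viewId have comparable types.
SELECT *
FROM progress
WHERE viewId = 'a776a2f2-16ad-448a-858d-891e68bec9a8'
  AND viewId IN (
    SELECT viewId
    FROM links
    WHERE viewId = toUUID('a776a2f2-16ad-448a-858d-891e68bec9a8')
  );
```

Note that this is a semi-join and not a drop-in replacement for the LEFT JOIN: it returns no columns from links and drops progress rows that have no match in links. When columns from both tables are needed, pre-filtering both sides in subqueries, as in the question's modified query, keeps the join semantics.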

There are other ways to optimize JOIN as well.


Some useful references:

Altinity webinar: Tips and Tricks Every ClickHouse User Should Know

Altinity webinar: Secrets of ClickHouse Query Performance