Tuo*_*nen 5 sql oracle performance common-table-expression
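I have to run a relatively complex query involving multiple deep joins and complex predicates, where the result (and the criteria) depend on whether suitable entries meeting the criteria exist. There are primary and secondary criteria: the primary ones are always applied, and if the result is not satisfactory, the secondary ones kick in.
In a nutshell: return N documents from distinct customers with the document types balanced as evenly as possible, but if there are not enough different document types or documents from different customers, still try to reach N documents.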
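I chose a declarative (query) approach over an imperative one (cursors and counters). This is where the WITH clause comes in. Roughly speaking, I use multiple WITH blocks (CTEs), which I like to think of as temporary views, to declare two distinct target sets for the two document types. At the end, I UNION subsets of the different CTEs into the final result and run some COUNT checks to limit the number of rows.
Multiple CTEs reference each other and are referenced from several places, for example in COUNT and NOT EXISTS contexts. I am new to SQL, stumbled upon WITH by chance, and decided to use it. Is this a proper use case for WITH, or an antipattern? How does this solution perform compared with implementing the same functionality imperatively with cursors and counters? Did I choose the wrong approach? We are talking about tables with millions of entries.
Here is the whole query. Apologies, I had to obscure the fields for confidentiality reasons.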
WITH target_documents AS (
    SELECT
        <Necessary fields>
    FROM documents l
    WHERE
        <Suitable document criteria>
),
target_documents_type_1 AS (
    SELECT * FROM target_documents WHERE type = 1
),
target_documents_type_2 AS (
    SELECT * FROM target_documents WHERE type = 2
),
target_customers AS (
    SELECT
        <Necessary fields>
    FROM customers a
    WHERE
        <Suitable customer criteria>
        AND
        EXISTS(
            SELECT 1 FROM target_documents l WHERE l.customer_id = a.customer_id
        )
),
target_customers_type_1 AS (
    SELECT * FROM target_customers a WHERE EXISTS(
        SELECT 1 FROM target_documents_type_1 l WHERE l.customer_id = a.customer_id
    )
    AND ROWNUM <= (<N> / 2)
),
target_customers_type_2 AS (
    SELECT * FROM target_customers a WHERE EXISTS(
        SELECT 1 FROM target_documents_type_2 l WHERE l.customer_id = a.customer_id
    )
    AND a.customer_id NOT IN (
        SELECT customer_id FROM target_customers_type_1
    )
    AND ROWNUM <= <N>
),
-- This is the set, which meets the primary criteria:
-- Contains only distinct customers
-- The amount of different document types is balanced as much as possible
different_customers_set AS (
    SELECT
        <Necessary fields>
    FROM target_customers_type_1 a -- rows 0--(<N>/2) amount
    JOIN target_documents_type_1 l ON (l.customer_id = a.customer_id)
    WHERE
        l.create_dt = (SELECT MAX(create_dt) FROM target_documents_type_1 WHERE customer_id = l.customer_id)
    UNION ALL
    SELECT
        <Necessary fields>
    FROM target_customers_type_2 a -- rows 0--<N> amount
    JOIN target_documents_type_2 l ON (l.customer_id = a.customer_id)
    WHERE
        l.create_dt = (SELECT MAX(create_dt) FROM target_documents_type_2 WHERE customer_id = l.customer_id) AND
        ROWNUM <= <N> - (SELECT COUNT(*) FROM target_customers_type_1) -- Limit the total to max N rows
)
-- Final result: primary criteria result filled with the result of secondary criteria
SELECT * FROM different_customers_set
UNION ALL
SELECT
    <Necessary fields>
FROM target_customers a
JOIN target_documents l ON (l.customer_id = a.customer_id AND l.document_id NOT IN (SELECT document_id FROM different_customers_set))
WHERE
    ROWNUM <= <N> - (SELECT COUNT(1) FROM different_customers_set);
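Is this a correct use of the WITH clause? Are there any obvious performance problems, and where should I refactor? Or should I do all of this imperatively? In addition, the query itself defines a cursor that is opened repeatedly inside a loop (the loop supplies certain criteria for the customers).
I am especially concerned about how the optimizer handles these WITH blocks. Is the most efficient plan always used, so that there is no performance loss compared with using cursors?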
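One way to see what the optimizer does with a WITH block, rather than guess, is to look at the execution plan. The following is only a minimal sketch: it assumes a `documents` table with `customer_id` and `type` columns, as in the query above, and replaces the obscured criteria with a trivial query that references the same CTE twice.

```sql
-- Ask Oracle to record the plan for a query that reuses one CTE twice.
EXPLAIN PLAN FOR
WITH target_documents AS (
    SELECT customer_id, type
    FROM documents              -- real filter criteria omitted (assumption)
)
SELECT COUNT(*)
FROM target_documents t1
WHERE EXISTS (
    SELECT 1
    FROM target_documents t2
    WHERE t2.customer_id = t1.customer_id
      AND t2.type = 2
);

-- Show the plan that was just recorded.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan output contains a TEMP TABLE TRANSFORMATION step, the CTE was materialized once and reused; if not, it was merged into the main query like an inline view. Which of the two is cheaper depends on the data, so it is worth comparing the plans for the real query.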
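The target_customers_type_2 CTE references itself (it should probably reference target_documents_type_2). Filtering on ROWNUM should be avoided like the plague: it produces "random" results and is very hard to test and debug.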
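To illustrate the ROWNUM point: ROWNUM is assigned as rows are fetched, so without a fixed order the "first N/2 customers" can differ from one run to the next. A minimal sketch of a deterministic alternative follows, assuming that ordering by customer_id is acceptable. The sample rows and the limit of 2 are hypothetical stand-ins for the real target_customers CTE and for `<N> / 2`.

```sql
-- Hypothetical sample data standing in for the target_customers CTE.
WITH target_customers AS (
    SELECT 30 AS customer_id FROM dual UNION ALL
    SELECT 10 FROM dual UNION ALL
    SELECT 20 FROM dual UNION ALL
    SELECT 40 FROM dual
),
-- Number the rows in an explicit, repeatable order.
ranked_customers AS (
    SELECT customer_id,
           ROW_NUMBER() OVER (ORDER BY customer_id) AS rn
    FROM target_customers
)
SELECT customer_id
FROM ranked_customers
WHERE rn <= 2                   -- stands in for <N> / 2
ORDER BY customer_id;
```

Because the row numbers come from an explicit ORDER BY, this returns customers 10 and 20 on every run, which makes the result reproducible in tests. Any other stable ordering column would work the same way.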