Mar*_*tin 19 query scripting dynamic-sql
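Disclaimer: Please bear with me, as someone who works with databases for only a small fraction of his working time. (Most of the time I do C++ programming at work, but every odd month I need to search/fix/add something in an Oracle database.)
I have repeatedly needed to write complex SQL queries, both ad-hoc ones and ones built into applications, where large parts of the query were just repeated "code".
Writing such abominations in a traditional programming language would get you into deep trouble, yet I have not been able to find any decent technique for preventing code repetition in SQL queries.
Edit: First, I want to thank the answerers who provided excellent improvements to my original example. However, this question is not about my example. It is about repetitiveness in SQL queries. The answers so far (JackP, Leigh) do a great job of showing that you can reduce repetitiveness by writing better queries. Even then, however, you are left with some repetitiveness that apparently cannot be removed, and this has always nagged me about SQL. In "traditional" programming languages I can refactor quite a lot to minimize repetition in the code, but with SQL there seem to be (?) no tools that allow for this, short of writing a less repetitive statement to begin with.
Note that I have removed the Oracle tag again, as I would genuinely like to know whether there is any database or scripting language that allows for more.
Here is one such gem that I cobbled together today. It basically reports the differences in a set of columns of a single table. Please skim the following code, especially the large query at the end. I continue below.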
--
-- Create Table to test queries
--
CREATE TABLE TEST_ATTRIBS (
id NUMBER PRIMARY KEY,
name VARCHAR2(300) UNIQUE,
attr1 VARCHAR2(2000),
attr2 VARCHAR2(2000),
attr3 INTEGER,
attr4 NUMBER,
attr5 VARCHAR2(2000)
);
--
-- insert some test data
--
insert into TEST_ATTRIBS values ( 1, 'Alfred', 'a', 'Foobar', 33, 44, 'e');
insert into TEST_ATTRIBS values ( 2, 'Batman', 'b', 'Foobar', 66, 44, 'e');
insert into TEST_ATTRIBS values ( 3, 'Chris', 'c', 'Foobar', 99, 44, 'e');
insert into TEST_ATTRIBS values ( 4, 'Dorothee', 'd', 'Foobar', 33, 44, 'e');
insert into TEST_ATTRIBS values ( 5, 'Emilia', 'e', 'Barfoo', 66, 44, 'e');
insert into TEST_ATTRIBS values ( 6, 'Francis', 'f', 'Barfoo', 99, 44, 'e');
insert into TEST_ATTRIBS values ( 7, 'Gustav', 'g', 'Foobar', 33, 44, 'e');
insert into TEST_ATTRIBS values ( 8, 'Homer', 'h', 'Foobar', 66, 44, 'e');
insert into TEST_ATTRIBS values ( 9, 'Ingrid', 'i', 'Foobar', 99, 44, 'e');
insert into TEST_ATTRIBS values (10, 'Jason', 'j', 'Bob', 33, 44, 'e');
insert into TEST_ATTRIBS values (12, 'Konrad', 'k', 'Bob', 66, 44, 'e');
insert into TEST_ATTRIBS values (13, 'Lucas', 'l', 'Foobar', 99, 44, 'e');
insert into TEST_ATTRIBS values (14, 'DUP_Alfred', 'a', 'FOOBAR', 33, 44, 'e');
insert into TEST_ATTRIBS values (15, 'DUP_Chris', 'c', 'Foobar', 66, 44, 'e');
insert into TEST_ATTRIBS values (16, 'DUP_Dorothee', 'd', 'Foobar', 99, 44, 'e');
insert into TEST_ATTRIBS values (17, 'DUP_Gustav', 'X', 'Foobar', 33, 44, 'e');
insert into TEST_ATTRIBS values (18, 'DUP_Homer', 'h', 'Foobar', 66, 44, 'e');
insert into TEST_ATTRIBS values (19, 'DUP_Ingrid', 'Y', 'foo', 99, 44, 'e');
insert into TEST_ATTRIBS values (20, 'Martha', 'm', 'Bob', 33, 88, 'f');
-- Create comparison view
CREATE OR REPLACE VIEW TA_SELFCMP as
select
t1.id as id_1, t2.id as id_2, t1.name as name, t2.name as name_dup,
t1.attr1 as attr1_1, t1.attr2 as attr2_1, t1.attr3 as attr3_1, t1.attr4 as attr4_1, t1.attr5 as attr5_1,
t2.attr1 as attr1_2, t2.attr2 as attr2_2, t2.attr3 as attr3_2, t2.attr4 as attr4_2, t2.attr5 as attr5_2
from TEST_ATTRIBS t1, TEST_ATTRIBS t2
where t1.id <> t2.id
and t1.name <> t2.name
and t1.name = REPLACE(t2.name, 'DUP_', '')
;
-- NOTE THIS PIECE OF HORRIBLE CODE REPETITION --
-- Create comparison report
-- compare 1st attribute
select 'attr1' as Different,
id_1, id_2, name, name_dup,
CAST(attr1_1 AS VARCHAR2(2000)) as Val1, CAST(attr1_2 AS VARCHAR2(2000)) as Val2
from TA_SELFCMP
where attr1_1 <> attr1_2
or (attr1_1 is null and attr1_2 is not null)
or (attr1_1 is not null and attr1_2 is null)
union
-- compare 2nd attribute
select 'attr2' as Different,
id_1, id_2, name, name_dup,
CAST(attr2_1 AS VARCHAR2(2000)) as Val1, CAST(attr2_2 AS VARCHAR2(2000)) as Val2
from TA_SELFCMP
where attr2_1 <> attr2_2
or (attr2_1 is null and attr2_2 is not null)
or (attr2_1 is not null and attr2_2 is null)
union
-- compare 3rd attribute
select 'attr3' as Different,
id_1, id_2, name, name_dup,
CAST(attr3_1 AS VARCHAR2(2000)) as Val1, CAST(attr3_2 AS VARCHAR2(2000)) as Val2
from TA_SELFCMP
where attr3_1 <> attr3_2
or (attr3_1 is null and attr3_2 is not null)
or (attr3_1 is not null and attr3_2 is null)
union
-- compare 4th attribute
select 'attr4' as Different,
id_1, id_2, name, name_dup,
CAST(attr4_1 AS VARCHAR2(2000)) as Val1, CAST(attr4_2 AS VARCHAR2(2000)) as Val2
from TA_SELFCMP
where attr4_1 <> attr4_2
or (attr4_1 is null and attr4_2 is not null)
or (attr4_1 is not null and attr4_2 is null)
union
-- compare 5th attribute
select 'attr5' as Different,
id_1, id_2, name, name_dup,
CAST(attr5_1 AS VARCHAR2(2000)) as Val1, CAST(attr5_2 AS VARCHAR2(2000)) as Val2
from TA_SELFCMP
where attr5_1 <> attr5_2
or (attr5_1 is null and attr5_2 is not null)
or (attr5_1 is not null and attr5_2 is null)
;
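As you can see, the query that generates the "difference report" uses the same SQL SELECT block 5 times (it could easily be 42 times!). This strikes me as absolutely brain-dead (I am allowed to say that; after all, I wrote the code), but I have not been able to find any good solution.
If this were a query in some actual application code, I could write a function that pieces the query together as a string and then execute that string.
Or, if it were used from PL/SQL or something similar, I guess there are procedural means of making this query more maintainable.
If the query is needed as a view in the database, then, as far as I understand, there is no way around actually maintaining the view definition as I posted it above. (!!?)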
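As a rough illustration of the string-building idea (a sketch, not code from a real application): the PL/SQL block below assembles the same report from a list of column names and prints it. The collection type name_list and the variables cols and stmt are made up for this example. It assumes the TA_SELFCMP view above exists and that server output is enabled (for example with SET SERVEROUTPUT ON in SQL*Plus).

```sql
-- Sketch only: generate the comparison report from a list of column names.
-- name_list, cols and stmt are illustrative names, not part of any schema.
DECLARE
  TYPE name_list IS TABLE OF VARCHAR2(30);
  cols name_list := name_list('attr1', 'attr2', 'attr3', 'attr4', 'attr5');
  stmt VARCHAR2(32767);
BEGIN
  FOR i IN 1 .. cols.COUNT LOOP
    -- Separate the per-column blocks with UNION, as in the hand-written query.
    IF i > 1 THEN
      stmt := stmt || CHR(10) || 'union' || CHR(10);
    END IF;
    -- One SELECT block per column; only the column name varies.
    stmt := stmt
      || 'select ''' || cols(i) || ''' as Different, id_1, id_2, name, name_dup, '
      || 'CAST(' || cols(i) || '_1 AS VARCHAR2(2000)) as Val1, '
      || 'CAST(' || cols(i) || '_2 AS VARCHAR2(2000)) as Val2 '
      || 'from TA_SELFCMP '
      || 'where ' || cols(i) || '_1 <> ' || cols(i) || '_2 '
      || 'or (' || cols(i) || '_1 is null and ' || cols(i) || '_2 is not null) '
      || 'or (' || cols(i) || '_1 is not null and ' || cols(i) || '_2 is null)';
  END LOOP;
  -- Print the generated statement so it can be inspected before it is run.
  DBMS_OUTPUT.PUT_LINE(stmt);
END;
/
```

The printed text could be run as is, or prefixed with a CREATE OR REPLACE VIEW ... AS clause and passed to EXECUTE IMMEDIATE. The price is that the generator has to be rerun whenever the column list changes, and the view text itself is still just as repetitive.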
So, as the title says: what techniques are there to avoid having to write such abominations?
Jac*_*las 13
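You are too modest: your SQL is well written and concise, given the task you are undertaking. A few pointers:
t1.name <> t2.name is always true when t1.name = REPLACE(t2.name, 'DUP_', '') holds for two different rows (names are unique and t1.id <> t2.id is already required), so you can drop the former.
You usually want union all. union means union all followed by removing duplicates. It might make no difference in this case, but always using union all unless you explicitly want to remove duplicates is a good habit.
If you are willing to have the numeric comparisons happen after casting to varchar, the following might be worth considering: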
create view test_attribs_cast as
select id, name, attr1, attr2, cast(attr3 as varchar(2000)) as attr3,
cast(attr4 as varchar(2000)) as attr4, attr5
from test_attribs;
create view test_attribs_unpivot as
select id, name, 1 as attr#, attr1 as attr from test_attribs_cast union all
select id, name, 2, attr2 from test_attribs_cast union all
select id, name, 3, attr3 from test_attribs_cast union all
select id, name, 4, attr4 from test_attribs_cast union all
select id, name, 5, attr5 from test_attribs_cast;
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from test_attribs_unpivot t1 join test_attribs_unpivot t2 on(
t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);
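The second view is a kind of unpivot operation. If you are on at least 11g, you can do this more concisely with the unpivot clause (the original answer links to an example).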
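For reference, a minimal sketch of what the unpivot version of that view might look like, assuming Oracle 11g or later and the test_attribs_cast view defined above. INCLUDE NULLS is used deliberately: by default UNPIVOT drops rows whose value is NULL, which would silently disable the NULL checks in the comparison query.

```sql
-- Sketch, assuming Oracle 11g+ and the test_attribs_cast view from above.
-- All unpivoted columns must share one datatype, which the cast view ensures.
create or replace view test_attribs_unpivot as
select id, name, attr#, attr
from test_attribs_cast
unpivot include nulls (
  attr for attr# in (attr1 as 1, attr2 as 2, attr3 as 3, attr4 as 4, attr5 as 5)
);
```

The comparison query above should then work against this view without changes, since it exposes the same four columns (id, name, attr#, attr).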
- Edit -
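To answer the more general aspect of the question: there are techniques for reducing repetition in SQL, including views, common table expressions (CTEs), and database features such as unpivot and decode, all of which appear in this thread.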
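But you cannot carry OO ideas over directly into the SQL world. In many cases repetition is fine if the query is readable and well written, and it would be unwise to resort to dynamic SQL (for example) just to avoid repetition.
The final query, including Leigh's suggested change and a CTE instead of a view, could look something like this: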
with t as ( select id, name, attr#,
decode(attr#,1,attr1,2,attr2,3,attr3,4,attr4,attr5) attr
from test_attribs
cross join (select rownum attr# from dual connect by rownum<=5))
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from t t1 join t t2
on( t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);
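One more place where repetition can be trimmed, offered as a sketch and not as part of the answers in this thread: Oracle's DECODE treats two NULLs as equal, so the three-part "different or exactly one is NULL" predicate can be collapsed into a single DECODE comparison. The query below is otherwise the CTE version, joined to itself.

```sql
-- Sketch: same CTE as above, with a NULL-safe inequality via DECODE.
with t as ( select id, name, attr#,
                   decode(attr#,1,attr1,2,attr2,3,attr3,4,attr4,attr5) attr
            from test_attribs
            cross join (select rownum attr# from dual connect by rownum<=5))
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
       t2.name as name_dup, t1.attr as val1, t2.attr as val2
from t t1 join t t2
       on( t1.id<>t2.id and
           t1.name = replace(t2.name, 'DUP_', '') and
           t1.attr#=t2.attr# )
-- decode returns 0 when both values are equal or both are NULL, else 1
where decode(t1.attr, t2.attr, 0, 1) = 1;
```

Whether this reads better than the explicit IS NULL checks is a matter of taste. It is shorter, but it relies on Oracle-specific behaviour that a reader may not know.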
Here is an alternative to the test_attribs_unpivot view provided by JackPDouglas (+1). It works in versions prior to 11g and performs fewer full table scans:
CREATE OR REPLACE VIEW test_attribs_unpivot AS
SELECT ID, Name, MyRow Attr#, CAST(
DECODE(MyRow,1,attr1,2,attr2,3,attr3,4,attr4,attr5) AS VARCHAR2(2000)) attr
FROM TEST_ATTRIBS
CROSS JOIN (SELECT level MyRow FROM dual connect by level<=5);
His final query can be used unchanged with this view.