UNION ALL across all tables whose names start with a given string

fma*_*arm 2 snowflake-cloud-data-platform

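I want to combine tables whose names start with the same string into a single table. For example, suppose I have a database containing the tables "EXT_ABVD", "EXT_ADAD", "EXT_AVSA", and "OTHER", and I want to union all tables that start with "EXT_". The result I want is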

select col1 ,col2 from EXT_ABVD
union all
select col1 ,col2 from EXT_ADAD
union all
select col1 ,col2 from EXT_AVSA;

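I want to run this on a regular basis (for example, daily), and each run may find new tables that start with "EXT_". I don't want to update the UNION ALL query by hand.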

I am new to Snowflake and don't know how to do this. Can I use scripting in Snowflake?

Kar*_*nka 5

Given these tables:

CREATE TABLE TEST_DB.PUBLIC.EXT_ABVD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAD (col1 INTEGER, col2 INTEGER);
CREATE TABLE TEST_DB.PUBLIC.EXT_ADAQ (col1 INTEGER, col2 INTEGER);

A view like this can be created dynamically:

CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS 
SELECT * FROM TEST_DB.PUBLIC.EXT_ABVD
 UNION ALL 
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAD
 UNION ALL 
SELECT * FROM TEST_DB.PUBLIC.EXT_ADAQ

using this procedure:

create or replace procedure TEST_DB.PUBLIC.CREATE_UNION_VIEW(TBL_PREFIX VARCHAR)
  returns VARCHAR -- return final create statement
  language javascript
  as     
  $$
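    // build query to get matching base tables from information_schema;
    // restricted to the PUBLIC schema because the view below selects from TEST_DB.PUBLIC,
    // and the prefix is passed as a bind variable instead of being concatenated into the SQL
    var get_tables_sql = "SELECT TABLE_NAME FROM TEST_DB.INFORMATION_SCHEMA.TABLES \
            WHERE TABLE_SCHEMA = 'PUBLIC' AND TABLE_TYPE = 'BASE TABLE' \
            AND TABLE_NAME LIKE ? ORDER BY TABLE_NAME";

    var get_tables_stmt = snowflake.createStatement({sqlText: get_tables_sql, binds: [TBL_PREFIX + "%"]});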

    // get result set containing all table names
    var tables = get_tables_stmt.execute();

    // to control if UNION ALL should be added or not
    // this could likely be handled more elegantly but i don't know JavaScript :)
    var row_count = get_tables_stmt.getRowCount();
    var rows_iterated = 0; 

    // nothing to union: return early instead of issuing an incomplete CREATE VIEW
    if (row_count === 0) {
        return "No tables found with prefix " + TBL_PREFIX;
    }

    // define view name
    var create_sql = "CREATE OR REPLACE VIEW TEST_DB.PUBLIC.union_view AS \n";

    // loop over result set to build statement
    while (tables.next()) {
        rows_iterated += 1;

        // we get values from the first (and only) column in the result set
        var table_name = tables.getColumnValue(1);

        // this will obviously fail if the column count doesn't match
        create_sql += "SELECT * FROM TEST_DB.PUBLIC." + table_name;

        // add union all to all but last row
        if (rows_iterated < row_count) {
            create_sql += "\n UNION ALL \n";
        }
    }

    // create the view
    var create_stmt = snowflake.createStatement({sqlText: create_sql});
    create_stmt.execute();

    // return the create statement as text
    return create_stmt.getSqlText();
  $$
  ;

We would call it like this: CALL CREATE_UNION_VIEW('EXT_A');
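
One thing to watch with a prefix such as 'EXT_': in a LIKE pattern, "_" matches any single character and "%" matches any sequence, so the pattern can pick up more tables than intended. Below is a minimal sketch of escaping the prefix before appending the trailing "%". The helper name likePrefixPattern and the choice of "^" as the escape character are my own assumptions, not part of the procedure above.

```javascript
// Hypothetical helper: turn a literal table-name prefix into a LIKE pattern.
// "^" is an arbitrary escape character chosen for this sketch.
var LIKE_ESCAPE = "^";

function likePrefixPattern(prefix) {
  // prefix every LIKE wildcard (and the escape character itself) with the escape character
  var escaped = String(prefix).replace(/[\^%_]/g, function (ch) {
    return LIKE_ESCAPE + ch;
  });
  // the only unescaped wildcard left is the trailing "%"
  return escaped + "%";
}

console.log(likePrefixPattern("EXT_")); // EXT^_%
```

For this to take effect, the query would presumably need a matching escape clause, along the lines of TABLE_NAME LIKE ? ESCAPE '^', with the pattern passed as the bind value.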

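This is just a basic example, so you may need to add logic for column counts, schemas, and so on. But given this, I think you will be able to figure out how to work with result sets, parameters, and statements.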
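
As one possible direction for the column-count issue, here is a sketch that selects an explicit column list instead of SELECT * and joins the per-table SELECTs with UNION ALL, which also removes the need for the row counter. The functions quoteIdent and buildUnionViewSql are hypothetical names of mine, written as plain JavaScript so the string-building part can be tried outside Snowflake. The column names are assumed to be given exactly as stored (uppercase for unquoted identifiers).

```javascript
// Hypothetical helpers, not part of the original procedure.

// Double-quote an identifier, doubling any embedded double quotes.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Build the CREATE VIEW text from a list of table names and a shared column list.
function buildUnionViewSql(viewName, tableNames, columns) {
  if (tableNames.length === 0) {
    throw new Error("no tables to union");
  }
  var columnList = columns.map(quoteIdent).join(", ");
  var selects = tableNames.map(function (t) {
    // schema is hard-coded to TEST_DB.PUBLIC as in the example above
    return "SELECT " + columnList + " FROM TEST_DB.PUBLIC." + quoteIdent(t);
  });
  // join() puts UNION ALL between the SELECTs only, never after the last one
  return "CREATE OR REPLACE VIEW " + viewName + " AS\n" + selects.join("\nUNION ALL\n");
}

console.log(buildUnionViewSql(
  "TEST_DB.PUBLIC.union_view",
  ["EXT_ABVD", "EXT_ADAD"],
  ["COL1", "COL2"]
));
```

Inside the procedure, the table names would come from the result set loop and the resulting text would be passed to snowflake.createStatement as before.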

Edit: see here for how to set up a task that runs the procedure every day. In this case, the most basic version would look like this:

create or replace task create_union_task
  warehouse = COMPUTE_WH
  schedule = '1440 minute' -- once every day
as
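  CALL CREATE_UNION_VIEW('EXT_A');

-- a newly created task is suspended; resume it so the schedule takes effect
alter task create_union_task resume;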