{{
config (
pre_hook = before_begin("{{audit_tbl_insert(1,'stg_news_sentiment_analysis_incr') }}"),
post_hook = after_commit("{{audit_tbl_update(1,'stg_news_sentiment_analysis_incr','dbt_development','news_sentiment_analysis') }}")
)
}}
select rd.news_id ,rd.title, rd.description, ns.sentiment from live_crawler_output_rss.rss_data rd
left join
live_crawler_output_rss.news_sentiment ns
on rd.news_id = ns.data_id limit 10000;
This is my dbt model. It is configured with pre- and post-hooks that call macros to insert into and update an audit table.
My macro:
{ % macro audit_tbl_insert (model_id_no, model_name_txt) % }
{% set run_id_value = var('run_id') %}
insert into {{audit_schema_name}}.{{audit_table_name}} (run_id, model_id, model_name, status, start_time, last_updated_at)
values
({{run_id_value}}::bigint,{{model_id_no}}::bigint,{{model_name_txt}},'STARTED',current_timestamp,current_timestamp)
{% endmacro %}
This is the first time I am using this macro, and I get the following error:
Compilation Error in model stg_news_sentiment_analysis_incr
(models/staging/stg_news_sentiment_analysis_incr.sql)
'audit_tbl_insert' is undefined in macro run_hooks (macros/materializations/hooks.sql)
called by macro …

I have a Parquet file with several columns. Two of them hold JSON/struct data, but their column type is string. Each array can contain any number of array_element entries.
{
  "addressline": [
    {
      "array_element": "F748DK’8U1P9’2ZLKXE"
    },
    {
      "array_element": "’O’P0BQ04M-"
    },
    {
      "array_element": "’fvrvrWEM-"
    }
  ],
  "telephone": [
    {
      "array_element": {
        "locationtype": "8.PLT",
        "countrycode": null,
        "phonenumber": "000000000",
        "phonetechtype": "1.PTT",
        "countryaccesscode": null,
        "phoneremark": null
      }
    }
  ]
}
How can I create a schema to handle these columns in PySpark?
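
One possible starting point, offered as a sketch rather than a definitive answer: derive a Spark DDL schema string from a single sample value, then pass it to `from_json`, which accepts a DDL-formatted string as its schema argument. The names `SAMPLE`, `ddl_type`, `SCHEMA_DDL` and `parse_json_column` are hypothetical and not part of the question. The sketch assumes every record has the same shape as the sample, that field names need no quoting, and that fields which are null in the sample (for example `countrycode`) are strings.

```python
import json

# Sample value copied from the question (one cell of a string-typed column).
SAMPLE = """
{
  "addressline": [
    {"array_element": "F748DK’8U1P9’2ZLKXE"},
    {"array_element": "’O’P0BQ04M-"}
  ],
  "telephone": [
    {"array_element": {
      "locationtype": "8.PLT", "countrycode": null,
      "phonenumber": "000000000", "phonetechtype": "1.PTT",
      "countryaccesscode": null, "phoneremark": null
    }}
  ]
}
"""


def ddl_type(value):
    """Return a Spark DDL type string describing one parsed JSON value."""
    if isinstance(value, dict):
        fields = ", ".join(f"{k}: {ddl_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    if isinstance(value, list):
        # Assumption: every element has the same shape as the first one.
        element = ddl_type(value[0]) if value else "STRING"
        return f"ARRAY<{element}>"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    # Strings, plus nulls whose real type the sample does not reveal.
    return "STRING"


SCHEMA_DDL = ddl_type(json.loads(SAMPLE))


def parse_json_column(df, column):
    """Replace a JSON string column with its parsed struct (requires pyspark)."""
    from pyspark.sql import functions as F  # imported lazily on purpose

    return df.withColumn(column, F.from_json(F.col(column), SCHEMA_DDL))
```

Because the arrays are typed as `ARRAY<...>`, the number of `array_element` entries per row does not need to be known in advance. If the two columns have different shapes, a separate schema string would be needed for each, built from a sample of that column.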