Posts by nag*_*ish

Macro not recognized in dbt

{{ 
    config (
        pre_hook = before_begin("{{audit_tbl_insert(1,'stg_news_sentiment_analysis_incr') }}"),
        post_hook = after_commit("{{audit_tbl_update(1,'stg_news_sentiment_analysis_incr','dbt_development','news_sentiment_analysis') }}")
        )
}}

select rd.news_id ,rd.title, rd.description, ns.sentiment from live_crawler_output_rss.rss_data rd 
left join 
live_crawler_output_rss.news_sentiment ns 
on rd.news_id = ns.data_id limit 10000;

This is my model in dbt. It is configured with pre- and post-hooks that call macros to insert into and update an audit table.

My macro

{ % macro audit_tbl_insert (model_id_no, model_name_txt) % }

{% set run_id_value = var('run_id') %}

insert into {{audit_schema_name}}.{{audit_table_name}} (run_id, model_id, model_name, status, start_time, last_updated_at)
values 
({{run_id_value}}::bigint,{{model_id_no}}::bigint,{{model_name_txt}},'STARTED',current_timestamp,current_timestamp)

{% endmacro %}


This is my first time using this macro, and I am seeing the following error.

Compilation Error in model stg_news_sentiment_analysis_incr 
(models/staging/stg_news_sentiment_analysis_incr.sql)
'audit_tbl_insert' is undefined in macro run_hooks (macros/materializations/hooks.sql) 
called by macro …
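
One possible cause, assuming the spaces in `{ % macro ... % }` are really present in the macro file and were not introduced when the post was copied: Jinja block tags must open with `{%` and close with `%}`, so a tag written as `{ %` is treated as plain text and the macro is never defined. The sketch below is a small standard-library check for that pattern. The helper name `find_malformed_tags` is hypothetical, and the sample source is a shortened copy of the macro above.

```python
import re

# Jinja statement blocks open with "{%" and close with "%}".
# Whitespace between the brace and the percent sign breaks the delimiter.
MALFORMED_TAG = re.compile(r"\{\s+%|%\s+\}")


def find_malformed_tags(source):
    """Return (line_number, line) pairs that contain a split block delimiter."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if MALFORMED_TAG.search(line)
    ]


# Shortened copy of the macro from the question (assumed to match the real file).
macro_source = """{ % macro audit_tbl_insert (model_id_no, model_name_txt) % }
{% set run_id_value = var('run_id') %}
{% endmacro %}
"""

for number, line in find_malformed_tags(macro_source):
    print(f"line {number}: {line}")
```

If the check flags the opening line, rewriting it as `{% macro audit_tbl_insert(model_id_no, model_name_txt) %}` would be the first thing to try. The macro body also references `audit_schema_name` and `audit_table_name` without defining them, which may need attention separately.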

python etl dbt

4 votes
1 answer
6809 views

How to create a schema for nested JSON columns in PySpark?

I have a Parquet file with multiple columns. Two of them hold JSON/struct data, but their type is string. Each array can contain any number of array_element entries.

{
  "addressline": [
    {
      "array_element": "F748DK’8U1P9’2ZLKXE"
    },
    {
      "array_element": "’O’P0BQ04M-"
    },
    {
      "array_element": "’fvrvrWEM-"
    }
  ],
  "telephone": [
    {
      "array_element": {
        "locationtype": "8.PLT",
        "countrycode": null,
        "phonenumber": "000000000",
        "phonetechtype": "1.PTT",
        "countryaccesscode": null,
        "phoneremark": null
      }
    }
  ]
}

How can I create a schema to handle these columns in PySpark?
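
One way to approach this, sketched under assumptions, is to derive a Spark DDL schema string from a single representative document using only the standard library. The helper names `spark_ddl_type` and `spark_ddl_schema` are hypothetical. Null values are assumed to be strings, since null carries no type information, and every array is assumed to hold elements of the same shape as its first one.

```python
import json


def spark_ddl_type(value):
    """Map a parsed JSON value to a Spark SQL DDL type string."""
    if isinstance(value, dict):
        fields = ", ".join(f"{k}: {spark_ddl_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    if isinstance(value, list):
        # Assumption: all elements share the shape of the first element.
        element = spark_ddl_type(value[0]) if value else "STRING"
        return f"ARRAY<{element}>"
    if isinstance(value, bool):  # checked before int, since bool subclasses int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    # Strings and nulls both fall back to STRING (assumption for nulls).
    return "STRING"


def spark_ddl_schema(sample_json):
    """Build a top-level 'name TYPE, ...' schema from one sample document."""
    doc = json.loads(sample_json)
    return ", ".join(f"{k} {spark_ddl_type(v)}" for k, v in doc.items())


# Trimmed copy of the sample from the question.
sample = """
{
  "addressline": [{"array_element": "F748DK’8U1P9’2ZLKXE"},
                  {"array_element": "’O’P0BQ04M-"}],
  "telephone": [{"array_element": {
      "locationtype": "8.PLT", "countrycode": null,
      "phonenumber": "000000000", "phonetechtype": "1.PTT",
      "countryaccesscode": null, "phoneremark": null}}]
}
"""

ddl = spark_ddl_schema(sample)
print(ddl)
```

The resulting string can be passed as the schema argument of `pyspark.sql.functions.from_json`, for example `df.withColumn("parsed", from_json(col("address"), ddl))`, where `address` is a placeholder for the real column name. Because the arrays are declared as `ARRAY<...>`, any number of `array_element` entries is accepted.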


schema json apache-spark pyspark pyspark-schema

3 votes
1 answer
4981 views

Tag statistics

apache-spark ×1

dbt ×1

etl ×1

json ×1

pyspark ×1

pyspark-schema ×1

python ×1

schema ×1