How to implement a Python UDF in dbt

lee*_*lee 5 python snowflake-cloud-data-platform dbt

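I need some help applying a Python UDF in my dbt models. I successfully created a Python function in Snowflake (DWH) and ran it against a table. It works as expected, but implementing it in dbt has proven difficult. Any advice, help, or direction would be appreciated.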

This is the Python UDF I created in Snowflake:

create or replace function "077"."Unity".sha3_512(str varchar)
returns varchar
language python
runtime_version = '3.8'
handler = 'hash'
as

$$
import hashlib
 
def hash(str):
    # create a sha3 hash object
    hash_sha3_512 = hashlib.new("sha3_512", str.encode())

    return hash_sha3_512.hexdigest()
$$
;

The goal is to create the Python function in dbt and apply it in the model below:

{{ config(materialized = 'view') }}

WITH SEC AS(
    SELECT 
         A."AccountID" AS AccountID,
         A."AccountName" AS AccountName , 
         A."Password" AS Passwords,
 apply function here (A."Password") As SHash
    FROM {{ ref('Green', 'Account') }} A
   )

----------------VIEW RECORD------------------------------ 

SELECT * 
FROM SEC

Is there a way to do this? Thanks.

Luk*_*zda 5

Assuming the UDF already exists in Snowflake in the target schema, call it by its schema-qualified name:

{{ config(materialized = 'view') }}

WITH SEC AS(
    SELECT 
         A."AccountID" AS AccountID,
         A."AccountName" AS AccountName , 
         A."Password" AS Passwords,
         {{target.schema}}.sha3_512(A."Password") As SHash
    FROM {{ ref('Green', 'Account') }} A
   )
SELECT * 
FROM SEC

The function can be created with an on-run-start hook in dbt_project.yml:

on-run-start:
  - '{{ creating_udf()}}'

And the macro:

{% macro creating_udf() %}

create function if not exists {{target.schema}}.sha3_512(str varchar)
returns varchar
language python
runtime_version = '3.8'
handler = 'hash'
as

$$
import hashlib
 
def hash(str):
    # create a sha3 hash object
    hash_sha3_512 = hashlib.new("sha3_512", str.encode())

    return hash_sha3_512.hexdigest()
$$
;

{% endmacro %}
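Before wiring the handler into the macro, it can help to check its logic locally. The following is only a sketch in plain Python, run outside Snowflake: the name sha3_512_hex and the None handling are assumptions for illustration, not part of the original UDF. It assumes that a SQL NULL reaches the Python handler as None, in which case the original str.encode() call would raise an error.

```python
import hashlib
from typing import Optional


def sha3_512_hex(value: Optional[str]) -> Optional[str]:
    """Return the SHA3-512 hex digest of value, or None for a missing input."""
    # Assumption: a SQL NULL arrives in the handler as Python None,
    # so return None instead of calling .encode() on it.
    if value is None:
        return None
    # hashlib.sha3_512(data) is equivalent to hashlib.new("sha3_512", data).
    return hashlib.sha3_512(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    digest = sha3_512_hex("secret")
    print(len(digest))         # 128 hex characters (512 bits)
    print(sha3_512_hex(None))  # None
```

If this behavior is what you want, the body of the function could replace the hash handler inside the macro. Renaming the parameter also avoids shadowing the built-in str.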