I have a Python DBT project that defines the following data model (via YAML):
version: 2
models:
  - name: company
    description: ''
    columns:
      - name: ID
        description: Unique identifier for the company entity. It is the company name
        tests:
          - unique
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: VARCHAR
      - name: REMOTE_ID
        description: Company name
        tests:
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: VARCHAR
      - name: CREATED_AT
        description: ''
        tests:
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: TIMESTAMP_TZ
          - not_null
      - name: UPDATED_AT
        description: ''
        tests:
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: TIMESTAMP_TZ
When I run dbt compile, then dbt run, and then dbt test, the DBT test fails with the following console output:
17:59:30 Failure in test dbt_expectations_expect_column_values_to_be_of_type_company_CREATED_AT__TIMESTAMP_TZ (models/fizzbuzz/domain/company.yml)
17:59:30 Got 1 result, configured to fail if != 0
17:59:30
17:59:30 compiled Code at target/compiled/myapp/models/fizzbuzz/domain/company.yml/dbt_expectations_expect_column_837854ce21e79ee1abe42f69891c68e9.sql
When I look at the compiled SQL in dbt_expectations_expect_column_837854ce21e79ee1abe42f69891c68e9.sql, this is what I see:
WITH relation_columns
     AS (SELECT Cast('ID' AS VARCHAR)      AS relation_column,
                Cast('VARCHAR' AS VARCHAR) AS relation_column_type
         UNION ALL
         SELECT Cast('REMOTE_ID' AS VARCHAR) AS relation_column,
                Cast('VARCHAR' AS VARCHAR)   AS relation_column_type
         UNION ALL
         SELECT Cast('CREATED_AT' AS VARCHAR)    AS relation_column,
                Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type
         UNION ALL
         SELECT Cast('UPDATED_AT' AS VARCHAR)    AS relation_column,
                Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type),
     test_data
     AS (SELECT *
         FROM   relation_columns
         WHERE  relation_column = 'CREATED_AT'
                AND relation_column_type NOT IN ( 'TIMESTAMP_TZ' ))
SELECT *
FROM   test_data
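So my understanding is that when I run dbt compile, it generates the test above from the YAML file definition. When I then run dbt test, it executes this SQL to determine whether the tests defined in that YAML pass or fail.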
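To make that pass/fail rule concrete, here is a minimal sketch that runs the compiled query and applies the rule implied by the console line "Got 1 result, configured to fail if != 0". It uses an in-memory SQLite database purely as a stand-in for the real warehouse (an assumption; the query only selects literals, so no tables are needed), and `run_test` is a hypothetical helper, not a dbt function.

```python
import sqlite3

# The compiled test query from the question, copied as-is.
COMPILED_TEST_SQL = """
WITH relation_columns AS (
    SELECT Cast('ID' AS VARCHAR) AS relation_column,
           Cast('VARCHAR' AS VARCHAR) AS relation_column_type
    UNION ALL
    SELECT Cast('REMOTE_ID' AS VARCHAR) AS relation_column,
           Cast('VARCHAR' AS VARCHAR) AS relation_column_type
    UNION ALL
    SELECT Cast('CREATED_AT' AS VARCHAR) AS relation_column,
           Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type
    UNION ALL
    SELECT Cast('UPDATED_AT' AS VARCHAR) AS relation_column,
           Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type
),
test_data AS (
    SELECT *
    FROM relation_columns
    WHERE relation_column = 'CREATED_AT'
      AND relation_column_type NOT IN ('TIMESTAMP_TZ')
)
SELECT * FROM test_data
"""


def run_test(sql):
    """Hypothetical helper: run a test query and treat any returned row as a failure."""
    conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for the warehouse
    try:
        rows = conn.execute(sql).fetchall()
    finally:
        conn.close()
    return len(rows) == 0, rows


passed, rows = run_test(COMPILED_TEST_SQL)
print("PASS" if passed else f"FAIL: got {len(rows)} result(s)")
print(rows)
```

Run as written, this reports one returned row for CREATED_AT, which matches the single result in the console output.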
If that is correct, is there a bug in DBT? Is it just me, or is this a false-negative test that is hard-coded to always fail?!
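It creates a temporary table/CTE called test_data that is not connected in any way, shape, or form to my company database table, but has 2 columns (relation_column and relation_column_type). It then gives this table a row whose relation_column value is 'CREATED_AT' and whose relation_column_type value is 'TIMESTAMP_NTZ'. Finally, it returns any records whose relation_column value is 'CREATED_AT' but whose relation_column_type value is not 'TIMESTAMP_TZ'... so won't this always fail?!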
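Whether the query returns a row depends only on the type literals inlined in relation_columns. The sketch below rebuilds a query of the same shape from a dictionary of column types and compares two cases: the types exactly as they appear in the compiled file, and a hypothetical case where CREATED_AT is inlined as TIMESTAMP_TZ. It assumes (this post does not confirm it) that those literals are written into the SQL at compile time from whatever types the warehouse reports for the model. `build_type_test_sql` and `failing_rows` are made-up helpers, SQLite is only a stand-in, and the string interpolation is for illustration only.

```python
import sqlite3


def build_type_test_sql(actual_types, column, expected_type):
    """Hypothetical helper: render a query shaped like the compiled test."""
    selects = "\n    UNION ALL\n    ".join(
        f"SELECT '{name}' AS relation_column, '{col_type}' AS relation_column_type"
        for name, col_type in actual_types.items()
    )
    return f"""
WITH relation_columns AS (
    {selects}
),
test_data AS (
    SELECT *
    FROM relation_columns
    WHERE relation_column = '{column}'
      AND relation_column_type NOT IN ('{expected_type}')
)
SELECT * FROM test_data
"""


def failing_rows(sql):
    """Run the query in an in-memory SQLite database and return its rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Type literals exactly as they appear in the compiled SQL.
as_compiled = {
    "ID": "VARCHAR",
    "REMOTE_ID": "VARCHAR",
    "CREATED_AT": "TIMESTAMP_NTZ",
    "UPDATED_AT": "TIMESTAMP_NTZ",
}
# Hypothetical variant: CREATED_AT inlined as TIMESTAMP_TZ instead.
hypothetical = dict(as_compiled, CREATED_AT="TIMESTAMP_TZ")

rows_as_compiled = failing_rows(
    build_type_test_sql(as_compiled, "CREATED_AT", "TIMESTAMP_TZ")
)
rows_hypothetical = failing_rows(
    build_type_test_sql(hypothetical, "CREATED_AT", "TIMESTAMP_TZ")
)

print(rows_as_compiled)   # one row, so the test fails
print(rows_hypothetical)  # no rows, so the test passes
```

So the query shape itself is not bound to fail: with a TIMESTAMP_TZ literal in relation_columns, the same filter returns nothing.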
Or am I missing something important about DBT and how its tests work? Thanks for any clarification.