自动生成的Python dbt测试似乎被硬编码失败

hot*_*oup 5 dbt

我有一个 Python DBT 项目,它定义了以下数据模型(通过 YAML):

version: 2
models:
- name: company
  description: ''
  columns:
  - name: ID
    description: Unique identifier for the company entity. It is the company name
    tests:
      - unique
      - not_null
      - dbt_expectations.expect_column_values_to_be_of_type:
          column_type: VARCHAR
  - name: REMOTE_ID
    description: Company name
    tests:
      - dbt_expectations.expect_column_values_to_be_of_type:
          column_type: VARCHAR
  - name: CREATED_AT
    description: ''
    tests:
      - dbt_expectations.expect_column_values_to_be_of_type:
          column_type: TIMESTAMP_TZ
      - not_null
  - name: UPDATED_AT
    description: ''
    tests:
      - dbt_expectations.expect_column_values_to_be_of_type:
          column_type: TIMESTAMP_TZ
Run Code Online (Sandbox Code Playgroud)

当我运行时dbt compile,然后dbt rundbt test得到 DBT 测试失败并显示以下控制台输出:

17:59:30  Failure in test dbt_expectations_expect_column_values_to_be_of_type_company_CREATED_AT__TIMESTAMP_TZ (models/fizzbuzz/domain/company.yml)
17:59:30    Got 1 result, configured to fail if != 0
17:59:30  
17:59:30    compiled Code at target/compiled/myapp/models/fizzbuzz/domain/company.yml/dbt_expectations_expect_column_837854ce21e79ee1abe42f69891c68e9.sql
Run Code Online (Sandbox Code Playgroud)

当我查看已编译的 SQL 时,dbt_expectations_expect_column_837854ce21e79ee1abe42f69891c68e9.sql我看到的是:

WITH relation_columns
     AS (SELECT Cast('ID' AS VARCHAR)      AS relation_column,
                Cast('VARCHAR' AS VARCHAR) AS relation_column_type
         UNION ALL
         SELECT Cast('REMOTE_ID' AS VARCHAR) AS relation_column,
                Cast('VARCHAR' AS VARCHAR)   AS relation_column_type
         UNION ALL
         SELECT Cast('CREATED_AT' AS VARCHAR)    AS relation_column,
                Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type
         UNION ALL
         SELECT Cast('UPDATED_AT' AS VARCHAR)    AS relation_column,
                Cast('TIMESTAMP_NTZ' AS VARCHAR) AS relation_column_type),
     test_data
     AS (SELECT *
         FROM   relation_columns
         WHERE  relation_column = 'CREATED_AT'
                AND relation_column_type NOT IN ( 'TIMESTAMP_TZ' ))
SELECT *
FROM   test_data 
Run Code Online (Sandbox Code Playgroud)

所以我的理解是,当我运行时,dbt compile它会根据编译的 YAML 文件定义生成上面的测试。当我随后运行 时dbt test,它会运行此 SQL 来确定该 YAML 中定义的测试是否通过或失败。

如果这是正确的,那么 DBT 是否存在错误?是我还是这不是一个被硬编码总是失败的假阴性测试?!

它正在创建一个名为的临时表/CTE,test_data它绝不以任何方式、形状或形式连接到我的company数据库表,但有 2 列(relation_columnrelation_column_type)。然后它给这个表一行,其relation_column值为“CREATED_AT”,其relation_column_type值为“TIMESTAMP_NTZ”。最后它返回任何relation_column值为“CREATED_AT”但其relation_column_type不是“TIMESTAMP_NTZ”的记录......所以这不会总是失败吗?!

或者我是否遗漏了有关 DBT 及其测试如何工作的重要信息?感谢您的任何澄清。

hot*_*oup -1

调查发现,DBT 社区不再支持该dbt test命令,也不再支持 DBT 期望分支项目,可能就是出于这样的原因。