Şev*_*man 6 python great-expectations
I am trying to use Great Expectations.
The function I want to use is expect_compound_columns_to_be_unique. Here is the code (main code - template):
import datetime
import pandas as pd
import great_expectations as ge
import great_expectations.jupyter_ux
from great_expectations.core.batch import BatchRequest
from great_expectations.checkpoint import SimpleCheckpoint
from great_expectations.exceptions import DataContextError
context = ge.data_context.DataContext()
# Note that if you modify this batch request, you may save the new version as a .json file
# to pass in later via the --batch-request option
batch_request = {'datasource_name': 'impala_okh', 'data_connector_name': 'default_inferred_data_connector_name', 'data_asset_name': 'okh.okh_forecast_prod', 'limit': 1000}
# Feel free to change the name of your suite here. Renaming this will not remove the other one.
expectation_suite_name = "okh_forecast_prod"
try:
    suite = context.get_expectation_suite(expectation_suite_name=expectation_suite_name)
    print(f'Loaded ExpectationSuite "{suite.expectation_suite_name}" containing {len(suite.expectations)} expectations.')
except DataContextError:
    suite = context.create_expectation_suite(expectation_suite_name=expectation_suite_name)
    print(f'Created ExpectationSuite "{suite.expectation_suite_name}".')
validator = context.get_validator(
    batch_request=BatchRequest(**batch_request),
    expectation_suite_name=expectation_suite_name
)
column_names = [f'"{column_name}"' for column_name in validator.columns()]
print(f"Columns: {', '.join(column_names)}.")
validator.head(n_rows=5, fetch_all=False)
With this code, calling
validator.expect_compound_columns_to_be_unique(['column1', 'column2'])
produces the following error:
MetricResolutionError: Cannot compile Column object until its 'name' is assigned.
How can I solve this problem?
小智 0
By writing validator.expect_compound_columns_to_be_unique(['column1', 'column2']), you are checking whether the tuples formed from the elements of 'column1' and 'column2' are unique.
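To make the "unique tuples" idea concrete, here is a minimal sketch in plain pandas rather than Great Expectations. The sample data is made up for illustration, and 'column1' and 'column2' are just the placeholder names from the question. It only mirrors what the expectation checks and is not how the library implements it.

```python
import pandas as pd

# Hypothetical sample data; 'column1' and 'column2' are the placeholder names from the question.
df = pd.DataFrame({
    "column1": ["a", "a", "b", "b"],
    "column2": [1, 2, 1, 1],
})

# keep=False marks every row whose (column1, column2) pair occurs more than once.
duplicate_mask = df.duplicated(subset=["column1", "column2"], keep=False)
duplicate_rows = df[duplicate_mask]

# The compound key is unique only if no pair is repeated.
is_compound_unique = not duplicate_mask.any()
print(is_compound_unique)
print(duplicate_rows)
```

Here neither column is unique on its own, and the pair ('b', 1) appears twice, so the compound check fails as well.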
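From the error you received, it seems that 'column1' and 'column2' are not defined in your data. You should verify the actual names of the columns in your data: since the data you provide through batch_request only contains the columns 'datasource_name', 'data_connector_name', 'data_asset_name' and 'limit', you should validate a subset of these columns.
For example: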
validator.expect_compound_columns_to_be_unique(['datasource_name', 'data_connector_name'])
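One way to verify the column names before calling the expectation is to compare the names you intend to pass against the list of columns the validator reports. The sketch below is only an illustration: find_missing_columns is a hypothetical helper, and the available_columns list is an invented stand-in for what validator.columns() returns in the question's code.

```python
def find_missing_columns(requested, available):
    """Return the requested column names that are not in `available`, preserving order."""
    available_set = set(available)
    return [name for name in requested if name not in available_set]


# Stand-in for the list returned by validator.columns() in the question;
# these column names are hypothetical.
available_columns = ["forecast_date", "store_id", "value"]

missing = find_missing_columns(["column1", "store_id"], available_columns)
print(missing)
```

If the returned list is non-empty, the expectation is being asked about columns that the batch does not contain, and the names should be corrected first.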