The first argument to `Layer.call` must always be passed

Sid*_*ava 5 python machine-learning keras tensorflow2.0

import os
from pylab import rcParams
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns; sns.set()
from numpy import *
from scipy import stats
from pandas.plotting import scatter_matrix
import sklearn
import warnings
from imblearn.over_sampling import SMOTE
import tensorflow as tf
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV
from imblearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression


data = pd.read_excel(r'Attrition Data Exercise.xlsx')
X = data.iloc[:, 3:-1].values
y = data.iloc[:, -1].values

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import OrdinalEncoder
ct = ColumnTransformer(transformers=
                       [('one_encoder', OneHotEncoder(), [2, 5, 11, 13, 28]),
                       ('ord_encoder', OrdinalEncoder(), [0])],
                       remainder='passthrough')
X = np.array(ct.fit_transform(X))

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dropout(rate=0.3))
ann.add(tf.keras.layers.Dense(units=6, activation='relu', kernel_regularizer='l1', bias_regularizer='l2'))
ann.add(tf.keras.layers.Dropout(rate=0.3))
ann.add(tf.keras.layers.Dense(units=3, activation='relu', kernel_regularizer='l1', bias_regularizer='l2'))
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(
    learning_rate=0.001,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-08)
ann.compile(optimizer = opt, loss = 'binary_crossentropy', metrics = ['accuracy', tf.keras.metrics.Recall()])

The code above runs successfully. When I run the following code in a cell, it raises an error.

pipe = Pipeline([('smt', SMOTE()), ('model', KerasClassifier(build_fn = ann, verbose = 0, epochs=170))])
weights = np.linspace(0.5, 0.5, 1)
gsc = GridSearchCV(
    estimator = pipe,
    param_grid = {
        'smt__sampling_strategy' : weights
    },
    scoring = 'f1',
    cv = 4)
grid_result = gsc.fit(X_train, y_train)

The code above produces the following error:

ValueError: The first argument to `Layer.call` must always be passed

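Any idea what I might be doing wrong or what could be improved? I also tried replacing KerasClassifier with KerasRegressor just to see whether anything changed, but nothing did. What is fundamentally going wrong?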

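I am trying to use imblearn's Pipeline class together with GridSearchCV to find the best parameters for classifying an imbalanced dataset. I want to resample only the training set and leave the validation set untouched, which imblearn's Pipeline appears to do. However, I get an error when implementing the accepted solution.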

A link to a screenshot of the complete error trace is also attached.

小智 5

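@danr is right. Many thanks to him. I hit the same error when using KerasClassifier with sklearn's cross_val_score. Wrapping the build_fn argument in a lambda solved the problem. I have a function create_model that builds a Keras Sequential model. Corrected code that runs smoothly (tensorflow 2.4.1):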

from sklearn.model_selection import cross_val_score
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Create a KerasClassifier using best params determined using RandomizedSearchCV above.
# build_fn must be a callable that returns a model, hence the lambda.
model = KerasClassifier(
    build_fn = lambda: create_model(learning_rate = 0.01, activation = 'tanh'),
    epochs = 50, batch_size = 32, verbose = 0)

# Calculate the accuracy score for each fold
kfolds = cross_val_score(model, X, y, cv = 3)
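
Applied to the question's code, the same idea could look like the sketch below. As far as I can tell, the legacy wrapper expects build_fn to be a function that returns a compiled model. When it receives an already-built model such as ann, it ends up calling that model with no inputs, which would explain the `Layer.call` message. The name build_ann is my own (it is not in the question), and the sketch assumes TensorFlow 2.x with the tensorflow.keras.wrappers.scikit_learn module available, as in the version mentioned above.

```python
def build_ann():
    """Build and compile a fresh copy of the question's network.

    `build_ann` is a hypothetical name. The layers, optimizer settings and
    metrics are copied from the question unchanged.
    """
    # Imported inside the factory so that defining it is cheap;
    # assumes TensorFlow 2.x is installed.
    import tensorflow as tf

    ann = tf.keras.models.Sequential()
    ann.add(tf.keras.layers.Dropout(rate=0.3))
    ann.add(tf.keras.layers.Dense(units=6, activation='relu',
                                  kernel_regularizer='l1', bias_regularizer='l2'))
    ann.add(tf.keras.layers.Dropout(rate=0.3))
    ann.add(tf.keras.layers.Dense(units=3, activation='relu',
                                  kernel_regularizer='l1', bias_regularizer='l2'))
    ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

    opt = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9,
                                   beta_2=0.999, epsilon=1e-08)
    ann.compile(optimizer=opt, loss='binary_crossentropy',
                metrics=['accuracy', tf.keras.metrics.Recall()])
    return ann


# Usage with the question's pipeline (shown as a comment, not executed here):
# pipe = Pipeline([('smt', SMOTE()),
#                  ('model', KerasClassifier(build_fn=build_ann,
#                                            verbose=0, epochs=170))])
```

Passing the function itself (build_ann, without parentheses) rather than a model instance also means each cross-validation fold should get a freshly initialised network instead of reusing weights trained on a previous fold.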