UserWarning: X does not have valid feature names, but LogisticRegression was fitted with feature names

Wfe*_*fee 5 python machine-learning flask scikit-learn

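I wrote a Flask program that takes user input (length and width measurements) to predict the species of a fish, but when I submit the input it shows the following warning: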

UserWarning: X does not have valid feature names, but LogisticRegression was fitted with feature names

My training script:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

df=pd.read_csv('Fish.csv')
df.head()

X = df.drop('Species', axis=1)
y = df['Species']

cols = X.columns
index = X.index

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=0)

from sklearn.ensemble import RandomForestClassifier
random=RandomForestClassifier()
random.fit(X_train,y_train)
y_pred=random.predict(X_test)

from sklearn.metrics import accuracy_score
score=accuracy_score(y_test,y_pred)

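# The post never shows where logistic_model is defined. Assumption: it is a
# LogisticRegression fitted on the same DataFrame split as above.
logistic_model = LogisticRegression(max_iter=1000)
logistic_model.fit(X_train, y_train)

# Create a pickle file
import pickle
with open("model.pkl", "wb") as pickle_out:
    pickle.dump(logistic_model, pickle_out)

# Predicting from a plain list (no column names) is what triggers the warning
logistic_model.predict([[242.0,23.2,25.4,30.0,11.5200,4.0200]])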

import numpy as np
import pickle
import pandas as pd
from flask import Flask, request, jsonify, render_template

app=Flask(__name__)
pickle_in = open("model.pkl","rb")
random = pickle.load(pickle_in)

@app.route('/')
def home():
    return render_template('index.html')


@app.route('/predict',methods=["POST"])
def predict():
    """
    For rendering results on HTML GUI
    """
    int_features = [float(x) for x in request.form.values()]  # form fields arrive as strings
    final_features = [np.array(int_features)]
    prediction = random.predict(final_features)
    return render_template('index.html', prediction_text = 'The fish belongs to species {}'.format(str(prediction)))

if __name__=='__main__':
    app.run()

Dataset: https://www.kaggle.com/datasets/aungpyaeap/fish-market

Abh*_*apa 6

I faced the same warning: UserWarning: X does not have valid feature names, but LogisticRegression was fitted with feature names.

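What this warning is actually saying is that when the model was fitted with model.fit(), the DataFrame X_train had feature (column) names, but the input you passed to predict(), such as a plain list or a NumPy array reshaped into a row vector, carried no feature names for the sample you want to predict.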

For a concrete illustration of what I mean, see the example image: click here to view the example image

I hope this helps beginners who are using a model to make predictions on unseen data.
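In a web handler, one possible way to supply the names is to wrap the submitted values in a one-row DataFrame before calling predict(). The helper below is a sketch, not part of the original app: `form_to_frame` is a hypothetical name, and it assumes the form fields arrive in the same order as the training columns.

```python
import pandas as pd


def form_to_frame(form_values, feature_names):
    """Turn raw form strings into a one-row DataFrame with named columns.

    Assumes form_values are in the same order as feature_names.
    """
    values = [float(v) for v in form_values]  # form fields are strings
    names = list(feature_names)
    if len(values) != len(names):
        raise ValueError(
            f"expected {len(names)} values, got {len(values)}"
        )
    return pd.DataFrame([values], columns=names)


# Example with two made-up column names.
frame = form_to_frame(["242.0", "23.2"], ["Weight", "Length1"])
print(frame)
```

Inside the Flask `predict()` view this could be used as `final_features = form_to_frame(request.form.values(), random.feature_names_in_)`, since estimators fitted on a DataFrame with string column names expose `feature_names_in_` in scikit-learn 1.0 and later.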


小智 4

Your X and y are pandas objects (a DataFrame and a Series). Convert them to NumPy arrays before fitting the random forest classifier, e.g.,

X = X.values
y = y.values

Then do the train-test split,

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=0)

Now fit the model (same code as in the question),

from sklearn.ensemble import RandomForestClassifier
random = RandomForestClassifier()
random.fit(X_train,y_train)
y_pred=random.predict(X_test)

In the Flask app you pass the input as a NumPy array, but during training you used a pandas DataFrame; that mismatch is what raises the warning. Now it should work fine!
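The effect of training on NumPy arrays can be checked with a small sketch. The data below is made up and stands in for Fish.csv, and the `n_estimators` and `random_state` values are arbitrary choices for the example.

```python
import warnings

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for Fish.csv (values are made up).
df = pd.DataFrame(
    {
        "Length": [23.2, 24.0, 30.0, 31.5, 12.0, 13.1],
        "Width": [4.0, 4.3, 5.1, 5.4, 1.9, 2.1],
        "Species": ["Bream", "Bream", "Pike", "Pike", "Smelt", "Smelt"],
    }
)

# .values drops the column names, so the model never records them.
X = df.drop("Species", axis=1).values
y = df["Species"].values

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    prediction = clf.predict([[23.5, 4.1]])  # plain list, no names

feature_name_warnings = [
    str(w.message) for w in caught if "feature names" in str(w.message)
]
print(hasattr(clf, "feature_names_in_"))
print(feature_name_warnings)
```

The trade-off is that the model no longer knows the column names at all, so it is up to the caller to pass the values in exactly the order used during training.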