Wfe*_*fee 5 python machine-learning flask scikit-learn
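I wrote a Flask program that takes the user's input (length and width) and predicts the type of fish, but when I submit the values it shows the following warning: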
UserWarning: X does not have valid feature names, but LogisticRegression was fitted
with feature names
# Training script
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('Fish.csv')
df.head()

# Features are every column except the target column 'Species'
X = df.drop('Species', axis=1)
y = df['Species']
cols = X.columns
index = X.index

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# X_train is a DataFrame, so the fitted model remembers its column names
random = RandomForestClassifier()
random.fit(X_train, y_train)

y_pred = random.predict(X_test)
score = accuracy_score(y_test, y_pred)
# Create a pickle file containing the fitted model
import pickle

with open("model.pkl", "wb") as pickle_out:
    pickle.dump(random, pickle_out)

# Predicting from a plain nested list (no column names) triggers the warning
random.predict([[242.0, 23.2, 25.4, 30.0, 11.5200, 4.0200]])
# Flask application
import numpy as np
import pickle
import pandas as pd
from flask import Flask, request, jsonify, render_template

app = Flask(__name__)

# Load the model saved by the training script
with open("model.pkl", "rb") as pickle_in:
    random = pickle.load(pickle_in)


@app.route('/')
def home():
    return render_template('index.html')


@app.route('/predict', methods=["POST"])
def predict():
    """
    For rendering results on HTML GUI
    """
    # Form values arrive as strings, so convert them to floats
    features = [float(x) for x in request.form.values()]
    final_features = [np.array(features)]
    prediction = random.predict(final_features)
    return render_template('index.html', prediction_text='The fish belongs to species {}'.format(prediction[0]))


if __name__ == '__main__':
    app.run()
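I ran into the same warning: UserWarning: X does not have valid feature names, but LogisticRegression was fitted with feature names.
The warning means that when the model was fitted with model.fit(), the DataFrame X_train carried feature (column) names, but the data you pass for prediction (for example a NumPy array, or a DataFrame converted to a row vector) does not carry feature names for the sample you want to predict.
[Image: example illustrating the warning]
I hope this helps beginners who are using a model to make predictions on unseen data.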
小智 4
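Your X and y are pandas objects (X is a DataFrame and y is a Series). Convert them to NumPy arrays before fitting the random forest classifier, for example: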
X = X.values
y = y.values
Then do the train/test split:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=0)
Now fit the model (this code is unchanged):
from sklearn.ensemble import RandomForestClassifier
random = RandomForestClassifier()
random.fit(X_train,y_train)
y_pred=random.predict(X_test)
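In the Flask app you pass the input as a NumPy array, but during training you used a pandas DataFrame; that mismatch is what raised the warning. With the model trained on arrays, it should now work without the warning.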
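As a rough check of this approach, the sketch below trains on `.values` and confirms that no feature names are stored, so predicting from a plain list stays quiet. The small table is invented stand-in data, not Fish.csv, and `n_estimators=10` and `random_state=0` are arbitrary choices made only to keep the example small and repeatable.

```python
import warnings

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Made-up stand-in for Fish.csv.
train = pd.DataFrame({
    "Length": [23.2, 24.0, 25.4, 9.8, 10.5, 11.2],
    "Width": [4.0, 4.3, 4.7, 1.4, 1.5, 1.6],
    "Species": ["Bream", "Bream", "Bream", "Smelt", "Smelt", "Smelt"],
})

# .values drops the column names, so the model is fitted on bare arrays.
X = train.drop("Species", axis=1).values
y = train["Species"].values

forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Record any warnings raised while predicting from a plain nested list.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    prediction = forest.predict([[24.5, 4.2]])

feature_name_warnings = [
    str(w.message) for w in caught if "feature names" in str(w.message)
]
has_names = hasattr(forest, "feature_names_in_")

print(prediction[0], has_names, feature_name_warnings)
```

The trade-off is that the model no longer knows the column names, so keeping the input values in the same order as the training columns becomes the caller's responsibility.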