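I want to view the loss curves for the training data and the test data side by side. Currently, getting the training-set loss at each iteration seems straightforward using clf.loss_curve_ (see below).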
from sklearn.neural_network import MLPClassifier
clf = MLPClassifier()
clf.fit(X,y)
clf.loss_curve_ # this seems to have loss for the training set
However, I would also like to plot performance on the test dataset. Is this available?
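One possible approach, offered as a minimal sketch rather than a built-in feature: as far as I know, MLPClassifier records only the training loss in loss_curve_, so a test-set curve has to be computed by hand. The sketch below calls partial_fit once per epoch and evaluates log_loss on both splits after each call. The synthetic data, n_epochs, and hidden_layer_sizes are assumptions made for illustration; none of them come from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption: the question does not show X or y).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical settings chosen only to keep the example small.
clf = MLPClassifier(hidden_layer_sizes=(32,), random_state=0)
classes = np.unique(y_train)
n_epochs = 30

train_losses, test_losses = [], []
for epoch in range(n_epochs):
    # Each partial_fit call performs one pass over the training data.
    # The classes argument is required on the first call.
    clf.partial_fit(X_train, y_train, classes=classes)
    # Evaluate cross-entropy on both splits with the current weights.
    train_losses.append(
        log_loss(y_train, clf.predict_proba(X_train), labels=classes)
    )
    test_losses.append(
        log_loss(y_test, clf.predict_proba(X_test), labels=classes)
    )
```

The two lists can then be drawn on the same axes, for example with matplotlib's plt.plot(train_losses) and plt.plot(test_losses). The values will not match clf.loss_curve_ exactly, because the loss the estimator tracks during training is averaged over mini-batches and includes the L2 penalty term, while log_loss here is the plain cross-entropy computed after each epoch.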
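I would like to know how to run multi-class, multi-label, ordinal classification with sklearn. I want to predict the ranking of target groups, ranging from the most prevalent group at a given location (1) to the least prevalent (7). I can't seem to get it right. Could you help me?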
# Random Forest Classification
# Import
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.metrics import make_scorer, accuracy_score, confusion_matrix, f1_score
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
# Import dataset
dataset = pd.read_excel('alle_probs_edit.v2.xlsx')
X = dataset.iloc[:,4:-1].values
Y = dataset.iloc[:,-1].values
# Split in Train and Test
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42 )
# Scaling the features (all variables to …
Tags: python, ordinal, scikit-learn, multilabel-classification, multiclass-classification
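For the ordinal part of the question, one commonly used workaround is to decompose the ranks into a series of binary "is the rank greater than k?" problems (the Frank and Hall style of ordinal classification). To my knowledge scikit-learn has no dedicated ordinal classifier, so this is a swapped-in technique and not the asker's method. The sketch below handles a single ordinal target only, not the multi-label case. The helper names fit_ordinal and predict_ordinal and the synthetic data are hypothetical, and the code assumes every rank occurs in the training data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier


def fit_ordinal(base_estimator, X, y, ranks):
    """Fit one binary model per threshold k, each estimating P(y > k)."""
    models = []
    for k in ranks[:-1]:
        model = clone(base_estimator)
        model.fit(X, (y > k).astype(int))
        models.append(model)
    return models


def predict_ordinal(models, X, ranks):
    """Combine the threshold models into per-rank probabilities."""
    # Column j holds the estimated P(y > ranks[j]).
    p_greater = np.column_stack(
        [m.predict_proba(X)[:, list(m.classes_).index(1)] for m in models]
    )
    n = X.shape[0]
    # P(y = r_j) = P(y > r_{j-1}) - P(y > r_j), with P(y > r_0) = 1
    # and P(y > r_K) = 0 at the two ends.
    upper = np.hstack([np.ones((n, 1)), p_greater])
    lower = np.hstack([p_greater, np.zeros((n, 1))])
    probs = upper - lower
    # Differences can be slightly negative when thresholds disagree;
    # argmax still picks the most plausible rank.
    return np.asarray(ranks)[probs.argmax(axis=1)], probs


# Synthetic data with seven ordered ranks (assumption, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(700, 5))
latent = X[:, 0] + 0.5 * rng.normal(size=700)
edges = np.quantile(latent, np.linspace(0, 1, 8)[1:-1])
y = np.digitize(latent, edges) + 1  # ranks 1 (lowest) to 7 (highest)

ranks = list(range(1, 8))
base = RandomForestClassifier(n_estimators=50, random_state=0)
models = fit_ordinal(base, X, y, ranks)
pred, probs = predict_ordinal(models, X, ranks)
```

Because each binary model respects the ordering of the ranks, mistakes tend to fall on neighbouring ranks, which a plain multi-class RandomForestClassifier does not encourage. Evaluating with a distance-aware metric such as mean absolute error on the ranks would fit this setup better than accuracy alone.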