This post is about the difference between LogisticRegressionCV, GridSearchCV and cross_val_score. Consider the following setup:
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import (train_test_split, GridSearchCV,
                                     StratifiedKFold, cross_val_score)
from sklearn.metrics import confusion_matrix
read = load_digits()
X, y = read.data, read.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)
In penalized logistic regression we need to set the parameter C that controls the amount of regularization. There are three ways in scikit-learn to find the best C through cross-validation. The first is LogisticRegressionCV:
clf = LogisticRegressionCV(Cs=10, penalty="l1",
                           solver="saga", scoring="f1_macro")
clf.fit(X_train, y_train)
confusion_matrix(y_test, clf.predict(X_test))
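For comparison, the other two approaches could look roughly like the sketch below. It reuses the imports and the train/test split from the setup above; the C grid (the same logarithmic grid that LogisticRegressionCV(Cs=10) builds by default), cv=5 and max_iter=5000 are my own choices for illustration, not taken from the original post.

# Approach 2: GridSearchCV -- exhaustive search over an explicit C grid.
param_grid = {"C": np.logspace(-4, 4, 10)}  # same grid LogisticRegressionCV(Cs=10) uses by default
gs = GridSearchCV(LogisticRegression(penalty="l1", solver="saga", max_iter=5000),
                  param_grid, scoring="f1_macro", cv=5)
gs.fit(X_train, y_train)
print(gs.best_params_)

# Approach 3: cross_val_score -- scores a single, fixed C per call, so the loop
# over the grid and the selection of the best C have to be written by hand.
cv_scores = [cross_val_score(LogisticRegression(C=c, penalty="l1", solver="saga",
                                                max_iter=5000),
                             X_train, y_train, scoring="f1_macro", cv=5).mean()
             for c in np.logspace(-4, 4, 10)]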
Side note: the documentation states that SAGA and LIBLINEAR are the only optimizers that support the L1 penalty, and that SAGA is faster on large datasets. Unfortunately, warm starting is only available for Newton-CG and LBFGS.
clf = LogisticRegression(penalty="l1", solver="saga", warm_start=True)
clf …

Tags: python, machine-learning, scikit-learn, cross-validation, hyperparameters

I need to select half of a dataframe using groupby, where the size of each group is unknown and may differ from group to group. For example:
index summary participant_id
0 130599 17.0 13
1 130601 18.0 13
2 130603 16.0 13
3 130605 15.0 13
4 130607 15.0 13
5 130609 16.0 13
6 130611 17.0 13
7 130613 15.0 13
8 130615 17.0 13
9 130617 17.0 13
10 86789 12.0 14
11 86791 8.0 14
12 86793 21.0 14
13 86795 19.0 14
14 86797 20.0 14
15 86799 9.0 14
16 86801 10.0 14
20 107370 1.0 15
21 …
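One hedged reading of "half of each group" is "the first half of every group, whatever its size": compare each row's position inside its group with half of that group's size. The sketch below only illustrates that interpretation; the DataFrame df is built from a few of the sample rows above, and the floor division // 2 (rounding the half down) is my assumption.

import pandas as pd

# A small frame in the shape shown above (values copied from the sample rows).
df = pd.DataFrame({
    "index": [130599, 130601, 130603, 130605, 86789, 86791, 86793],
    "summary": [17.0, 18.0, 16.0, 15.0, 12.0, 8.0, 21.0],
    "participant_id": [13, 13, 13, 13, 14, 14, 14],
})

pos = df.groupby("participant_id").cumcount()                      # 0-based position within each group
size = df.groupby("participant_id")["summary"].transform("size")   # size of the row's group
first_half = df[pos < size // 2]                                    # keep the first half of every group
print(first_half)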

This question has been discussed at stackoverflow.com/q/2391679. A typical example of the virtual feature is
#include <iostream>
#include <string>
using std::cout;
using std::string;

class Shape
{
public:
virtual string draw() = 0;
};
class Circle : public Shape
{
public:
string draw() { return "Round"; }
};
class Rectangle : public Shape
{
public:
string draw() { return "Flat"; }
};
void print (Shape& obj)
{
cout << obj.draw();
}
However, we can instead take the parameter as auto (in C++14 this works for generic lambdas; ordinary functions accept auto parameters only from C++20, or as a compiler extension):
class Circle
{
public:
string draw() { return "Round"; }
};
class Rectangle
{
public:
string draw() { return "Flat"; }
};
void print …

Answers to similar questions on Stack Overflow suggest changing the parameter values of the SVR() instance, but I don't know how to go about it.
Here is the code I am using:
import json
import numpy as np
from sklearn.svm import SVR
f = open('training_data.txt', 'r')
data = json.loads(f.read())
f.close()
f = open('predict_py.txt', 'r')
data1 = json.loads(f.read())
f.close()
features = []
response = []
predict = []
for row in data:
    a = [
        row['star_power'],
        row['view_count'],
        row['like_count'],
        row['dislike_count'],
        row['sentiment_score'],
        row['holidays'],
        row['clashes'],
    ]
    features.append(a)
    response.append(row['collection'])
for row in data1:
    a = [
        row['star_power'],
        row['view_count'],
        row['like_count'],
        row['dislike_count'],
        row['sentiment_score'],
        row['holidays'],
        row['clashes'],
    ]
    predict.append(a)
X = np.array(features).astype(float)
Y = np.array(response).astype(float)
predict = np.array(predict).astype(float)
svm …
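On the actual question of which SVR() parameters to change: the ones that usually matter are kernel, C, epsilon and gamma, and a common way to pick them is a small cross-validated grid search. The sketch below reuses X, Y and predict from the code above; the grid values, cv=3 and the StandardScaler step are my own choices for illustration, not something taken from the original post.

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Scale the features (SVR is sensitive to feature scale), then search a small grid.
pipe = make_pipeline(StandardScaler(), SVR())
param_grid = {
    "svr__kernel": ["rbf", "linear"],
    "svr__C": [0.1, 1, 10, 100],
    "svr__epsilon": [0.01, 0.1, 1],
    "svr__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipe, param_grid, cv=3)   # default scoring for a regressor is R^2
search.fit(X, Y)
print(search.best_params_)
predictions = search.predict(predict)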