I created a pipeline in scikit-learn:
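from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('bow', CountVectorizer()),      # raw text -> token counts
    ('classifier', BernoulliNB()),   # naive Bayes on the token features
])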
and computed the accuracy using cross-validation:
from sklearn.model_selection import cross_val_score

scores = cross_val_score(pipeline,            # steps to convert raw messages into models
                         train_set,           # training data
                         label_train,         # training labels
                         cv=5,                # 5-fold cross-validation: 4 folds for training, 1 for scoring
                         scoring='accuracy',  # which scoring metric?
                         n_jobs=-1,           # -1 = use all cores = faster
                         )
How can I report a confusion matrix instead of 'accuracy'?
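One possible approach, offered as a sketch rather than a definitive answer: `cross_val_score` only returns one score per fold, so a confusion matrix is usually built from out-of-fold predictions obtained with `cross_val_predict` and then passed to `sklearn.metrics.confusion_matrix`. The toy `train_set` and `label_train` values and the `"ham"`/`"spam"` labels below are made up for illustration; the real data from the question is not shown.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the question's train_set / label_train.
train_set = [
    "win a free prize now", "free cash offer today", "claim your free reward",
    "cheap loans free money", "urgent prize waiting now",
    "see you at lunch", "meeting moved to monday", "can you call me later",
    "thanks for the notes", "are we still on tonight",
]
label_train = ["spam"] * 5 + ["ham"] * 5

pipeline = Pipeline([
    ("bow", CountVectorizer()),
    ("classifier", BernoulliNB()),
])

# Each sample is predicted by a model fitted on the folds that exclude it.
predicted = cross_val_predict(pipeline, train_set, label_train, cv=5)

# Rows are true labels, columns are predicted labels, in the order given by `labels`.
cm = confusion_matrix(label_train, predicted, labels=["ham", "spam"])
print(cm)
```

The resulting matrix aggregates all five folds into one table. If a per-fold matrix is needed instead, an alternative is to loop over the splits of a `StratifiedKFold` object and call `confusion_matrix` on each fold's predictions.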
I have a pandas DataFrame consisting of 0, 1, and -1.
import pandas as pd
df = pd.DataFrame({'indicator': [0, 0, 0, -1, 0, 0, 1, 0, -1, 1, 0, -1, 0, 1]})
I want to find the indices of each -1 and 1 such that the -1 is followed by zero or more 0s and then a +1. For the example above, I want to get
[(3, 6), (8, 9), (11, 13)]
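A minimal sketch of one way to get these pairs, assuming the pattern is "a -1, then any number of 0s, then a 1" and that a later -1 replaces an earlier unmatched one. The helper name `find_pairs` is hypothetical, and the function returns positions in the column, which match the index labels only for a default `RangeIndex`.

```python
import pandas as pd


def find_pairs(values):
    """Return (start, end) positions where a -1 is followed by only 0s and then a 1."""
    pairs = []
    start = None  # position of the most recent unmatched -1, if any
    for i, v in enumerate(values):
        if v == -1:
            start = i  # a new -1 supersedes any earlier unmatched one
        elif v == 1:
            if start is not None:
                pairs.append((start, i))
            start = None  # a 1 closes the current candidate either way
    return pairs


df = pd.DataFrame({'indicator': [0, 0, 0, -1, 0, 0, 1, 0, -1, 1, 0, -1, 0, 1]})
pairs = find_pairs(df['indicator'].tolist())
print(pairs)  # [(3, 6), (8, 9), (11, 13)]
```

A single pass keeps the logic easy to check by hand. For very long columns a vectorized version is possible, but the loop makes the matching rule explicit.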