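I want to build a program that performs sentiment classification with TF-IDF and an SVM. Before classifying the data, I have to split the dataset into training and test data using stratified K-fold. I store the texts (X) and the labels (Y) in NumPy arrays.
But it ends with this error: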
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multiclass-multioutput' instead.
This code runs on Python 3.7:
import csv

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

# Read the label column (second field) from the CSV file
labels = []
with open(path, encoding='utf-8') as in_file:
    data = csv.reader(in_file)
    for line in data:
        labels.append(line[1])
label_np = np.array(labels)
lp = label_np.reshape(20,20)
# lp = label_np.transpose(0)
# print(lp)
result_preprocess_np = np.array(result_preprocess)
hp = result_preprocess_np.reshape(20,20)
model = LinearSVC(multi_class='crammer_singer')
total_svm = []
total_mat_svm = np.zeros((20,20))
kf = StratifiedKFold(n_splits=3)
kf.get_n_splits(hp, lp)
for train_index, test_index in kf.split(hp, …
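
For context, StratifiedKFold stratifies on a one-dimensional label array, so labels reshaped to (20,20) are read as a multi-output target, which matches the error above. The following is only a minimal sketch of that behavior and of one possible way around it, not the original program: `texts` and `labels` are made-up toy data standing in for `result_preprocess` and the CSV labels, and fitting the TF-IDF vectorizer inside each fold is an assumption about how the features would be built.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC
from sklearn.utils.multiclass import type_of_target

# Hypothetical toy data; the real texts and labels come from the asker's files.
texts = np.array([
    "good movie", "great film", "loved the plot", "really good acting",
    "bad movie", "awful film", "hated the plot", "really bad acting",
    "average movie", "ordinary film", "plain plot", "acting was okay",
])
labels = np.array(["pos"] * 4 + ["neg"] * 4 + ["neu"] * 4)

# A 2-D label array is treated as several outputs per sample,
# while the flat 1-D array is an ordinary multiclass target.
target_2d = type_of_target(labels.reshape(3, 4))
target_1d = type_of_target(labels)
print(target_2d, target_1d)

skf = StratifiedKFold(n_splits=3)
scores = []
# X is only used for its length here, so a 1-D array of raw texts is enough.
for train_index, test_index in skf.split(texts, labels):
    # Fit TF-IDF on the training fold only, then reuse it for the test fold.
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(texts[train_index])
    X_test = vectorizer.transform(texts[test_index])

    model = LinearSVC(multi_class="crammer_singer")
    model.fit(X_train, labels[train_index])
    scores.append(model.score(X_test, labels[test_index]))

print(len(scores), float(np.mean(scores)))
```

In this sketch one row of `texts` corresponds to one entry of `labels`, and neither array is reshaped before being passed to `split`. Whether that layout fits the real data depends on what `result_preprocess` contains, which the code shown here does not reveal.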