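I have been working on a binary classification problem with scikit-learn's SVM. I computed features from audio files and wrote them to a CSV file. This is what each row of the CSV file looks like: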
"13_10 The Long And Winding Road " "[-6.5633095666136669e-16,-1.56E-15,-3.21E-15,-2.20E-
15,-2.52E-15,-3.04E-15,-3.39E-15,-3.47E-15,-3.07E-15,-6.02E-15,-3.00E-15,-4.77E-15,-3.05E-
15,-2.13E-15,-1.57E-15,-1.87E-15,-2.05E-15,-1.76E-15,-1.38E-15,-9.89E-16,-7.89E-16,-8.99E-
16,-1.09E-15,-7.26E-16,-8.68E-16,-4.68E-16,-2.82E-16,-1.99E-16,-1.75E-16,-2.18E-16,-1.43E-
16,-1.56E-16,-1.91E-16,-1.21E-16,-4.82E-17,-4.39E-17,-2.89E-17,-2.05E-17,0.0]" 0
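The first column holds the name of the audio file, the second column holds the feature array, and the last column is the label {0,1} used for binary classification.
The array contains 39 floating-point values. I use the following code to extract them from the CSV file.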
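import csv
import numpy as np
from sklearn import svm

# Read every row of the space-delimited CSV file
with open('File.csv', 'rb') as csvfile:
    albumreader = csv.reader(csvfile, delimiter=' ')
    data = list()
    for row in albumreader:
        data.append(row[0:])
data = np.array(data)

X_train = list()
Y_train = list()
k = data.shape[0]
for i in range(k):
    # Column 1 holds the bracketed feature string; slice it and split on commas
    feature = data[i][1]
    x = map(float, feature[1:-2].split(','))
    X_train.append(x)
    # Column 2 holds the class label
    label = data[i][2]
    y = float(label)
    Y_train.append(y)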
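For comparison, here is a minimal sketch of the same extraction step written for Python 3. It assumes each row has exactly three space-delimited fields (quoted name, quoted bracketed feature list, label), as in the sample row above. The helper name `load_features` and the shortened two-row sample are hypothetical and exist only for illustration. Parsing the bracketed string with `json.loads` avoids slicing off the brackets by hand.

```python
import csv
import io
import json

import numpy as np


def load_features(csvfile):
    """Parse rows of the form: "name" "[f1,f2,...]" label (hypothetical helper)."""
    X, y = [], []
    for row in csv.reader(csvfile, delimiter=' '):
        if not row:
            continue  # skip blank lines
        feature, label = row[1], row[2]
        # json.loads parses the whole bracketed list, exponents included
        X.append([float(v) for v in json.loads(feature)])
        y.append(int(label))
    return np.array(X, dtype=float), np.array(y)


# Made-up sample with 3 features per row instead of 39
sample = io.StringIO(
    '"song a " "[-6.5e-16,-1.56E-15,0.0]" 0\n'
    '"song b " "[1.0,2.0,3.0]" 1\n'
)
X_train, Y_train = load_features(sample)
print(X_train.shape, Y_train)
```

With a real file, the `io.StringIO` object would be replaced by a file opened in text mode, for example `open('File.csv', newline='')`.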
So when I print X_train and Y_train, I get the exact values in the arrays. But when I use
clf = svm.SVC(C=1.0, cache_size=200,kernel='linear', max_iter=-1)
clf.fit(X_train,Y_train)
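Since the traceback below is cut off, the actual cause cannot be read from it. One thing that may be worth checking before calling `fit` is whether the training data really forms a 2-D numeric array with one row per sample. The sketch below is only an assumption about what could be verified, not a diagnosis. `check_training_arrays` is a hypothetical helper, and the default of 39 features comes from the description above.

```python
import numpy as np


def check_training_arrays(X, y, n_features=39):
    """Convert to arrays and verify the shapes a classifier's fit() expects."""
    # np.asarray with dtype=float raises ValueError if rows differ in length
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if X.ndim != 2 or X.shape[1] != n_features:
        raise ValueError("expected shape (n_samples, %d), got %s"
                         % (n_features, X.shape))
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y have different numbers of samples")
    return X, y


# Made-up data: two samples with 39 features each
X_ok, y_ok = check_training_arrays([[0.0] * 39, [1.0] * 39], [0, 1])
print(X_ok.shape, y_ok.shape)
```

If the check passes, the returned arrays can be handed to `clf.fit` in place of the plain lists.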
I get an error saying
Traceback (most recent call last):
File …