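I am new to machine learning and scikit-learn.
My problem:
(Please correct any misconceptions.)
I have a big JSON dataset, which I retrieve and store in the trainList variable.
I preprocess it so that I can work with it.
Once that is done, I start the classification:
Code:
My current variables: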
trainList  # a list holding all the records of my dataset, in JSON form
labelList  # a list holding the label of each record
Main part of the method:
# Transform the data from JSON form into a numerical matrix
X = vec.fit_transform(trainList)
# Scale the matrix (I don't know why, but without this step I get an error)
X = preprocessing.scale(X.toarray())
# Create a KFold object for cross-validation
kf …
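
For reference, the steps described above (vectorize, scale, cross-validate with KFold) could be sketched roughly as follows. This is only a sketch under assumptions that the post does not confirm: `vec` is taken to be a `DictVectorizer`, and the sample records, the `LogisticRegression` classifier and the 3-fold setting are hypothetical placeholders. It uses `StandardScaler(with_mean=False)`, which can scale the sparse output of the vectorizer without calling `.toarray()`.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-ins for trainList / labelList: parsed JSON records
# (one dict per sample) and one label per record.
trainList = [
    {"color": "red", "size": 1.0},
    {"color": "blue", "size": 2.5},
    {"color": "red", "size": 0.5},
    {"color": "blue", "size": 3.0},
    {"color": "red", "size": 1.5},
    {"color": "blue", "size": 2.0},
]
labelList = [0, 1, 0, 1, 0, 1]

# Chain the steps so that the vectorizer and scaler are re-fitted on the
# training part of every fold, instead of on the whole dataset up front.
model = make_pipeline(
    DictVectorizer(),                  # assumption: this is what `vec` is
    StandardScaler(with_mean=False),   # scales sparse input without densifying
    LogisticRegression(),              # assumption: placeholder classifier
)

# 3 folds chosen only because the toy dataset is tiny.
kf = KFold(n_splits=3)

# One accuracy score per fold.
scores = cross_val_score(model, trainList, labelList, cv=kf)
print(scores)
```

Wrapping the preprocessing in a pipeline is a design choice rather than a requirement: it keeps statistics computed on the test fold (feature names, variances) from leaking into training.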