Given a simple CSV file:
A,B,C
Hello,Hi,0
Hola,Bueno,1
Obviously, the real dataset is far more complex than this, but this one reproduces the error. I'm trying to build a random forest classifier for it, like so:
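import pandas as pd
from sklearn.ensemble import RandomForestClassifier

cols = ['A', 'B', 'C']
col_types = {'A': str, 'B': str, 'C': int}
test = pd.read_csv('test.csv', dtype=col_types)
train_y = test['C'] == 1  # boolean target: True where C is 1
train_x = test[cols]      # features still contain the raw string columns A and B
clf_rf = RandomForestClassifier(n_estimators=50)
clf_rf.fit(train_x, train_y)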
But I get this traceback when calling fit():
ValueError: could not convert string to float: 'Bueno'
The scikit-learn version is 0.16.1.
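
For what it's worth, the message is the same one Python's built-in float() raises on a non-numeric string, which suggests that fit() tries to convert every feature value to a float. Below is a minimal, standard-library-only sketch of that failure and of one possible workaround: mapping each distinct string to an integer code before fitting. The helper name ordinal_encode is hypothetical, the CSV is inlined so the snippet runs without test.csv, and leaving the label column C out of the features is my own assumption, not part of the code above.

```python
import csv
import io

# Same data as the CSV above, inlined so the sketch runs without test.csv.
raw = "A,B,C\nHello,Hi,0\nHola,Bueno,1\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# The built-in float() fails on the same value named in the traceback.
try:
    float(rows[1]["B"])
except ValueError as exc:
    message = str(exc)


def ordinal_encode(values):
    """Hypothetical helper: map each distinct string to an integer code,
    assigned in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


# Assumption: only A and B are used as features; C is the label.
feature_cols = ["A", "B"]
encoded = {col: ordinal_encode([r[col] for r in rows]) for col in feature_cols}

# Rebuild row-wise numeric features and the boolean target.
train_x = [list(row) for row in zip(*(encoded[c] for c in feature_cols))]
train_y = [r["C"] == "1" for r in rows]

print(message)
print(train_x, train_y)
```

Integer codes like these impose an arbitrary ordering on the categories. With pandas, one-hot encoding the string columns (for example via pd.get_dummies) is a common alternative that avoids this, at the cost of more columns.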