TypeError: type str doesn't define __round__ method

Question

Rox*_*lia 5 python arrays numpy typeerror xgboost

While trying to use XGBoost to identify the most important variables, I ran into an error involving arrays.

My full code is below:

from numpy import loadtxt
from numpy import sort
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.feature_selection import SelectFromModel


df = pd.read_csv('data.txt')
array=df.values
X= array[:,0:330]
Y = array[:,330]

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)


model = XGBClassifier()
model.fit(X_train, y_train)


y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]

I get the following error:

TypeError: type str doesn't define __round__ method

What can I do?

Answer 1

Jam*_*mes 5

Most likely, some of the labels in y_train are actually strings rather than numbers. sklearn and xgboost do not require labels to be numeric.

Try checking the types of the values in y_pred.

from collections import Counter

Counter([type(value) for value in y_pred])

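As a rough illustration of that check, here is a minimal sketch. The `y_pred` array below is a hypothetical stand-in for `model.predict(X_test)`, built by hand as an object array of strings on the assumption that the label column was read in as text. The array's `dtype` and a tally of the element types should both make non-numeric labels visible.

```python
from collections import Counter

import numpy as np

# Hypothetical predictions standing in for model.predict(X_test);
# an object array of strings is an assumption about what the real data looks like.
y_pred = np.array(["yes", "no", "yes"], dtype=object)

# A dtype of object (or a unicode dtype such as '<U3') hints at non-numeric labels.
print(y_pred.dtype)

# Tally the Python type of every predicted value.
type_counts = Counter(type(value).__name__ for value in y_pred)
print(type_counts)
```

If the tally shows `str` where numbers were expected, the label column is the place to look.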

Here is an example of what I mean by numeric labels:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# test with numeric labels
x = np.vstack([np.arange(100), np.sort(np.random.normal(10, size=100))]).T
y = np.hstack([np.zeros(50, dtype=int), np.ones(50, dtype=int)])
model = GradientBoostingClassifier()
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a numeric label
array([0])

And here with string labels (same x data):

y = ['a']*50 + ['b']*50
model.fit(x,y)
model.predict([[10,7]])
# returns an array with a string label
array(['a'], dtype='<U1')

Both are valid labels. However, when you try to call round on a string value, you get the error you are seeing.

round('a')

TypeError: type str doesn't define __round__ method

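One possible way around the error, sketched under assumptions: the data below is synthetic (modeled on the example above, not the asker's file), and `GradientBoostingClassifier` stands in for `XGBClassifier`. Two options are shown. The first drops `round` altogether, since `predict` already returns class labels that `accuracy_score` can compare directly. The second uses scikit-learn's `LabelEncoder` to map string labels to integers before fitting, so the predictions are numeric.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import LabelEncoder

# Synthetic data (an assumption for illustration, not the asker's dataset).
rng = np.random.default_rng(0)
x = np.vstack([np.arange(100), np.sort(rng.normal(10, size=100))]).T
y = np.array(["a"] * 50 + ["b"] * 50)

# Option 1: skip round(); predicted class labels can be scored as they are.
model = GradientBoostingClassifier(random_state=0)
model.fit(x, y)
y_pred = model.predict(x)
acc_str = accuracy_score(y, y_pred)

# Option 2: encode the string labels as integers before fitting.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)  # classes are sorted, so 'a' -> 0 and 'b' -> 1
model.fit(x, y_encoded)
y_pred_int = model.predict(x)

# round() no longer raises, because the predictions are integers.
predictions = [round(value) for value in y_pred_int]

# Map the integer predictions back to the original string labels if needed.
decoded = encoder.inverse_transform(y_pred_int)
```

Which option fits depends on whether the string labels are intentional; if the column was meant to be numeric, fixing how the file is parsed may be the better route.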

Archived:	8 years, 1 month ago
Views:	30665
Last recorded:	4 years, 10 months ago
