变量: IndexError:只有整数、切片 (`:`)、省略号 (`...`)、numpy.newaxis (`None`) 和整数或布尔数组是有效索引

Gio*_*nio 1 python numpy pandas scikit-learn

我正在研究支持向量回归,并尝试在训练和测试中分割数据集。当我运行我的模型时,我收到此错误:

    ---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-46-c24eb12af231> in <module>
     20 
     21 # Identify min and max values for input variables
---> 22 x_min, x_max = X_test_scaled['synth_index'].min(), X_test_scaled['synth_index'].max()
     23 y_min, y_max = X_test_scaled['ln_GDP_level'].min(), X_test_scaled['ln_GDP_level'].max()
     24 

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
Run Code Online (Sandbox Code Playgroud)

我只是尝试使用以下方法转换变量:

df['synth_index'].astype(np.int64)
df['ln_GDP_level'].astype(np.int64)
Run Code Online (Sandbox Code Playgroud)

但错误仍然存​​在。

我该如何解决?这是一段给我带来错误的代码(我与之前代码的作者交谈过,发现有一段代码丢失了!)

X=df[['synth_index','ln_GDP_level']]
y=df['median_rating'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42,
                                                   test_size=0.30)
scaler = preprocessing.StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

model1 = LinearRegression()
model2 = SVR(kernel='rbf', C=100, epsilon=1)

# Fit the two models 
lr = model1.fit(X_test_scaled, y_test)
svr = model2.fit(X_test_scaled, y_test)

# ----------- For creating a prediction plane to be used in the visualization -----------
# Set Increments between points in a meshgrid
mesh_size = 1

# Identify min and max values for input variables
x_min, x_max = X_test_scaled['synth_index'].min(), X_test_scaled['synth_index'].max()
y_min, y_max = X_test_scaled['ln_GDP_level'].min(), X_test_scaled['ln_GDP_level'].max()

# Return evenly spaced values based on a range between min and max
xrange = np.arange(x_min, x_max, mesh_size)
yrange = np.arange(y_min, y_max, mesh_size)

# Create a meshgrid
xx, yy = np.meshgrid(xrange, yrange)

# ----------- Create a prediction plane  -----------
# Use models to create a prediction plane --- Linear Regression
pred_LR = model1.predict(np.c_[xx.ravel(), yy.ravel()])
pred_LR = pred_LR.reshape(xx.shape)

# Use models to create a prediction plane --- SVR
pred_svr = model2.predict(np.c_[xx.ravel(), yy.ravel()])
pred_svr = pred_svr.reshape(xx.shape)
Run Code Online (Sandbox Code Playgroud)

使用以下代码读取数据库(Pandas):

datafile = (r'C:\Users\gpont\PycharmProjects\pythonProject2\data\Map\databaseCDP0.csv')

df = pd.read_csv(datafile, skiprows = 0, sep=';')

df

df['median_rating'] = df['median_rating'].astype(float)
print()
  
# let's find out the data type after changing
print(df.dtypes)
  
# print dataframe. 
df
Run Code Online (Sandbox Code Playgroud)

数据库:

在此输入图像描述

数据库类型:

在此输入图像描述

小智 5

scaler.transform 的输出是一个 numpy 数组,在您的情况下它是一个二维数组,而不再是 pandas 数据框。因此,您无法使用“synth_index”作为键来访问“synth_index”列。

使用下面的代码来解决您的问题。

# Identify min and max values for input variables
x_min, x_max = X_test_scaled[:, 0].min(), X_test_scaled[:, 0].max()
y_min, y_max = X_test_scaled[:, 1].min(), X_test_scaled[:, 1].max()
Run Code Online (Sandbox Code Playgroud)