iloc giving 'IndexError: single positional indexer is out-of-bounds'

Tay*_*lrl 34 python machine-learning

I am trying to encode some information to read into a machine learning model using the following:

import numpy as np
import pandas as pd
import matplotlib.pyplot as py

Dataset = pd.read_csv('filename.csv', sep = ',')

X = Dataset.iloc[:,:-1].values
Y = Dataset.iloc[:,18].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()

However, I am getting an error:

runfile('C:/Users/name/Desktop/Machine Learning/Data preprocessing      template.py', wdir='C:/Users/taylorr2/Desktop/Machine Learning')
Traceback (most recent call last):

  File "<ipython-input-141-a5d1cd02c2df>", line 1, in <module>
    runfile('C:/Users/name/Desktop/Machine Learning/Data preprocessing  template.py', wdir='C:/Users/taylorr2/Desktop/Machine Learning')

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "C:/Users/name/Desktop/Machine Learning/Data preprocessing template.py", line 8, in <module>
    Y = Dataset.iloc[:,18].values

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1310, in __getitem__
    return self._getitem_tuple(key)

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1560, in _getitem_tuple
    self._has_valid_tuple(tup)

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 151, in _has_valid_tuple
    if not self._has_valid_type(k, i):

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1528, in _has_valid_type
    return self._is_valid_integer(key, axis)

  File "C:\Users\name\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1542, in _is_valid_integer
    raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds

I read a question on here about the same error and tried the following:

import numpy as np
import pandas as pd
import matplotlib.pyplot as py

Dataset = pd.read_csv('filename.csv', sep = ',')

table = Dataset.find(id='AlerId')
rows = table.find_all('tr')[1:]
data = [[cell.text for cell in row.find_all('td')] for row in rows]
Dataset1 = pd.DataFrame(data=data, columns=columns)

X = Dataset1.iloc[:,:-1].values
Y = Dataset1.iloc[:,18].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()

However, I think this may have just confused me more, and now I am in even more of a state.

Any suggestions?

slo*_*tam 46

This error is caused by:

Y = Dataset.iloc[:,18].values

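The index is most likely out of bounds here because the dataset has fewer than 19 columns, so the column at position 18 does not exist. The rest of the code you posted does not use Y at all, so you can simply comment out this line for now.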
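
One way to confirm this is to look at the shape of the frame before indexing by position. The following is only a sketch: the small frame stands in for filename.csv, its column names are made up, and it assumes the target is the last column, in which case -1 avoids hard-coding a position such as 18.

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question: 3 columns, not 19.
dataset = pd.DataFrame({
    "country": ["France", "Spain", "Germany"],
    "age": [44, 27, 30],
    "purchased": ["No", "Yes", "No"],
})

# shape is (rows, columns); valid column positions are 0 .. n_cols - 1.
n_rows, n_cols = dataset.shape
print(n_rows, n_cols)

# Assumption: the target is the last column. Position -1 always refers
# to it, however many columns the file turns out to have.
X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, -1].values
```

If the target really is meant to be a specific column, selecting it by name (for example dataset["purchased"] in this made-up frame) is usually less fragile than a fixed position.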


Nic*_*ais 18

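This error occurs when you index a row or column with a number that is greater than the dimensions of your dataframe. For instance, trying to get the eleventh column when you have only three columns.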

import pandas as pd

df = pd.DataFrame({'Name': ['Mark', 'Laura', 'Adam', 'Roger', 'Anna'],
                   'City': ['Lisbon', 'Montreal', 'Lisbon', 'Berlin', 'Glasgow'],
                   'Car': ['Tesla', 'Audi', 'Porsche', 'Ford', 'Honda']})

You have 5 rows and 3 columns:

    Name      City      Car
0   Mark    Lisbon    Tesla
1  Laura  Montreal     Audi
2   Adam    Lisbon  Porsche
3  Roger    Berlin     Ford
4   Anna   Glasgow    Honda

Let's try to index the eleventh column (it doesn't exist):

df.iloc[:, 10] # there is obviously no 11th column

IndexError: single positional indexer is out-of-bounds

If you are a beginner with Python, remember that positions are zero-based, so df.iloc[:, 10] refers to the eleventh column.
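
The zero-based rule can be checked with a shortened version of the same frame. This is a minimal sketch: it also shows that a slice which runs past the last column is truncated silently, so only a single integer position raises the error.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Mark', 'Laura'],
                   'City': ['Lisbon', 'Montreal'],
                   'Car': ['Tesla', 'Audi']})

# With 3 columns, the valid positions are 0, 1 and 2.
third_column = df.iloc[:, 2]   # the 'Car' column

# Position 3 would be a fourth column, which does not exist.
try:
    df.iloc[:, 3]
    raised = False
except IndexError:
    raised = True

# A slice is clipped to the columns that exist instead of raising.
sliced = df.iloc[:, 1:10]      # just 'City' and 'Car'
```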


que*_*o42 5

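This does not help with the question here, but it may help anyone who comes here for the error rather than for this example. I got IndexError: single positional indexer is out-of-bounds when I tried to find a row in dataframe2 while looping through the rows of dataframe1, using many conditions in the dataframe2 filter, and adding each row found to a new, empty dataframe3 (do not ask me why!). One of the values in that row was a "nan" value in both dataframe1 and dataframe2. I could neither filter on it nor add the new row.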

Solution:

# fillna returns a new DataFrame, so assign the result back
dataframe1 = dataframe1.fillna("nan")  # or whatever you want as a fill value
dataframe2 = dataframe2.fillna("nan")

After that, the script ran without errors.
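
The answer does not show its data, so the following is only one plausible reconstruction of what happened. The frames and the key and value column names are hypothetical. The idea is that NaN never compares equal to NaN, so a filter on a NaN value matches no rows, and taking .iloc[0] of the empty result raises the error.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with a missing value in the column used for matching.
dataframe1 = pd.DataFrame({"key": ["a", np.nan]})
dataframe2 = pd.DataFrame({"key": ["a", np.nan], "value": [1, 2]})

# Filtering on a NaN value matches nothing, because NaN != NaN.
value = dataframe1["key"].iloc[1]
no_matches = dataframe2[dataframe2["key"] == value]

# Taking the first row of an empty result raises the IndexError.
try:
    no_matches.iloc[0]
    raised = False
except IndexError:
    raised = True

# After filling the gaps with an ordinary string, the same filter works.
dataframe1 = dataframe1.fillna("nan")
dataframe2 = dataframe2.fillna("nan")

value = dataframe1["key"].iloc[1]
matches = dataframe2[dataframe2["key"] == value]
first_match = matches.iloc[0]
```

Checking whether the filtered frame is empty before calling .iloc[0] is another way to avoid the error without changing the data.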