AttributeError: 'Series' object has no attribute 'items'

Car*_*a S 2 python series attributeerror python-2.7 pandas

I am trying to use a script written by a colleague.

This part of the script works fine:

xl = pd.ExcelFile(path + WQ_file)
sheet_names = xl.sheet_names

df = pd.read_excel(path + WQ_file, sheetname = 'Chemistry Output Table', skiprows = [0,1,2,4,5,6,7], 
               index_col = [0,1], na_values = ['', 'na', '-'])
df.index.names = ['Field_ID', 'Date_Time']

header = pd.read_excel(path + WQ_file, sheetname = 'header data',  
               index_col = [0], na_values = ['', 'na', ' - '])
header_dict = {ah: header['name_short'].loc[ah] for ah in header.index}

analytes_excel = pd.read_excel(path + WQ_file, sheetname = 'analytes', columns = 'name')
analytes_list = [item for sublist in analytes_excel.values.tolist() for item in sublist]
analytes = [header['name_short'].loc[x] for x in analytes_list]    

But this part does not:

# Clean up the data and report "less than" as half of the LOR
df2 = df.copy()
for col in df2.columns:
    x = []
    for (a, b) in df2[col].items():
        if b == " - ":
            b = np.nan
        try:
            b = float(b)
        except:
            b = float(b.strip('< '))/2
        x.append(b)
    df2[col] = x

I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-80ad8c096fc0> in <module>()
  4 for col in df2.columns:
  5     x = []
 ----> 6     for (a, b) in df2[col].items():
  7         if b == " - ":
  8             b = np.nan

 C:\Users\SardellaC\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\generic.pyc in __getattr__(self, name)
 1938 
 1939         if name in self._internal_names_set:
-> 1940             return object.__getattribute__(self, name)
 1941         elif name in self._metadata:
 1942             return object.__getattribute__(self, name)

 AttributeError: 'Series' object has no attribute 'items'

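It may be related to a different version of Python being used. I am not very familiar with Python, and I would appreciate it if someone could point me in the right direction.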

Kat*_*mar 6

Use iteritems() instead of items() when iterating over a pandas Series:

for (a, b) in df2[col].iteritems():
    x = []
    ....
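A note on versions: the error in the question comes from an old pandas release that lacks Series.items(). In recent releases the situation is reversed: iteritems() was deprecated in pandas 1.5 and removed in 2.0, and items() is the supported spelling. If the same script has to run on both, a small helper along these lines might work. This is only a sketch; the name iter_pairs and the sample data are made up for illustration.

```python
import pandas as pd

def iter_pairs(series):
    """Return an iterator of (index, value) pairs on old and new pandas."""
    # Prefer items(); fall back to iteritems() on versions where items() is missing.
    method = getattr(series, "items", None) or getattr(series, "iteritems")
    return method()

s = pd.Series(["1.5", "<0.2"], index=["a", "b"])  # hypothetical sample data
pairs = list(iter_pairs(s))
```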

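However, iterating over every row is very slow for large datasets. You can rewrite that part of the code more simply with the .apply() function. Let me know if you need help simplifying the code.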
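For example, the cleanup loop from the question could be expressed with .apply() roughly as follows. This is a minimal sketch, not the colleague's actual script: the helper name clean_value and the sample DataFrame are hypothetical, and it assumes Python 3 (on Python 2.7, strings read from Excel may be unicode rather than str).

```python
import numpy as np
import pandas as pd

def clean_value(value):
    """Convert one cell to float, reporting '<x' as x / 2 (half the LOR)."""
    if isinstance(value, str):
        text = value.strip()
        if text == "-":                 # placeholder for "no value"
            return np.nan
        if text.startswith("<"):        # "less than" result
            return float(text.lstrip("< ")) / 2
        return float(text)
    return float(value)

# Hypothetical sample data standing in for the chemistry table.
df = pd.DataFrame({"Ca": ["12.5", "<0.2", " - "], "Mg": [3, "< 1", "4.1"]})

# Apply the helper to every cell, one column at a time.
df2 = df.apply(lambda col: col.apply(clean_value))
```

This avoids the explicit loop and the items()/iteritems() difference entirely, although the helper is still called once per cell.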