Catching an exception raises UnboundLocalError

Pau*_*ang 6 python exception

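I wrote a crawler to fetch information from a Q&A website. Since not every field is always present on a page, I used several try-except blocks to handle the missing cases.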

def answerContentExtractor( loginSession, questionLinkQueue , answerContentList) :
    while True:
        URL = questionLinkQueue.get()
        try:
            response   = loginSession.get(URL,timeout = MAX_WAIT_TIME)
            raw_data   = response.text

            #These fields must exist, or something went wrong...
            questionId = re.findall(REGEX,raw_data)[0]
            answerId   = re.findall(REGEX,raw_data)[0]
            title      = re.findall(REGEX,raw_data)[0]

        except requests.exceptions.Timeout ,IndexError:
            print >> sys.stderr, URL + " extraction error..."
            questionLinkQueue.task_done()
            continue

        try:
            questionInfo = re.findall(REGEX,raw_data)[0]
        except IndexError:
            questionInfo = ""

        try:
            answerContent = re.findall(REGEX,raw_data)[0]
        except IndexError:
            answerContent = ""

        result = {
                  'questionId'   : questionId,
                  'answerId'     : answerId,
                  'title'        : title,
                  'questionInfo' : questionInfo,
                  'answerContent': answerContent
                  }

        answerContentList.append(result)
        questionLinkQueue.task_done()

This code intermittently raises the following exception at runtime:

UnboundLocalError: local variable 'IndexError' referenced before assignment

The line number in the traceback points to the second `except IndexError:`.

Thanks for all the suggestions. I would like to give each of you the credit you deserve; too bad I can only mark one answer as accepted...

Ash*_*ary 6

I think the problem is this line:

except requests.exceptions.Timeout ,IndexError

This is equivalent to:

except requests.exceptions.Timeout  as IndexError:

So you are binding the caught requests.exceptions.Timeout exception to the name IndexError. This code shows the rebinding:

try:
    true
except NameError, IndexError:
    print IndexError
    #name 'true' is not defined
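
For readers on Python 3, where the comma form is a syntax error, the same rebinding can be sketched with the explicit `as` spelling. The function name `demo` and the deliberately undefined `undefined_name` are illustrative only and not part of the original question.

```python
def demo():
    try:
        undefined_name  # deliberately undefined, so this raises NameError
    except NameError as IndexError:
        # Inside this handler, IndexError no longer refers to the built-in
        # exception class; it is the caught NameError instance.
        return type(IndexError).__name__


print(demo())
```

The call reports the type of the object bound to `IndexError`, which is `NameError`, not the built-in `IndexError` class.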

To catch multiple exceptions, use a tuple:

except (requests.exceptions.Timeout, IndexError):
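
A minimal, self-contained sketch of the tuple form in Python 3 syntax follows. The helper `first_match` and its `default` parameter are hypothetical, chosen only to mirror the `re.findall(...)[0]` pattern in the question; they are not taken from the original code.

```python
import re


def first_match(pattern, text, default=""):
    """Return the first regex match in text, or default if there is none."""
    try:
        return re.findall(pattern, text)[0]
    except (IndexError, TypeError):
        # IndexError: findall returned an empty list.
        # TypeError: text was not a string (for example, None).
        # The parenthesized tuple catches either one without rebinding a name.
        return default


print(first_match(r"\d+", "id=42"))
print(repr(first_match(r"\d+", "no digits here")))
```

Because both exception classes are only read here, never assigned, `IndexError` stays the built-in everywhere else in the function.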

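As for the UnboundLocalError: because IndexError is assigned to inside your function, Python treats it as a local variable there, so reading its value before it has actually been assigned raises UnboundLocalError.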

>>> 'IndexError' in answerContentExtractor.func_code.co_varnames
True
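
The same check can be sketched in Python 3, where the code object is reached through `__code__` rather than `func_code`. The two functions below are illustrative stand-ins, not the original `answerContentExtractor`.

```python
def shadowing():
    try:
        pass
    except NameError as IndexError:  # binds a local variable named IndexError
        pass


def tuple_form():
    try:
        pass
    except (NameError, IndexError):  # only reads the built-in names
        pass


# co_varnames lists the names the compiler treats as local to each function.
print("IndexError" in shadowing.__code__.co_varnames)
print("IndexError" in tuple_form.__code__.co_varnames)
```

Only the first function lists `IndexError` among its locals; the tuple form leaves it as a global/built-in lookup.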

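So if the handler on the line `except requests.exceptions.Timeout ,IndexError` is never executed at runtime, the IndexError name used further down is still unbound, and referencing it raises UnboundLocalError. Sample code that reproduces the error: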

def func():
    try:
        print
    except NameError, IndexError:
        pass
    try:
        [][1]
    except IndexError:
        pass
func()
#UnboundLocalError: local variable 'IndexError' referenced before assignment
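
As a rough Python 3 counterpart to the reproduction above, the sketch below puts the shadowing version next to the tuple-based fix. The function names are made up for illustration, and the exact wording of the error message differs between Python versions.

```python
def shadowed():
    try:
        pass  # nothing raises, so the handler below never runs
    except NameError as IndexError:
        pass
    try:
        [][1]  # raises the built-in IndexError
    except IndexError:  # looks up the unbound local, not the built-in
        return "handled"


def fixed():
    try:
        pass
    except (NameError, IndexError):
        pass
    try:
        [][1]
    except IndexError:  # resolves to the built-in IndexError
        return "handled"


try:
    shadowed()
except UnboundLocalError as exc:
    print("shadowed() failed:", type(exc).__name__)

print("fixed() returned:", fixed())
```

In `shadowed`, the failure surfaces at the second `except` clause, which matches the line number reported in the question.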