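I am using this function to parse emails. I can parse "simple" multipart emails, but when an email defines multiple boundaries (sub-parts) it raises an error (UnboundLocalError: local variable 'html' referenced before assignment). I want the script to separate the text and HTML parts and return only the HTML part (or the text part if there is no HTML part).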
def get_text(msg):
    text = ""
    if msg.is_multipart():
        for part in msg.get_payload():
            if part.get_content_charset() is None:
                charset = chardet.detect(str(part))['encoding']
            else:
                charset = part.get_content_charset()
            if part.get_content_type() == 'text/plain':
                text = unicode(part.get_payload(decode=True),str(charset),"ignore").encode('utf8','replace')
            if part.get_content_type() == 'text/html':
                html = unicode(part.get_payload(decode=True),str(charset),"ignore").encode('utf8','replace')
        if html is None:
            return text.strip()
        else:
            return html.strip()
    else:
        text = unicode(msg.get_payload(decode=True),msg.get_content_charset(),'ignore').encode('utf8','replace')
        return text.strip()
As the comment said, you always check html, but you only assign it in one specific case. That is what the error is telling you: you reference html before assigning it. In Python, you cannot test whether a name is None if it has never been assigned. For example, open the Python interactive prompt:
>>> if y is None:
...     print 'none'
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'y' is not defined
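Inside a function the same mistake is reported as UnboundLocalError (a subclass of NameError), which is the error from the question. Here is a minimal sketch of the failure and of the usual fix. It is written for Python 3, and the names pick_unsafe and pick_safe and the (content_type, body) tuples are made up for illustration; they are not part of the original code.

```python
def pick_unsafe(parts):
    # 'html' is bound only if at least one part is HTML.
    for content_type, body in parts:
        if content_type == 'text/html':
            html = body
    # Raises UnboundLocalError when no HTML part was seen.
    return html


def pick_safe(parts):
    # Bind both names up front so the later check is always valid.
    text = ""
    html = None
    for content_type, body in parts:
        if content_type == 'text/plain':
            text = body
        elif content_type == 'text/html':
            html = body
    # Prefer HTML, fall back to plain text.
    return text if html is None else html
```

Calling pick_unsafe with only a plain-text part fails, while pick_safe returns the text for the same input.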
As you can see, you cannot simply compare a name against None to find out whether the variable exists. Back to your specific case.
You need to set html to None first, and then check later whether it is still None. That is, edit your code like this:
import chardet  # third-party package, used to guess a missing charset

def get_text(msg):
    text = ""
    if msg.is_multipart():
        # Bind html before the loop so the check below is always valid.
        html = None
        for part in msg.get_payload():
            if part.get_content_charset() is None:
                charset = chardet.detect(str(part))['encoding']
            else:
                charset = part.get_content_charset()
            if part.get_content_type() == 'text/plain':
                text = unicode(part.get_payload(decode=True),str(charset),"ignore").encode('utf8','replace')
            if part.get_content_type() == 'text/html':
                html = unicode(part.get_payload(decode=True),str(charset),"ignore").encode('utf8','replace')
        # Prefer the HTML part; fall back to plain text if none was found.
        if html is None:
            return text.strip()
        else:
            return html.strip()
    else:
        text = unicode(msg.get_payload(decode=True),msg.get_content_charset(),'ignore').encode('utf8','replace')
        return text.strip()
This explains it a bit more: http://code.activestate.com/recipes/59892-testing-if-a-variable-is-defined/
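Note that initializing html only removes the exception. A loop over msg.get_payload() still visits just the top-level parts, so a body nested inside a sub-part (for example multipart/alternative inside multipart/mixed) would be skipped. One possible way to reach nested parts is the standard library's Message.walk(). The sketch below is a Python 3 variant and not a drop-in replacement for the Python 2 code above: the name get_body is hypothetical, the utf-8 fallback for a missing or unknown charset is an assumption (it stands in for the chardet guess), and it does not try to tell attachments from bodies.

```python
import email


def get_body(msg):
    """Return the first HTML body if there is one, otherwise the first plain-text body."""
    text = None
    html = None
    # walk() yields the message itself and every nested sub-part, depth first.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts, not a body
        content_type = part.get_content_type()
        if content_type not in ('text/plain', 'text/html'):
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        if payload is None:
            continue
        charset = part.get_content_charset() or 'utf-8'  # assumed fallback
        try:
            body = payload.decode(charset, 'replace')
        except LookupError:
            # The declared charset is unknown to Python; fall back to utf-8.
            body = payload.decode('utf-8', 'replace')
        if content_type == 'text/plain' and text is None:
            text = body
        elif content_type == 'text/html' and html is None:
            html = body
    chosen = html if html is not None else (text or "")
    return chosen.strip()
```

With a message parsed by email.message_from_string(raw), get_body(msg) returns the HTML part when one exists anywhere in the tree, and the plain-text part otherwise.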