How to decode a MIME part of a message and get a **unicode** string in Python 2.7?

gue*_*tli 4 python email unicode python-2.7

Here is a method that tries to get the HTML part of an email message:

from __future__ import absolute_import, division, unicode_literals, print_function

import email

html_mail_quoted_printable=b'''Subject: =?ISO-8859-1?Q?WG=3A_Wasenstra=DFe_84_in_32052_Hold_Stau?=
MIME-Version: 1.0
Content-type: multipart/mixed;
 Boundary="0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253"

--0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: multipart/alternative;
 Boundary="1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253"

--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: text/plain; charset=ISO-8859-1
Content-transfer-encoding: quoted-printable

Freundliche Gr=FC=DFe

--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253
Content-type: text/html; charset=ISO-8859-1
Content-Disposition: inline
Content-transfer-encoding: quoted-printable

<html><body>
Freundliche Gr=FC=DFe
</body></html>
--1__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253--

--0__=4EBBF4C4DFD012538f9e8a93df938690918c4EBBF4C4DFD01253--

'''
def get_html_part(msg):
    for part in msg.walk():
        if part.get_content_type() == 'text/html':
            return part.get_payload(decode=True)

msg=email.message_from_string(html_mail_quoted_printable)
html=get_html_part(msg)
print(type(html))
print(html)

Output:

<type 'str'>
<html><body>
Freundliche Gr??e
</body></html>

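Unfortunately I get a byte string. I want a unicode string.

According to this answer, msg.get_payload(decode=True) should do the magic. But it does not in this case.

How do I decode a MIME part of a message and get a unicode string in Python 2.7?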

bob*_*nce 7

> Unfortunately I get a byte string. I want a unicode string.

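The decode=True argument to get_payload only decodes the Content-Transfer-Encoding wrapper, i.e. the =-encoding (quoted-printable) in this message. Getting from those bytes to characters is one of the many things the email package makes you do yourself: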

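# Undo the Content-Transfer-Encoding; the result is still a byte string.
raw = part.get_payload(decode=True)
# Charset from the Content-Type header, or iso-8859-1 if none is declared.
charset = part.get_content_charset('iso-8859-1')
chars = raw.decode(charset, 'replace')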

(iso-8859-1 is the fallback used when the message does not specify an encoding.)
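Put together with the walk() loop from the question, the two steps might look like the sketch below. The message here is a shortened, single-part stand-in for the one in the question, and the names RAW_MAIL, parse, get_html_text and fallback_charset are made up for illustration. The getattr line is only there so the same snippet also runs on Python 3, where bytes input has to go through email.message_from_bytes; on Python 2.7 it falls back to email.message_from_string.

```python
from __future__ import unicode_literals, print_function

import email

# Hypothetical minimal message, modelled on the HTML part in the question.
RAW_MAIL = b'''MIME-Version: 1.0
Content-type: text/html; charset=ISO-8859-1
Content-transfer-encoding: quoted-printable

<html><body>
Freundliche Gr=FC=DFe
</body></html>
'''

# Python 3 offers message_from_bytes; Python 2.7 only has
# message_from_string, which accepts byte strings there.
parse = getattr(email, 'message_from_bytes', email.message_from_string)


def get_html_text(msg, fallback_charset='iso-8859-1'):
    """Return the first text/html part as a unicode string, or None."""
    for part in msg.walk():
        if part.get_content_type() == 'text/html':
            # Step 1: undo the Content-Transfer-Encoding (still bytes).
            raw = part.get_payload(decode=True)
            # Step 2: decode the bytes using the declared charset,
            # or the fallback if the part does not declare one.
            charset = part.get_content_charset(fallback_charset)
            return raw.decode(charset, 'replace')
    return None


html = get_html_text(parse(RAW_MAIL))
print(type(html))
print(html)
```

Note that the 'replace' error handler only covers undecodable bytes; a charset name that Python does not know would still raise LookupError in decode, so real-world mail may need a guard around that call.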