ISO 8859-1 filename cannot be decoded

Question

Lar*_*sky 4 python unicode mime iso latin1

I'm using a Python milter to extract files from MIME messages, and I'm running into problems with files named like this:

=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?=

I can't seem to decode this name to UTF. To work around earlier ISO-8859-1 problems, I started passing all filenames through this function:

def unicodeConvert(self, fname):
    normalized = False

    while normalized == False:
        try:
            fname  = unicodedata.normalize('NFKD', unicode(fname, 'utf-8')).encode('ascii', 'ignore')
            normalized = True
        except UnicodeDecodeError:
            fname = fname.decode('iso-8859-1')#.encode('utf-8')
            normalized = True
        except UnicodeError:
            fname = unicode(fname.content.strip(codecs.BOM_UTF8), 'utf-8')
            normalized = True
        except TypeError:
            fname = fname.encode('utf-8')

    return fname


That worked until I got this filename.

Ideas are appreciated, as always.

Answer 1

Mar*_*ers 8

Your string is encoded using the Quoted-printable format for MIME headers. The email.header module handles this for you:

>>> from email.header import decode_header
>>> try:
...     string_type = unicode  # Python 2
... except NameError:
...     string_type = str      # Python 3
...
>>> for part in decode_header('=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?='):
...     decoded = string_type(*part)
...     print(decoded)
...
Certificado_Zonificación_2010.pdf
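
The session above could be wrapped into a reusable helper. The following is only a sketch for Python 3: the function names decode_mime_filename and ascii_fold are made up for illustration, and it assumes the filename arrives as a str holding an RFC 2047 encoded word (or plain text). It relies on email.header.make_header to join the decoded parts, which also copes with parts that carry no charset.

```python
import unicodedata
from email.header import decode_header, make_header


def decode_mime_filename(raw):
    """Decode a MIME header value such as an attachment filename.

    Hypothetical helper: decode_header splits the value into
    (bytes_or_str, charset) parts and make_header reassembles them
    into a Header whose str() is the decoded text.
    """
    return str(make_header(decode_header(raw)))


def ascii_fold(name):
    """Optionally strip accents, as the question's function tries to do.

    NFKD splits accented letters into a base letter plus a combining
    mark; encoding to ASCII with errors='ignore' then drops the marks.
    """
    decomposed = unicodedata.normalize('NFKD', name)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


encoded = '=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?='
decoded = decode_mime_filename(encoded)
print(decoded)              # Certificado_Zonificación_2010.pdf
print(ascii_fold(decoded))  # Certificado_Zonificacion_2010.pdf
```

Decoding first and folding to ASCII afterwards keeps the two concerns separate, so the accented name remains available if the filesystem can store it.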
