mutt 侧边栏中的希腊字符（邮箱名称来自 offlineimap）

Question

mutt 侧边栏中的希腊字符（邮箱名称来自 offlineimap）

Nik*_*ris 6 character-encoding mutt imap

Mutt的侧边栏无法正确读取希腊字符（使用希腊字符命名的邮箱）。无论如何，在 index 和 pager内部都没有这样的问题，其中希腊字符/单词/名称看起来很好。

在此处输入图片说明

吉尔斯评论后更新

有问题的设置，用于两个不同的系统（工作站和笔记本电脑，都运行 Funtoo 和 GNU bash，版本 4.2.45(1)-release），包括用于从/向 IMAP 服务器读取/写入电子邮件的 mutt . 的回应locale是

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=POSIX
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Run Code Online (Sandbox Code Playgroud)

并且在任何 mutt 的配置文件中都没有设置语言环境变量。

消息实际上是通过offlineimap同步并使用postfix发送的。offlineimap，将邮箱（名称）列表记录在一个文件（即名为的文件mailboxes）中，该文件源自 mutt 的配置文件之一（通过指示source ~/.mutt/mailboxes）。查看mailboxes文件的内容，发现希腊名字已经被“误解”了。

在任何情况下，希腊名字通过 webmail 客户端（RoundCube）看起来很好，它也访问相同的 IMAP 服务器，因此，访问可疑邮箱。

问题

为什么会发生这种情况？
是offlineimap配置错误的问题吗？
如何解决？

剩下的问题？[2015 年 3 月]（另见下面的回答）

但是，本地存储库文件夹名称（目录名称）仍处于不可读状态。即，上面显示的希腊文件夹名称（?????????）实际上是目录&A6UDwAO,A8QDwQO,A8YDrwOx-. 这是否意味着文件夹名称转换发生在文件夹名称和消息同步之后，而不是在同步过程中？ 或者，是否需要删除这些目录（从本地存储库中）并通过 offlineimap 强制进行另一次同步（注意不要让相应的邮箱文件夹从远程存储库中删除）？

Answer 1

Nik*_*ris 5

总之，

“问题”源于这样一个事实，即IMAP4 使用修改后的 UTF-7 编码对文件夹名称进行编码。在创建本地存储库（例如，UTF-8）之前，offlineimap不会将文件夹名称转换为可读的名称。反过来，这会派生出不可读的文件夹名称，例如此问题的屏幕截图中显示的文件夹名称。因此，这既不是Mutt也不是offlineimap本身的错误处理或错误配置。

在以下博客文章和 git 存储库中详细讨论并解决了该问题：

解决方案

本质上，一个有助于导出可读文件夹名称的python 脚本（下面提供）被输入到 offlineimap 的配置文件中（即offlineimaprc，如OfflineIMAP 手册中所述）。此外，用于正确文件夹名称转换的指导性代码行（使用 python 脚本中定义的函数），即，

# Name translation from UTF7 to UTF8
nametrans = lambda foldername: foldername.decode('imap4-utf-7').encode('utf-8')

Run Code Online (Sandbox Code Playgroud)

在离线地图的配置文件中添加了远程存储库选项的部分下。

更新，（2015 年 4 月）

反向操作需要另一条规则，请参阅文件夹过滤和名称转换。此说明可能类似于

# Name translation, reverse!
nametrans = lambda foldername: foldername.decode('utf-8').encode('imap4-utf-7')

Run Code Online (Sandbox Code Playgroud)

在mailboxes由 offlineimap 创建的文件中，希腊名称正确显示。这解决了 mutt 中的问题，文件夹名称显示为意图（在这种情况下，希腊名称）。在此处输入图片说明

剩下的问题？

但是，本地存储库文件夹名称（目录名称）仍处于不可读状态。即，上面显示的希腊文件夹名称（?????????）实际上是目录&A6UDwAO,A8QDwQO,A8YDrwOx-. 这是否意味着文件夹名称转换发生在文件夹名称和消息同步之后，而不是在同步过程中？ 或者，是否需要删除这些目录（从本地存储库中）并通过 offlineimap 强制进行另一次同步（注意不要让相应的邮箱文件夹从远程存储库中删除）？

处理国际邮箱名称的 Python 脚本（IMAP、UTF-7）：

# vim:fileencoding=utf-8

r"""
Imap folder names are encoded using a special version of utf-7 as defined in RFC
2060 section 5.1.3.

From: http://piao-tech.blogspot.com/2010/03/get-offlineimap-working-with-non-ascii.html

5.1.3.  Mailbox International Naming Convention

   By convention, international mailbox names are specified using a
   modified version of the UTF-7 encoding described in [UTF-7].  The
   purpose of these modifications is to correct the following problems
   with UTF-7:

    1) UTF-7 uses the "+" character for shifting; this conflicts with
     the common use of "+" in mailbox names, in particular USENET
     newsgroup names.

    2) UTF-7's encoding is BASE64 which uses the "/" character; this
     conflicts with the use of "/" as a popular hierarchy delimiter.

    3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
     the use of "\" as a popular hierarchy delimiter.

    4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
     the use of "~" in some servers as a home directory indicator.

    5) UTF-7 permits multiple alternate forms to represent the same
     string; in particular, printable US-ASCII chararacters can be
     represented in encoded form.

   In modified UTF-7, printable US-ASCII characters except for "&"
   represent themselves; that is, characters with octet values 0x20-0x25
   and 0x27-0x7e.  The character "&" (0x26) is represented by the two-
   octet sequence "&-".

   All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all
   Unicode 16-bit octets) are represented in modified BASE64, with a
   further modification from [UTF-7] that "," is used instead of "/".
   Modified BASE64 MUST NOT be used to represent any printing US-ASCII
   character which can represent itself.

   "&" is used to shift to modified BASE64 and "-" to shift back to US-
   ASCII.  All names start in US-ASCII, and MUST end in US-ASCII (that
   is, a name that ends with a Unicode 16-bit octet MUST end with a "-
   ").

    For example, here is a mailbox name which mixes English, Japanese,
    and Chinese text: ~peter/mail/&ZeVnLIqe-/&U,BTFw-

"""

import binascii
import codecs

# encoding
def modified_base64(s):
  s = s.encode('utf-16be')
  return binascii.b2a_base64(s).rstrip(b'\n=').replace(b'/', b',').decode('ascii')

def doB64(_in, r):
  if _in:
    r.append('&%s-' % modified_base64(''.join(_in)))
    del _in[:]

def encoder(s):
  r = []
  _in = []
  for c in s:
    ordC = ord(c)
    if 0x20 <= ordC <= 0x25 or 0x27 <= ordC <= 0x7e:
      doB64(_in, r)
      r.append(c)
    elif c == '&':
      doB64(_in, r)
      r.append('&-')
    else:
      _in.append(c)
  doB64(_in, r)
  return (''.join(r).encode('ascii'), len(s))

# decoding
def modified_unbase64(s):
  b = binascii.a2b_base64(s.replace(b',', b'/') + b'===')
  return str(b, 'utf-16be')

def decoder(s):
  r = []
  decode = []
  for c in s:
    if c == b'&' and not decode:
      decode.append(b'&')
    elif c == b'-' and decode:
      if len(decode) == 1:
        r.append('&')
      else:
        r.append(modified_unbase64(b''.join(decode[1:])))
      decode = []
    elif decode:
      decode.append(c)
    else:
      r.append(c.decode('ascii'))
  if decode:
    r.append(modified_unbase64(b''.join(decode[1:])))
  bin_str = ''.join(r)
  return (bin_str, len(s))

class StreamReader(codecs.StreamReader):
  def decode(self, s, errors='strict'):
    return decoder(s)

class StreamWriter(codecs.StreamWriter):
  def decode(self, s, errors='strict'):
    return encoder(s)

def imap4_utf_7(name):
  if name == 'imap4-utf-7':
    return (encoder, decoder, StreamReader, StreamWriter)

codecs.register(imap4_utf_7)

Run Code Online (Sandbox Code Playgroud)

归档时间：	11 年，2 月前
查看次数：	1196 次
最近记录：	8 年，3 月前