Parsing an Mbox from an open file-like object in Python?

use*_*114 6 python python-3.7

This works:

import mailbox

x = mailbox.mbox('filename.mbox')  # works

But what if I only have an open handle to the file rather than a file name?

fp = open('filename.mbox', mode='rb')  # for example; there are many ways to get a file-like object
x = mailbox.mbox(fp)  # doesn't work

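Question: what is the best (cleanest, fastest) way to open an Mbox from a byte stream, i.e. an open binary handle, without first copying the bytes to a named file?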

Håk*_*Lid 1

You could subclass mailbox.mbox. The source code of the standard library is available on GitHub.

The logic seems to be implemented mostly in the superclass _singlefileMailbox:

class _singlefileMailbox(Mailbox):
    """A single-file mailbox."""

    def __init__(self, path, factory=None, create=True):
        """Initialize a single-file mailbox."""
        Mailbox.__init__(self, path, factory, create)
        try:
            f = open(self._path, 'rb+')
        except OSError as e:
            if e.errno == errno.ENOENT:
                if create:
                    f = open(self._path, 'wb+')
                else:
                    raise NoSuchMailboxError(self._path)
            elif e.errno in (errno.EACCES, errno.EROFS):
                f = open(self._path, 'rb')
            else:
                raise
        self._file = f
        self._toc = None
        self._next_key = 0
        self._pending = False       # No changes require rewriting the file.
        self._pending_sync = False  # No need to sync the file
        self._locked = False
        self._file_length = None    # Used to record mailbox size

So we can try to get rid of the open() logic and replace the __init__ code of mbox and its superclasses with our own.

class CustomMbox(mailbox.mbox):
    """A custom mbox mailbox from a file like object."""

    def __init__(self, fp, factory=None, create=True):
        """Initialize mbox mailbox from a file-like object."""

        # from `mailbox.mbox`
        self._message_factory = mailbox.mboxMessage

        # from `mailbox._singlefileMailbox`
        self._file = fp
        self._toc = None
        self._next_key = 0
        self._pending = False       # No changes require rewriting the file.
        self._pending_sync = False  # No need to sync the file
        self._locked = False
        self._file_length = None    # Used to record mailbox size

        # from `mailbox.Mailbox`
        self._factory = factory

    @property
    def _path(self):
        # If we try to use some functionality that relies on knowing 
        # the original path, raise an error.
        raise NotImplementedError('This class does not have a file path')

    def flush(self):
        """Write any pending changes to disk."""
        # _singlefileMailbox has quite complicated flush method.
        # Hopefully this will work fine.
        self._file.flush()

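This could be a start. But you will probably have to define additional methods to get the full functionality of the other mailbox classes.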
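As a rough check of the idea, here is a minimal read-only sketch that feeds such a subclass an in-memory io.BytesIO stream. It repeats a trimmed-down version of the class so that it runs on its own. The sample messages are made up, and the private attributes (_file, _toc and so on) are assumptions taken from the CPython mailbox source quoted above, not a public API, so they may change between Python versions.

```python
import io
import mailbox


class CustomMbox(mailbox.mbox):
    """Trimmed-down mbox that reads from an already open binary stream."""

    def __init__(self, fp, factory=None):
        # Assumption: these private attributes mirror what the CPython
        # mailbox module sets in its own __init__ methods. The parent
        # __init__ is deliberately not called, because it would try to
        # open a path on disk.
        self._message_factory = mailbox.mboxMessage
        self._factory = factory
        self._file = fp
        self._toc = None
        self._next_key = 0
        self._pending = False
        self._pending_sync = False
        self._locked = False
        self._file_length = None


# Made-up sample data: two tiny messages in mbox format.
raw = (
    b"From alice@example.com Mon Jan  1 00:00:00 2024\n"
    b"Subject: first\n"
    b"\n"
    b"Hello\n"
    b"\n"
    b"From bob@example.com Mon Jan  1 00:01:00 2024\n"
    b"Subject: second\n"
    b"\n"
    b"World\n"
)

box = CustomMbox(io.BytesIO(raw))

# Iterating parses each message lazily from the stream.
subjects = [message["Subject"] for message in box]
print(len(box), subjects)
```

Only reading is exercised here. Anything that modifies the mailbox (add, remove, lock, or the inherited flush) still expects a real path and would need further overrides.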