How to get the WYSIWYG body of an email using MimeKit

Question

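I was using a library called EAGetMail to retrieve the body of a specified email, and it worked well, but I have now switched to MailKit. The problem is that in EAGetMail the equivalent of message.body returns the body as the user sees it in an email client, whereas in MailKit it returns a lot of different data.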

Here is the relevant code:

using (var client = new ImapClient())
{
    client.Connect(emailServer, 993, true);
    client.AuthenticationMechanisms.Remove("XOAUTH2");
    client.Authenticate(username, password);
    var inbox = client.Inbox;
    inbox.Open(FolderAccess.ReadOnly);
    SearchQuery query;
    if (checkBox.IsChecked == false)
    {
        query = SearchQuery.DeliveredBefore((DateTime)dateEnd).And(
            SearchQuery.DeliveredAfter((DateTime)dateStart)).And(
            SearchQuery.SubjectContains("Subject to find"));
    }
    else
    {
        query = SearchQuery.SubjectContains("Subject to find");
    }
    foreach (var uid in inbox.Search(query))
    {
        var message = inbox.GetMessage(uid);
        formEmails.Add(message.TextBody);
        messageDate.Add(message.Date.LocalDateTime);
    }
    client.Disconnect(true);
}

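I also tried message.Body.ToString() and searching the message parts for plain text, but neither worked. My question is: how do I replicate the effect of EAGetMail's .body property using MailKit, that is, return the body content as plain text, as the user sees it?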

Answer 1

jst*_*ast 12

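A common misunderstanding about email is that there is a well-defined message body followed by a list of attachments. That is not really the case. In reality, MIME is a tree structure of content, much like a file system.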

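Luckily, MIME does define a set of general rules for how mail clients should interpret this tree of MIME parts. The Content-Disposition header is meant to give the receiving client hints about which parts are meant to be displayed as part of the message body and which are meant to be interpreted as attachments.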

The Content-Disposition header will generally have one of two values: inline or attachment.

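The meaning of these values should be fairly obvious. If the value is attachment, then the content of that MIME part is meant to be presented as a file attachment separate from the core message. If the value is inline, then the content of that MIME part is meant to be displayed inline within the mail client's rendering of the core message body. If the Content-Disposition header does not exist, the part should be treated as if the value were inline.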

Technically, then, every part that lacks a Content-Disposition header or is marked as inline is part of the core message body.
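
As a rough illustration of that rule, the sketch below walks the leaf parts of a message and classifies each one by its Content-Disposition. The class name DispositionSketch and its method names are made up for this example; ContentDisposition, IsAttachment and BodyParts are MimeKit members. It only shows the header-based rule, not a full rendering strategy.

```csharp
using System;
using MimeKit;

static class DispositionSketch
{
    // A part counts as body content when it has no Content-Disposition header
    // or when the header's value is anything other than "attachment".
    public static bool IsBodyCandidate (MimeEntity entity)
    {
        return entity.ContentDisposition == null || !entity.ContentDisposition.IsAttachment;
    }

    // Prints one line per leaf part: its MIME type, its disposition, and the verdict.
    public static void Describe (MimeMessage message)
    {
        // BodyParts enumerates the non-multipart entities in the MIME tree.
        foreach (var part in message.BodyParts) {
            string disposition = part.ContentDisposition != null
                ? part.ContentDisposition.Disposition
                : "(none, treated as inline)";

            Console.WriteLine ("{0}: {1} -> {2}",
                part.ContentType.MimeType,
                disposition,
                IsBodyCandidate (part) ? "body" : "attachment");
        }
    }
}
```

MimeKit also exposes this check directly as the IsAttachment property on MimeEntity, which is what the visitor further down refers to.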

There is a bit more to it than that, though.

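Modern MIME messages will often contain a multipart/alternative MIME container, which will generally contain a text/plain and a text/html version of the text that the sender wrote. The text/html version is typically formatted much closer to what the sender saw in his or her WYSIWYG editor than the text/plain version.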

The reason for sending the message text in both formats is that not all mail clients are capable of displaying HTML.

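The receiving client should display only one of the alternative views contained within the multipart/alternative container. Since the alternative views are listed in order from least faithful to most faithful to what the sender saw in his or her WYSIWYG editor, the receiving client should walk the list of alternative views starting at the end and work backwards until it finds a part that it is capable of displaying.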

Example:

multipart/alternative
  text/plain
  text/html

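As seen in the example above, the text/html part is listed last because it is the most faithful to what the sender saw in his or her WYSIWYG editor when writing the message.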
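
The backwards walk can be sketched as follows. AlternativeSketch, PickBest and the canDisplayHtml flag are hypothetical names used only for this illustration; MultipartAlternative and TextPart are MimeKit types. The sketch looks only at direct text children and does not descend into nested containers such as multipart/related.

```csharp
using MimeKit;

static class AlternativeSketch
{
    // Walks the alternatives from last (most faithful) to first (least faithful)
    // and returns the first text part the caller says it can display, or null.
    public static TextPart PickBest (MultipartAlternative alternative, bool canDisplayHtml)
    {
        for (int i = alternative.Count - 1; i >= 0; i--) {
            var text = alternative[i] as TextPart;

            if (text == null)
                continue; // not a text part (for example a nested multipart)

            if (text.IsHtml && !canDisplayHtml)
                continue; // this client cannot render HTML, so try an earlier view

            return text;
        }

        return null;
    }
}
```

A text-only client would call PickBest (alternative, false) and end up with the text/plain part, while an HTML-capable client would pass true and get the text/html part.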

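To make matters more complicated, modern mail clients will sometimes use a multipart/related MIME container instead of a simple text/html part in order to embed images and other multimedia content within the HTML.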

Example:

multipart/alternative
  text/plain
  multipart/related
    text/html
    image/jpeg
    video/mp4
    image/png

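In the example above, one of the alternative views is a multipart/related container which contains an HTML version of the message body that references the sibling video and images.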
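
To make the sibling lookup concrete, here is a minimal sketch of resolving a URL found in the HTML (typically a cid: reference) to the part it names inside the same multipart/related container. RelatedSketch and Resolve are names invented for this example; MultipartRelated.IndexOf(Uri) is the same MimeKit call the visitor below relies on.

```csharp
using System;
using MimeKit;

static class RelatedSketch
{
    // Returns the sibling part that the given URL refers to, or null when the
    // URL is malformed or does not match any part in the container.
    public static MimeEntity Resolve (MultipartRelated related, string url)
    {
        Uri uri;

        if (!Uri.TryCreate (url, UriKind.RelativeOrAbsolute, out uri))
            return null;

        // IndexOf matches the URI against each child's Content-Id or Content-Location.
        int index = related.IndexOf (uri);

        return index == -1 ? null : related[index];
    }
}
```

For a container whose HTML contains an img tag with src="cid:logo@example.com" (a made-up identifier), calling Resolve (related, "cid:logo@example.com") would be expected to return the image part whose Content-Id is logo@example.com.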

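Now that you have a rough idea of how a message is structured and how to interpret the various MIME entities, we can start figuring out how to actually render the message as intended.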

Using a MimeVisitor (the most accurate way of rendering a message)

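MimeKit includes a MimeVisitor class for visiting each node in the MIME tree structure. For example, the following MimeVisitor subclass could be used to generate HTML to be rendered by a browser control (such as WebBrowser):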

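using System;
using System.Collections.Generic;
using System.IO;
using MimeKit;
using MimeKit.Text;
using MimeKit.Tnef;

/// <summary>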
/// Visits a MimeMessage and generates HTML suitable to be rendered by a browser control.
/// </summary>
class HtmlPreviewVisitor : MimeVisitor
{
    List<MultipartRelated> stack = new List<MultipartRelated> ();
    List<MimeEntity> attachments = new List<MimeEntity> ();
    readonly string tempDir;
    string body;

    /// <summary>
    /// Creates a new HtmlPreviewVisitor.
    /// </summary>
    /// <param name="tempDirectory">A temporary directory used for storing image files.</param>
    public HtmlPreviewVisitor (string tempDirectory)
    {
        tempDir = tempDirectory;
    }

    /// <summary>
    /// The list of attachments that were in the MimeMessage.
    /// </summary>
    public IList<MimeEntity> Attachments {
        get { return attachments; }
    }

    /// <summary>
    /// The HTML string that can be set on the BrowserControl.
    /// </summary>
    public string HtmlBody {
        get { return body ?? string.Empty; }
    }

    protected override void VisitMultipartAlternative (MultipartAlternative alternative)
    {
        // walk the multipart/alternative children backwards from greatest level of faithfulness to the least faithful
        for (int i = alternative.Count - 1; i >= 0 && body == null; i--)
            alternative[i].Accept (this);
    }

    protected override void VisitMultipartRelated (MultipartRelated related)
    {
        var root = related.Root;

        // push this multipart/related onto our stack
        stack.Add (related);

        // visit the root document
        root.Accept (this);

        // pop this multipart/related off our stack
        stack.RemoveAt (stack.Count - 1);
    }

    // look up the image based on the img src url within our multipart/related stack
    bool TryGetImage (string url, out MimePart image)
    {
        UriKind kind;
        int index;
        Uri uri;

        if (Uri.IsWellFormedUriString (url, UriKind.Absolute))
            kind = UriKind.Absolute;
        else if (Uri.IsWellFormedUriString (url, UriKind.Relative))
            kind = UriKind.Relative;
        else
            kind = UriKind.RelativeOrAbsolute;

        try {
            uri = new Uri (url, kind);
        } catch {
            image = null;
            return false;
        }

        for (int i = stack.Count - 1; i >= 0; i--) {
            if ((index = stack[i].IndexOf (uri)) == -1)
                continue;

            image = stack[i][index] as MimePart;
            return image != null;
        }

        image = null;

        return false;
    }

    // Save the image to our temp directory and return a "file://" url suitable for
    // the browser control to load.
    // Note: if you'd rather embed the image data into the HTML, you can construct a
    // "data:" url instead.
    string SaveImage (MimePart image, string url)
    {
        string fileName = url.Replace (':', '_').Replace ('\\', '_').Replace ('/', '_');

        string path = Path.Combine (tempDir, fileName);

        if (!File.Exists (path)) {
            // Note: MimeKit 2.0 and later name this property Content rather than ContentObject.
            using (var output = File.Create (path))
                image.ContentObject.DecodeTo (output);
        }

        return "file://" + path.Replace ('\\', '/');
    }

    // Replaces <img src=...> urls that refer to images embedded within the message with
    // "file://" urls that the browser control will actually be able to load.
    void HtmlTagCallback (HtmlTagContext ctx, HtmlWriter htmlWriter)
    {
        if (ctx.TagId == HtmlTagId.Image && !ctx.IsEndTag && stack.Count > 0) {
            ctx.WriteTag (htmlWriter, false);

            // replace the src attribute with a file:// URL
            foreach (var attribute in ctx.Attributes) {
                if (attribute.Id == HtmlAttributeId.Src) {
                    MimePart image;
                    string url;

                    if (!TryGetImage (attribute.Value, out image)) {
                        htmlWriter.WriteAttribute (attribute);
                        continue;
                    }

                    url = SaveImage (image, attribute.Value);

                    htmlWriter.WriteAttributeName (attribute.Name);
                    htmlWriter.WriteAttributeValue (url);
                } else {
                    htmlWriter.WriteAttribute (attribute);
                }
            }
        } else if (ctx.TagId == HtmlTagId.Body && !ctx.IsEndTag) {
            ctx.WriteTag (htmlWriter, false);

            // add and/or replace oncontextmenu="return false;"
            foreach (var attribute in ctx.Attributes) {
                if (attribute.Name.ToLowerInvariant () == "oncontextmenu")
                    continue;

                htmlWriter.WriteAttribute (attribute);
            }

            htmlWriter.WriteAttribute ("oncontextmenu", "return false;");
        } else {
            // pass the tag through to the output
            ctx.WriteTag (htmlWriter, true);
        }
    }

    protected override void VisitTextPart (TextPart entity)
    {
        TextConverter converter;

        if (body != null) {
            // since we've already found the body, treat this as an attachment
            attachments.Add (entity);
            return;
        }

        if (entity.IsHtml) {
            converter = new HtmlToHtml {
                HtmlTagCallback = HtmlTagCallback
            };
        } else if (entity.IsFlowed) {
            var flowed = new FlowedToHtml ();
            string delsp;

            if (entity.ContentType.Parameters.TryGetValue ("delsp", out delsp))
                flowed.DeleteSpace = delsp.ToLowerInvariant () == "yes";

            converter = flowed;
        } else {
            converter = new TextToHtml ();
        }

        body = converter.Convert (entity.Text);
    }

    protected override void VisitTnefPart (TnefPart entity)
    {
        // extract any attachments in the MS-TNEF part
        attachments.AddRange (entity.ExtractAttachments ());
    }

    protected override void VisitMessagePart (MessagePart entity)
    {
        // treat message/rfc822 parts as attachments
        attachments.Add (entity);
    }

    protected override void VisitMimePart (MimePart entity)
    {
        // realistically, if we've gotten this far, then we can treat this as an attachment
        // even if the IsAttachment property is false.
        attachments.Add (entity);
    }
}

The way you would use this visitor might look something like this:

void Render (MimeMessage message)
{
    var tmpDir = Path.Combine (Path.GetTempPath (), message.MessageId);
    var visitor = new HtmlPreviewVisitor (tmpDir);

    Directory.CreateDirectory (tmpDir);

    message.Accept (visitor);

    DisplayHtml (visitor.HtmlBody);
    DisplayAttachments (visitor.Attachments);
}
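
The comments above SaveImage mention that a "data:" URL can be used instead of writing images to a temp directory. A minimal sketch of that alternative is shown below. DataUrlSketch and ToDataUrl are hypothetical names; the helper assumes the caller has already decoded the image into a byte array (for instance by decoding the part into a MemoryStream with the same DecodeTo call that SaveImage uses) and knows its MIME type.

```csharp
using System;

static class DataUrlSketch
{
    // Builds a "data:" URL that embeds the decoded content directly in the HTML,
    // so the browser control does not need to load anything from disk.
    public static string ToDataUrl (string mimeType, byte[] content)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String (content);
    }
}
```

For example, ToDataUrl ("image/png", new byte[] { 1, 2, 3 }) returns "data:image/png;base64,AQID". The trade-off is that large images inflate the HTML string, so the temp-file approach may still be preferable for big attachments.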

Using the TextBody and HtmlBody properties (the easiest way)

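To simplify the common task of getting the text of a message, MimeMessage includes two properties that can help you get the text/plain or text/html version of the message body. These are TextBody and HtmlBody, respectively.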

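Keep in mind, however, that at least with the HtmlBody property, the HTML part may be a child of a multipart/related container, which allows it to refer to images and other types of media that are also contained within that multipart/related entity. This property is really only a convenience and is not a good substitute for traversing the MIME structure yourself so that you can properly interpret the related content.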

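I would first check the TextBody property on the MimeMessage, but it will be null if the message does not contain a text/plain part. Given that, yes, the next step would be to strip the HTML tags. MimeKit has an HtmlTokenizer that you might find useful for that. (2 upvotes)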
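
The fallback described in that comment could be sketched roughly as follows. PlainTextSketch, GetPlainText and StripHtml are names invented for this example; HtmlTokenizer, HtmlToken, HtmlTagToken and HtmlDataToken come from the MimeKit.Text namespace. This is a naive tag stripper, not a layout-aware HTML-to-text converter: it keeps character data, drops tags, scripts and style sheets, and makes no attempt to insert line breaks for block elements.

```csharp
using System.IO;
using System.Text;
using MimeKit;
using MimeKit.Text;

static class PlainTextSketch
{
    // Prefer the text/plain body; fall back to the HTML body with tags removed.
    public static string GetPlainText (MimeMessage message)
    {
        if (message.TextBody != null)
            return message.TextBody;

        return message.HtmlBody != null ? StripHtml (message.HtmlBody) : string.Empty;
    }

    // Keeps only the character data of an HTML document.
    public static string StripHtml (string html)
    {
        var tokenizer = new HtmlTokenizer (new StringReader (html)) {
            DecodeCharacterReferences = true // turn "&amp;" into "&", and so on
        };
        var text = new StringBuilder ();
        bool insideStyle = false;
        HtmlToken token;

        while (tokenizer.ReadNextToken (out token)) {
            if (token.Kind == HtmlTokenKind.Tag) {
                // Track <style> so that CSS rules are not mistaken for body text.
                var tag = (HtmlTagToken) token;
                if (tag.Id == HtmlTagId.Style)
                    insideStyle = !tag.IsEndTag;
            } else if (token.Kind == HtmlTokenKind.Data && !insideStyle) {
                // Script content arrives as ScriptData tokens and is skipped here.
                text.Append (((HtmlDataToken) token).Data);
            }
        }

        return text.ToString ();
    }
}
```

In the loop from the question, formEmails.Add (message.TextBody) would then become formEmails.Add (PlainTextSketch.GetPlainText (message)).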

Archived:	10 years, 1 month ago
Views:	1999
Last activity:	10 years, 1 month ago