Parsing email headers with Regex in C#

Kev*_*sen 2 c# regex email-parsing

I have a webhook that posts to a form on my web application, and I need to parse the email addresses out of the headers.

Here is the source text:

Thread-Topic: test subject
Thread-Index: AcwE4mK6Jj19Hgi0SV6yYKvj2/HJbw==
From: "Lastname, Firstname" <firstname_lastname@domain.com>
To: <testto@domain.com>, testto1@domain.com, testto2@domain.com
Cc: <testcc@domain.com>, test3@domain.com
X-OriginalArrivalTime: 27 Apr 2011 13:52:46.0235 (UTC) FILETIME=[635226B0:01CC04E2]

I would like to extract the following:

<testto@domain.com>, testto1@domain.com, testto2@domain.com

I have been struggling with Regex all day without any luck.

csh*_*net 5

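Contrary to some of the other posts, I have to agree with mmutz: you cannot use a regex to parse email addresses... see this section of RFC 2822: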

http://tools.ietf.org/html/rfc2822#section-3.4.1

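3.4.1. Addr-spec specification

An addr-spec is a specific Internet identifier that contains a locally interpreted string followed by the at-sign character ("@", ASCII value 64) followed by an Internet domain.

"Locally interpreted" means that only the receiving server is expected to be able to parse that string.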
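As a rough illustration of that split, .NET's `System.Net.Mail.MailAddress` exposes the local part and the domain as separate properties. This is only a sketch using the From address from the question; the class name `AddrSpecDemo` is made up for the example.

```csharp
using System;
using System.Net.Mail;

class AddrSpecDemo
{
    static void Main()
    {
        // The From address from the question: a quoted display name plus an addr-spec.
        var address = new MailAddress("\"Lastname, Firstname\" <firstname_lastname@domain.com>");

        Console.WriteLine(address.DisplayName); // Lastname, Firstname
        Console.WriteLine(address.User);        // firstname_lastname (the local part)
        Console.WriteLine(address.Host);        // domain.com (the domain)
    }
}
```

Note that the comma inside the quoted display name is not an address separator, which is one reason a naive split on commas is not enough.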

If I were going to try to solve this, I would find the contents of the "To" line, split it apart, and attempt to parse each segment with System.Net.Mail.MailAddress.

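    using System;
    using System.Net.Mail;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            string input = @"Thread-Topic: test subject
Thread-Index: AcwE4mK6Jj19Hgi0SV6yYKvj2/HJbw==
From: ""Lastname, Firstname"" <firstname_lastname@domain.com>
To: <testto@domain.com>, ""Yes, this is valid""@[emails are hard to parse!], testto1@domain.com, testto2@domain.com
Cc: <testcc@domain.com>, test3@domain.com
X-OriginalArrivalTime: 27 Apr 2011 13:52:46.0235 (UTC) FILETIME=[635226B0:01CC04E2]";

            // Grab everything after "To:" on that header line (case-insensitive, multiline).
            Regex toline = new Regex(@"(?im-:^To\s*:\s*(?<to>.*)$)");
            string to = toline.Match(input).Groups["to"].Value;

            int from = 0;   // where the next comma search starts
            int pos = 0;    // where the current candidate address starts
            int found;
            string test;

            while (from < to.Length)
            {
                // Extend the candidate up to the next comma, or to the end of the line.
                found = to.IndexOf(',', from);
                if (found < 0)
                    found = to.Length;
                from = found + 1;
                test = to.Substring(pos, found - pos);

                try
                {
                    MailAddress addy = new MailAddress(test.Trim());
                    Console.WriteLine(addy.Address);
                    // Parsed successfully, so the next candidate starts after this comma.
                    pos = found + 1;
                }
                catch (FormatException)
                {
                    // Not a valid address yet (the comma may sit inside a quoted string);
                    // leave pos alone so the candidate grows to the next comma.
                }
            }
        }
    }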

Output of the program above:

testto@domain.com
"Yes, this is valid"@[emails are hard to parse!]
testto1@domain.com
testto2@domain.com
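If the hand-rolled comma splitting feels fragile, one alternative is to let `MailAddressCollection.Add(string)` do the splitting, since it accepts a comma-separated list of addresses. This is a minimal sketch. It assumes the To header sits on a single line (no folding) and that every address on it is one the .NET parser accepts. `ToHeaderParser` and `ExtractToAddresses` are hypothetical names, not part of the answer above. Unlike the loop above, `Add` throws a `FormatException` for the whole list if any single address is rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Text.RegularExpressions;

static class ToHeaderParser
{
    // Matches a single-line "To:" header; folded (multi-line) headers are not handled.
    static readonly Regex ToLine =
        new Regex(@"^To\s*:\s*(?<to>.*)$", RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Hypothetical helper: returns the bare addresses found on the To line.
    public static List<string> ExtractToAddresses(string headers)
    {
        var result = new List<string>();
        Match match = ToLine.Match(headers);
        if (!match.Success)
            return result;

        // Trim removes a trailing '\r' left over from CRLF line endings.
        string value = match.Groups["to"].Value.Trim();
        if (value.Length == 0)
            return result;

        // Add(string) parses a comma-separated address list in one call.
        var collection = new MailAddressCollection();
        collection.Add(value);

        foreach (MailAddress address in collection)
            result.Add(address.Address);
        return result;
    }
}

class Program
{
    static void Main()
    {
        string headers =
            "From: \"Lastname, Firstname\" <firstname_lastname@domain.com>\n" +
            "To: <testto@domain.com>, testto1@domain.com, testto2@domain.com\n" +
            "Cc: <testcc@domain.com>, test3@domain.com";

        foreach (string address in ToHeaderParser.ExtractToAddresses(headers))
            Console.WriteLine(address);
    }
}
```

The trade-off is strictness: this version is shorter, but it is all-or-nothing, whereas the loop above skips segments it cannot parse and keeps going.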