如何使用正则表达式验证电子邮件地址?

acrosman 3201 regex email validation email-validation string-parsing

多年来,我慢慢开发了一个正则表达式,可以正确验证MOST电子邮件地址,假设他们不使用IP地址作为服务器部分.

我在几个PHP程序中使用它,它大部分时间都可以工作.但是,我不时会遇到使用它的网站遇到问题的人,我最终不得不进行一些调整(最近我意识到我不允许使用4字符TLD).

验证电子邮件的最佳正则表达式是什么?

我已经看到了几个使用函数的解决方案,这些函数使用了几个较短的表达式,但我宁愿在一个简单的函数中使用一个长复杂表达式,而不是在一个更复杂的函数中使用几个短表达式.

bortzmeyer.. 2332

完全RFC 822标准的正则表达式是低效的和模糊的,因为它的长度.幸运的是,RFC 822被取代了两次,目前的电子邮件地址规范是RFC 5322.RFC 5322导致正则表达式,如果研究了几分钟并且对于实际使用而言足够有效,则可以理解该正则表达式.

一个符合RFC 5322标准的正则表达式可以在http://emailregex.com/的页面顶部找到,但是使用在互联网上浮动的IP地址模式,其中包含允许00任何无符号字节十进制值的错误.以点分隔的地址,这是非法的.其余部分似乎与RFC 5322语法一致,并通过多个测试使用grep -Po,包括案例域名,IP地址,坏名称和带引号和不带引号的帐户名称.

纠正00IP模式中的错误,我们获得了一个工作且相当快速的正则表达式.(为实际代码刮取渲染版本,而不是降价.)

(:[A-Z0-9#$%& '*+/= ^ _`{|}〜 - +(?!\ [A-Z0-9#$%&!]'?*+/?= ^ _`{|}〜 - ] +)*|"(?:[\ x01-\X08\X0B\X0C\x0e-\X1F\X21\x23-\x5b\x5d-\0x7F部分] | \\ [\ x01-\X09\X0B\X0C\x0e-\0x7F部分])*")@(:(:[α-Z0-9](:???[A-Z0-9 - ]*[A-Z0 ?-9])\)+ [A-Z0-9](:?[A-Z0-9 - ]*[A-Z0-9])|\[(:( :( 2(5'? [0-5] | [0-4] [0-9])| 1 [0-9] [0-9] |.[1-9] [0-9]))\){3}( ?:( 2(5 [0-5] | [0-4] [0-9])| 1 [0-9] [0-9] | [1-9] [0-9])|〔 A-Z0-9 - ]*[A-Z0-9]:(?:[\ x01-\X08\X0B\X0C\x0e-\X1F\x21-\X5A\x53-\0x7F部分] | \\ [\ x01-\X09\X0B\X0C\x0e-\0x7F部分])+)\])

这是上面的regexp 的有限状态机,它比regexp本身更清晰 在此输入图像描述

Perl和PCRE中更复杂的模式(例如在PHP中使用的正则表达式库)可以正确地解析RFC 5322.Python和C#也可以这样做,但它们使用与前两个不同的语法.但是,如果您被迫使用许多功能较弱的模式匹配语言之一,那么最好使用真正的解析器.

同样重要的是要理解,根据RFC验证它绝对不会告诉您该地址是否实际存在于提供的域中,或者输入该地址的人是否是其真正的所有者.人们一直以这种方式签署其他人到邮件列表.修复需要更高级的验证,该验证涉及向该地址发送包含确认令牌的消息,该确认令牌意味着在与该地址相同的网页上输入.

确认令牌是了解您获得进入该人的地址的唯一方式.这就是为什么现在大多数邮件列表都使用该机制来确认注册.毕竟,任何人都可以放下president@whitehouse.gov,甚至可以解析为合法,但不太可能是另一端的人.

对于PHP,你应该使用给定的模式验证与PHP的电子邮件地址,正道从我引述如下:

存在一些危险,即普通使用和广泛的草率编码将为电子邮件地址建立事实上的标准,其比记录的正式标准更具限制性.

这并不比所有其他非RFC模式更好.它甚至不是足够聪明,甚至处理RFC 822,更不用说RFC 5322 这一个,但是,是.

如果你想得到花哨和迂腐,实现一个完整的状态引擎.正则表达式只能作为基本过滤器.正则表达式的问题在于告诉某人他们完全有效的电子邮件地址是无效的(误报)因为正则表达式无法处理它只是从用户的角度来看是粗鲁和不礼貌的.用于此目的的状态引擎可以验证甚至纠正否则将被视为无效的电子邮件地址,因为它根据每个RFC反汇编电子邮件地址.这样可以带来更愉悦的体验,例如

指定的电子邮件地址"myemail @ address,com"无效.你的意思是'myemail@address.com'吗?

另请参阅验证电子邮件地址,包括注释.或者比较电子邮件地址验证正则表达式.

正则表达式可视化

Debuggex演示

  • 你说"没有好的正则表达." 这是一般的还是特定的电子邮件地址验证? (172认同)
  • @Tomalak:仅限电子邮件地址.正如bortzmeyer所说,RFC非常复杂 (37认同)
  • 你提到的linux期刊文章在几个方面实际上是错误的.特别是Lovell显然没有阅读RFC3696的勘误表,并重复发布的RFC版本中的一些错误.更多信息:http://www.dominicsayers.com/isemail (37认同)
  • Jeff Atwood在这篇博文中有一个可爱的正则表达式来验证所有有效的电子邮件地址:http://www.codinghorror.com/blog/2005/02/regex-use-vs-regex-abuse.html (9认同)
  • 这些脚本似乎不适用于unicode域名 (4认同)
  • 怪物正则表达式以"地址"语法元素为目标,要求用户输入"addr-spec"并将显示名称设置为单独的框是完全合理的.正则表达式是如此之大,因为它必须多次重复addr-spec(并允许折叠空格,你可以简单地要求用户不要使用).您的Web表单不是SMTP服务器,它不必处理"组"或多个等效的[由于空格和显示名称]形式的地址.不允许折叠空格的addr-spec正则表达式最终只有一百个字符左右. (4认同)
  • 请注意[当前的HTML5规范](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-(type = email))包​​含正则表达式和ABNF用于电子邮件类型的输入验证,故意比原始RFC更具限制性. (4认同)
  • RFC 822,第6.2.4节。特别明确地允许使用大写字母,但此答案不允许。https://www.w3.org/Protocols/rfc822/#z26也许这个答案的作者打算不加区分地应用其正则表达式。如果是这样,应在答案正文中明确指出。 (4认同)
  • 如果"没有好的正则表达式",那么为什么[这个答案](http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses/ 1917982#1917982)似乎已经管理好了吗? (3认同)
  • 大多数RFC5322与此问题无关.因为RFC还描述了如何格式化邮箱的多个地址列表或某些元数据(例如:显示名称).其他答案中大多数"巨大的正则表达式"只提供了几乎完整的RFC5322 regexp,并且也是如此无关紧要. (3认同)
  • 这个正则表达式容易受到JavaScript(node/V8)中的灾难性回溯攻击.我没有测试其他语言. (3认同)
  • 或者,请遵循[HTML5规范的电子邮件地址定义](http://www.w3.org/TR/html5/forms.html#valid-e-mail-address).它不同意RFC,但同意现实世界的使用.它很容易验证. (2认同)

SLaks.. 739

您不应使用正则表达式来验证电子邮件地址.

而是使用MailAddress类,如下所示:

try {
    address = new MailAddress(address).Address;
} catch(FormatException) {
    //address is invalid
}

MailAddress类使用BNF解析器完全按照RFC822验证地址.

如果你真的想使用正则表达式,这里是:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>

@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>

@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)

  • 您会发现.NET 4.0中的MailAddress类在验证电子邮件地址方面要比以前的版本好得多.我对它做了一些重大改进. (25认同)
  • @MatthewLock:**No.**您需要_escape_ SQL查询(或者,更好的是,使用参数).消毒不是一种适当的防御措施. (10认同)
  • @MatthewLock:这并不比`fake @ not-a-real-domain.name`更糟糕.你**不能**依靠电子邮件验证来阻止XSS. (9认同)
  • 我认为它有点......不起作用......对于更简单的ID.a @ b不验证.ar@b.com只匹配ar @ b,.com不匹配.但是,"我是我"@ [10.10.10.10]之类的东西确实有效!:) (7认同)
  • 请注意,这些符合RFC的正则表达式验证器会通过许多您可能不想接受的电子邮件地址,例如"a <body/onload = alert('http://lol.com?'++document.cookies )@aa>"这是perl的Email :: Valid(使用那个巨大的正则表达式)中的有效电子邮件地址,可以被用于XSS https://rt.cpan.org/Public/Bug/Display.html?id = 75650 (5认同)
  • 你确定它是正确的吗?http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx (3认同)
  • 顺便说一下,正则表达式是一个糟糕的答案.它不处理注释(如代码所述),也适用于RFC822,它已过时. (3认同)
  • 仅供参考:Microsoft确实在[如何:验证字符串是否为有效电子邮件格式]中为此任务提供"推荐"正则表达式(https://msdn.microsoft.com/en-us/library/01escwtf.aspx) .但是,在他们解释了RegEx如何工作之后,他们会抛出"而不是使用正则表达式来验证电子邮件地址,您可以使用System.Net.Mail.MailAddress类." :) (3认同)
  • 交叉发布到LISP和PERL论坛,然后看火花飞. (2认同)
  • 仅仅因为这是一个规范的答案:这个正则表达式不验证电子邮件地址.它验证To/Bcc字段,即"My Name Is <name@mail.com>"之类的字符串. (2认同)

JacquesB.. 528

这个问题被问了很多,但我认为你应该退后一步,问问自己为什么要在语法上验证电子邮件地址?真的有什么好处?

  • 它不会捕捉常见的拼写错误.
  • 它不会阻止人们输入无效或虚构的电子邮件地址,也无法输入其他人的地址.

如果您想验证电子邮件是否正确,您别无选择,只能发送确认电子邮件并让用户回复.在许多情况下,出于安全原因或出于道德原因,您将不得不发送确认邮件(因此您不能违背他们的意愿签署某人服务).

  • @olavk:如果有人输入拼写错误(例如:'me @ hotmail`),他们显然不会收到您的确认电子邮件,那么他们在哪里?他们不再在您的网站上了,他们想知道为什么他们无法注册.其实不是他们不是 - 他们完全忘记了你.但是,如果你只是在他们还在你身边的时候用正则表达式做一个基本的健全性检查,那么他们可以立即发现这个错误并且你有一个快乐的用户. (99认同)
  • 可能值得检查一下,为了捕捉到简单的错误,他们在客户端验证中输入了@某些东西 - 但总的来说你是对的. (88认同)
  • 它不一定是黑色或白色.如果电子邮件看起来不对,请让用户知道.如果用户仍想继续,请告诉他.不要强迫用户遵守你的正则表达式,而是使用正则表达式作为工具来帮助用户知道可能存在错误. (38认同)
  • 马丁,我给了你+1,但后来才知道foobar @ dk是一封有效的电子邮件.它不会很漂亮,但是如果你想要兼容RFC并使用常识,你应该检测这样的情况并要求用户确认这是正确的. (7认同)
  • @JacquesB:你提出了一个很好的观点.仅仅因为它通过RFC的通过并不意味着它真的是用户的地址.否则所有那些"总统@ whitehouse.gov"的地址都表明他们是一个非常网络的总司令.:) (5认同)
  • @nickf:从技术上讲,没有必要使用顶级域名,我的@ hotmail是有效的.比正则表达式更好的解决方案是拥有一个常见的电子邮件提供商列表(hotmail.com,me.com,gmail.com,yahoo.com)并在地址中搜索拼写错误. (2认同)

Good Person.. 336

这取决于你最好的意思:如果你正在谈论捕获每个有效的电子邮件地址,请使用以下内容:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)

(http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html)如果您正在寻找更简单但可以捕获大多数有效电子邮件地址的内容,请尝试以下方法:

"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

编辑:从链接:

此正则表达式仅验证已删除任何注释并用空格替换的地址(这由模块完成).

  • 你能给我一个错误地通过第二个的"电子邮件地址"的例子,但是被更长的正则表达式所捕获? (47认同)
  • @Lazer in..valid @ example.com就是一个简单的例子.您不允许在本地部分中连续两个未加引号的点. (24认同)
  • 它与所有地址都不匹配,有些必须首先进行转换.从链接:"这个正则表达式将只验证已删除任何注释并用空格替换的地址(这由模块完成)." (10认同)
  • @Mikhail perl但你实际上不应该使用它. (5认同)
  • 虽然我曾经爱过它,但这是一个RFC 822验证器,而不是[RFC 5322](http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email -addresses/1917982#1917982)一个. (4认同)
  • @RSC这是一个很好的FQDN (3认同)
  • 另外,@ a将传递第一个正则表达式但在第二个正则表达式中失败.是的,一些真正的电子邮件没有.@之后 (2认同)
  • 你也可以在@之前引用你想要的任何内容,例如:"someomething @ gmail.com"@ foo.com是一个存储在域名foo.com中的帐户名为something@gmail.com的地址 (2认同)

Andy Lester.. 331

这一切都取决于你想要的准确程度.就我的目的而言,我只是试图阻止诸如bob @ aol.com(电子邮件中的空格)或steve(根本没有域名)或mary@aolcom(在.com之前没有时间段)之类的东西,我使用

/^\S+@\S+\.\S+$/

当然,它将匹配无效电子邮件地址的内容,但这是播放90/10规则的问题.

  • JJJ:是的,它会匹配很多垃圾.它将匹配&$*#$(@ $ 0(%))$#.)&*)(*$,对我来说,我更关心的是捕捉像mary @ aolcom这样的奇怪的手指错字.我完全是垃圾.YMMV. (40认同)
  • @Richard:`.`包含在`\ S`中. (7认同)
  • 它与foobar @ dk不匹配,这是一个有效且有效的电子邮件地址(尽管大多数邮件服务器可能不会接受它或者会添加something.com.) (6认同)
  • 只是控制`@`标志:`/ ^ [^\s @] + @ [^\s @] + \.[^\s @] {2,} $ /`http://jsfiddle.net/ b9chris/mXB96 / (5认同)
  • 是的,它会的.我建议你自己尝试一下.$ perl -le'print q{foo@bar.co.uk} =〜/^\S+@\S+\.\S+$/?q {Y}:q {N}' (3认同)
  • 另一个常见的错字:域名中两个连续的点或一个逗号而不是一个点。`^ [^ \ s @] + @([^ \ s @。,] + \。)+ [^ \ s @。,] {2,} $` (2认同)

Dominic Saye.. 287

[更新]我在这里整理了我所知道的关于电子邮件地址验证的所有内容:http://isemail.info,它现在不仅可以验证,还可以诊断电子邮件地址的问题.我同意这里的许多评论,验证只是答案的一部分; 请参阅我的论文http://isemail.info/about.

据我所知,is_email()仍然是唯一能够明确告诉你给定字符串是否是有效电子邮件地址的验证器.我在http://isemail.info/上传了一个新版本

我整理了来自Cal Henderson,Dave Child,Phil Haack,Doug Lovell,RFC5322和RFC 3696的测试用例.总共有275个测试地址.我针对我能找到的所有免费验证器运行了所有这些测试.

随着人们增强验证器,我会尽量使这个页面保持最新状态.感谢Cal,Michael,Dave,Paul和Phil在编写这些测试和对我自己的验证器的建设性批评方面提供的帮助和合作.

人们应该特别注意针对RFC 3696勘误表.其中三个规范示例实际上是无效地址.并且地址的最大长度为254或256个字符,而不是 320个字符.


Josh Stodola.. 263

根据W3C HTML5规范:

^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$

语境:

一个有效的电子邮件地址是该ABNF生产[...]相匹配的字符串.

注意:此要求是对RFC 5322故意违反,RFC 5322定义了电子邮件地址的语法同时过于严格(在"@"字符之前),过于模糊(在"@"字符之后),并且过于宽松(允许在大多数用户不熟悉的方式中使用注释,空白字符和引用的字符串)在这里具有实际用途.

以下JavaScript和Perl兼容的正则表达式是上述定义的实现.

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

  • 这是有趣的.这是对RFC的违反,但它是一个故意的,它使sesne.真实世界的例子:gmail在@之前忽略了部分中的点,所以如果你的电子邮件是test@gmail.com,你可以发送电子邮件进行测试.@ gmail.com或者测试.... @ gmail.com,这两个地址都是根据RFC无效,但在现实世界中有效. (12认同)
  • @mmmmmm`john.doe @ localhost`有效.当然,在现实世界的应用程序(即社区)中,我希望您建议用+替换* (7认同)
  • @valentinas实际上,RFC不会*排除这些本地部分,但必须引用它们.``test ...."@ gmail.com`根据RFC完全有效,在语义上等同于`test .... @ gmail.com`. (3认同)

小智.. 200

在Perl 5.10或更新版本中很容易:

/(?(DEFINE)
   (?<address>         (?&mailbox) | (?&group))
   (?<mailbox>         (?&name_addr) | (?&addr_spec))
   (?<name_addr>       (?&display_name)? (?&angle_addr))
   (?<angle_addr>      (?&CFWS)? < (?&addr_spec) > (?&CFWS)?)
   (?<group>           (?&display_name) : (?:(?&mailbox_list) | (?&CFWS))? ;
                                          (?&CFWS)?)
   (?<display_name>    (?&phrase))
   (?<mailbox_list>    (?&mailbox) (?: , (?&mailbox))*)

   (?<addr_spec>       (?&local_part) \@ (?&domain))
   (?<local_part>      (?&dot_atom) | (?&quoted_string))
   (?<domain>          (?&dot_atom) | (?&domain_literal))
   (?<domain_literal>  (?&CFWS)? \[ (?: (?&FWS)? (?&dcontent))* (?&FWS)?
                                 \] (?&CFWS)?)
   (?<dcontent>        (?&dtext) | (?&quoted_pair))
   (?<dtext>           (?&NO_WS_CTL) | [\x21-\x5a\x5e-\x7e])

   (?<atext>           (?&ALPHA) | (?&DIGIT) | [!#\$%&'*+-/=?^_`{|}~])
   (?<atom>            (?&CFWS)? (?&atext)+ (?&CFWS)?)
   (?<dot_atom>        (?&CFWS)? (?&dot_atom_text) (?&CFWS)?)
   (?<dot_atom_text>   (?&atext)+ (?: \. (?&atext)+)*)

   (?<text>            [\x01-\x09\x0b\x0c\x0e-\x7f])
   (?<quoted_pair>     \\ (?&text))

   (?<qtext>           (?&NO_WS_CTL) | [\x21\x23-\x5b\x5d-\x7e])
   (?<qcontent>        (?&qtext) | (?&quoted_pair))
   (?<quoted_string>   (?&CFWS)? (?&DQUOTE) (?:(?&FWS)? (?&qcontent))*
                        (?&FWS)? (?&DQUOTE) (?&CFWS)?)

   (?<word>            (?&atom) | (?&quoted_string))
   (?<phrase>          (?&word)+)

   # Folding white space
   (?<FWS>             (?: (?&WSP)* (?&CRLF))? (?&WSP)+)
   (?<ctext>           (?&NO_WS_CTL) | [\x21-\x27\x2a-\x5b\x5d-\x7e])
   (?<ccontent>        (?&ctext) | (?&quoted_pair) | (?&comment))
   (?<comment>         \( (?: (?&FWS)? (?&ccontent))* (?&FWS)? \) )
   (?<CFWS>            (?: (?&FWS)? (?&comment))*
                       (?: (?:(?&FWS)? (?&comment)) | (?&FWS)))

   # No whitespace control
   (?<NO_WS_CTL>       [\x01-\x08\x0b\x0c\x0e-\x1f\x7f])

   (?<ALPHA>           [A-Za-z])
   (?<DIGIT>           [0-9])
   (?<CRLF>            \x0d \x0a)
   (?<DQUOTE>          ")
   (?<WSP>             [\x20\x09])
 )

 (?&address)/x

  • 很想在Python中看到这个 (19认同)
  • 正常情况下,正则表达式已经停止了.这是一个有效的Perl'正则表达式'! (9认同)
  • 我认为只有`addrspec`部分的一部分与问题真正相关.接受更多并通过系统的其他部分转发它并不准备接受完整的RFC5822地址就像射击是你自己的脚. (4认同)
  • 我在IDEone上为这个正则表达式设置了一个测试:http://ideone.com/2XFecH然而,它并不公平"完美".有人会关心吗?我错过了什么吗? (4认同)
  • 伟大的(+1),但从技术上讲,它当然不是正则表达式...(由于语法不规则,这是不可能的). (3认同)

Per Hornshøj.. 155

我用

^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$

这是由RegularExpressionValidator在ASP.NET中使用的.

  • 所以基本上,它不允许荒谬的电子邮件地址.:) (91认同)
  • 嘘!我的(不明智的)地址`!@ mydomain.net`被拒绝了. (28认同)
  • @Wayne Whitty.除了测试电子邮件验证之外,您已经遇到了是否要满足绝大多数地址或所有地址(包括没有人会使用的地址)的主要问题. (4认同)
  • 根据这个页面http://data.iana.org/TLD/tlds-alpha-by-domain.txt,顶层没有只有一个字符的域名,例如"something.c","something.a",这里是支持至少2个字符的版本:"something.pl","something.us":`^ \\ w +([ - +.'] \\ w +)*@ \\ w +([ - .] \\瓦特+)*\\瓦特{2,}([ - ] \\ W +)*$` (3认同)
  • 这在“ simon-@hotmail.com”上失败,实际上是有效的(我们的客户有类似的地址) (2认同)

Chris Vest.. 141

不知道最好的,但这个至少是正确的,只要地址的注释被剥离并用空格替换.

认真.您应该使用已编写的库来验证电子邮件.最好的方法可能是只发送一封验证电子邮件到该地址.

  • 最终,他是正确的,真正*验证*电子邮件地址的唯一方法是向其发送电子邮件并等待回复. (12认同)
  • 这是RFC 822规范,而不是[RFC 5322](http://stackoverflow.com/questions/201323/what-is-the-best-regular-expression-for-validating-email-addresses/1917982#1917982)规范. (7认同)
  • 据我所知,一些图书馆也是错的.我依稀记得PHP PEAR有这样的错误. (2认同)

davcar.. 109

我想要验证的电子邮件地址将由使用System.Net.Mail命名空间的ASP.NET Web应用程序用于将电子邮件发送到人员列表.因此,我只是尝试从地址创建一个MailAddress实例,而不是使用一些非常复杂的正则表达式.如果地址格式不正确,MailAddress构造函数将抛出异常.这样,我知道我至少可以把电子邮件打开门.当然这是服务器端验证,但至少你需要它.

protected void emailValidator_ServerValidate(object source, ServerValidateEventArgs args)
{
    try
    {
        var a = new MailAddress(txtEmail.Text);
    }
    catch (Exception ex)
    {
        args.IsValid = false;
        emailValidator.ErrorMessage = "email: " + ex.Message;
    }
}

  • 好点.即使此服务器验证拒绝某些有效地址,也不会出现问题,因为无论如何您都无法使用此特定服务器技术发送到此地址.或者您可以尝试使用您使用的任何第三方电子邮件库而不是默认工具执行相同的操作. (3认同)
  • 请注意:MailAddress类与RFC5322不匹配,如果您只是想将其用于验证(而不是发送,在这种情况下,如上所述,这是一个有争议的问题).请参阅:http://stackoverflow.com/questions/6023589/how-to-make-regex-work-on-corner-case-email-address/6023604#6023604 (2认同)

Rinke.. 108

Quick answer

Use the following regex for input validation:

([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?(\.[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?)+

Addresses matched by this regex:

  • have a local part (i.e. the part before the @-sign) that is strictly compliant with RFC 5321/5322,
  • have a domain part (i.e. the part after the @-sign) that is a host name with at least two labels, each of which is at most 63 characters long.

The second constraint is a restriction on RFC 5321/5322.

Elaborate answer

Using a regular expression that recognizes email addresses could be useful in various situations: for example to scan for email addresses in a document, to validate user input, or as an integrity constraint on a data repository.

It should however be noted that if you want to find out if the address actually refers to an existing mailbox, there's no substitute for sending a message to the address. If you only want to check if an address is grammatically correct then you could use a regular expression, but note that ""@[] is a grammatically correct email address that certainly doesn't refer to an existing mailbox.

The syntax of email addresses has been defined in various RFCs, most notably RFC 822 and RFC 5322. RFC 822 should be seen as the "original" standard and RFC 5322 as the latest standard. The syntax defined in RFC 822 is the most lenient and subsequent standards have restricted the syntax further and further, where newer systems or services should recognize obsolete syntax, but never produce it.

In this answer I’ll take "email address" to mean addr-spec as defined in the RFCs (i.e. jdoe@example.org, but not "John Doe"<jdoe@example.org>, nor some-group:jdoe@example.org,mrx@exampel.org;).

There's one problem with translating the RFC syntaxes into regexes: the syntaxes are not regular! This is because they allow for optional comments in email addresses that can be infinitely nested, while infinite nesting can't be described by a regular expression. To scan for or validate addresses containing comments you need a parser or more powerful expressions. (Note that languages like Perl have constructs to describe context free grammars in a regex-like way.) In this answer I'll disregard comments and only consider proper regular expressions.

RFC定义电子邮件消息的语法,而不是电子邮件地址的语法.地址可能出现在各种标题字段中,这是它们主要定义的位置.当它们出现在头字段中时,地址可能包含(在词法标记之间)空格,注释甚至换行符.从语义上讲,这没有任何意义.通过从地址中删除此空格等,您将获得语义上等效的规范表示.因此,规范表示first. last (comment) @ [3.5.7.9]first.last@[3.5.7.9].

Different syntaxes should be used for different purposes. If you want to scan for email addresses in a (possibly very old) document it may be a good idea to use the syntax as defined in RFC 822. On the other hand, if you want to validate user input you may want to use the syntax as defined in RFC 5322, probably only accepting canonical representations. You should decide which syntax applies to your specific case.

I use POSIX "extended" regular expressions in this answer, assuming an ASCII compatible character set.

RFC 822

I arrived at the following regular expression. I invite everyone to try and break it. If you find any false positives or false negatives, please post them in a comment and I'll try to fix the expression as soon as possible.

([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]))*(\\\r)*")(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]))*(\\\r)*"))*@([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]))*(\\\r)*])(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]))*(\\\r)*]))*

I believe it's fully complient with RFC 822 including the errata. It only recognizes email addresses in their canonical form. For a regex that recognizes (folding) whitespace see the derivation below.

The derivation shows how I arrived at the expression. I list all the relevant grammar rules from the RFC exactly as they appear, followed by the corresponding regex. Where an erratum has been published I give a separate expression for the corrected grammar rule (marked "erratum") and use the updated version as a subexpression in subsequent regular expressions.

As stated in paragraph 3.1.4. of RFC 822 optional linear white space may be inserted between lexical tokens. Where applicable I've expanded the expressions to accommodate this rule and marked the result with "opt-lwsp".

CHAR        =  <any ASCII character>
            =~ .

CTL         =  <any ASCII control character and DEL>
            =~ [\x00-\x1F\x7F]

CR          =  <ASCII CR, carriage return>
            =~ \r

LF          =  <ASCII LF, linefeed>
            =~ \n

SPACE       =  <ASCII SP, space>
            =~  

HTAB        =  <ASCII HT, horizontal-tab>
            =~ \t

<">         =  <ASCII quote mark>
            =~ "

CRLF        =  CR LF
            =~ \r\n

LWSP-char   =  SPACE / HTAB
            =~ [ \t]

linear-white-space =  1*([CRLF] LWSP-char)
                   =~ ((\r\n)?[ \t])+

specials    =  "(" / ")" / "<" / ">" / "@" /  "," / ";" / ":" / "\" / <"> /  "." / "[" / "]"
            =~ [][()<>@,;:\\".]

quoted-pair =  "\" CHAR
            =~ \\.

qtext       =  <any CHAR excepting <">, "\" & CR, and including linear-white-space>
            =~ [^"\\\r]|((\r\n)?[ \t])+

dtext       =  <any CHAR excluding "[", "]", "\" & CR, & including linear-white-space>
            =~ [^][\\\r]|((\r\n)?[ \t])+

quoted-string  =  <"> *(qtext|quoted-pair) <">
               =~ "([^"\\\r]|((\r\n)?[ \t])|\\.)*"
(erratum)      =~ "(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*"

domain-literal =  "[" *(dtext|quoted-pair) "]"
               =~ \[([^][\\\r]|((\r\n)?[ \t])|\\.)*]
(erratum)      =~ \[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]

atom        =  1*<any CHAR except specials, SPACE and CTLs>
            =~ [^][()<>@,;:\\". \x00-\x1F\x7F]+

word        =  atom / quoted-string
            =~ [^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*"

domain-ref  =  atom

sub-domain  =  domain-ref / domain-literal
            =~ [^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]

local-part  =  word *("." word)
            =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*")(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*"))*
(opt-lwsp)  =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*")(((\r\n)?[ \t])*\.((\r\n)?[ \t])*([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*"))*

domain      =  sub-domain *("." sub-domain)
            =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*])(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]))*
(opt-lwsp)  =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*])(((\r\n)?[ \t])*\.((\r\n)?[ \t])*([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]))*

addr-spec   =  local-part "@" domain
            =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*")(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*"))*@([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*])(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]))*
(opt-lwsp)  =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*")((\r\n)?[ \t])*(\.((\r\n)?[ \t])*([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*")((\r\n)?[ \t])*)*@((\r\n)?[ \t])*([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*])(((\r\n)?[ \t])*\.((\r\n)?[ \t])*([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]|(\r\n)?[ \t]))*(\\\r)*]))*
(canonical) =~ ([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]))*(\\\r)*")(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|"(\n|(\\\r)*([^"\\\r\n]|\\[^\r]))*(\\\r)*"))*@([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]))*(\\\r)*])(\.([^][()<>@,;:\\". \x00-\x1F\x7F]+|\[(\n|(\\\r)*([^][\\\r\n]|\\[^\r]))*(\\\r)*]))*

RFC 5322

I arrived at the following regular expression. I invite everyone to try and break it. If you find any false positives or false negatives, please post them in a comment and I'll try to fix the expression as soon as possible.

([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|\[[\t -Z^-~]*])

I believe it's fully complient with RFC 5322 including the errata. It only recognizes email addresses in their canonical form. For a regex that recognizes (folding) whitespace see the derivation below.

The derivation shows how I arrived at the expression. I list all the relevant grammar rules from the RFC exactly as they appear, followed by the corresponding regex. For rules that include semantically irrelevant (folding) whitespace, I give a separate regex marked "(normalized)" that doesn't accept this whitespace.

I ignored all the "obs-" rules from the RFC. This means that the regexes only match email addresses that are strictly RFC 5322 compliant. If you have to match "old" addresses (as the looser grammar including the "obs-" rules does), you can use one of the RFC 822 regexes from the previous paragraph.

VCHAR           =   %x21-7E
                =~  [!-~]

ALPHA           =   %x41-5A / %x61-7A
                =~  [A-Za-z]

DIGIT           =   %x30-39
                =~  [0-9]

HTAB            =   %x09
                =~  \t

CR              =   %x0D
                =~  \r

LF              =   %x0A
                =~  \n

SP              =   %x20
                =~  

DQUOTE          =   %x22
                =~  "

CRLF            =   CR LF
                =~  \r\n

WSP             =   SP / HTAB
                =~  [\t ]

quoted-pair     =   "\" (VCHAR / WSP)
                =~  \\[\t -~]

FWS             =   ([*WSP CRLF] 1*WSP)
                =~  ([\t ]*\r\n)?[\t ]+

ctext           =   %d33-39 / %d42-91 / %d93-126
                =~  []!-'*-[^-~]

("comment" is left out in the regex)
ccontent        =   ctext / quoted-pair / comment
                =~  []!-'*-[^-~]|(\\[\t -~])

(not regular)
comment         =   "(" *([FWS] ccontent) [FWS] ")"

(is equivalent to FWS when leaving out comments)
CFWS            =   (1*([FWS] comment) [FWS]) / FWS
                =~  ([\t ]*\r\n)?[\t ]+

atext           =   ALPHA / DIGIT / "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "/" / "=" / "?" / "^" / "_" / "`" / "{" / "|" / "}" / "~"
                =~  [-!#-'*+/-9=?A-Z^-~]

dot-atom-text   =   1*atext *("." 1*atext)
                =~  [-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*

dot-atom        =   [CFWS] dot-atom-text [CFWS]
                =~  (([\t ]*\r\n)?[\t ]+)?[-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*(([\t ]*\r\n)?[\t ]+)?
(normalized)    =~  [-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*

qtext           =   %d33 / %d35-91 / %d93-126
                =~  []!#-[^-~]

qcontent        =   qtext / quoted-pair
                =~  []!#-[^-~]|(\\[\t -~])

(erratum)
quoted-string   =   [CFWS] DQUOTE ((1*([FWS] qcontent) [FWS]) / FWS) DQUOTE [CFWS]
                =~  (([\t ]*\r\n)?[\t ]+)?"(((([\t ]*\r\n)?[\t ]+)?([]!#-[^-~]|(\\[\t -~])))+(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?)"(([\t ]*\r\n)?[\t ]+)?
(normalized)    =~  "([]!#-[^-~ \t]|(\\[\t -~]))+"

dtext           =   %d33-90 / %d94-126
                =~  [!-Z^-~]

domain-literal  =   [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS]
                =~  (([\t ]*\r\n)?[\t ]+)?\[((([\t ]*\r\n)?[\t ]+)?[!-Z^-~])*(([\t ]*\r\n)?[\t ]+)?](([\t ]*\r\n)?[\t ]+)?
(normalized)    =~  \[[\t -Z^-~]*]

local-part      =   dot-atom / quoted-string
                =~  (([\t ]*\r\n)?[\t ]+)?[-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?"(((([\t ]*\r\n)?[\t ]+)?([]!#-[^-~]|(\\[\t -~])))+(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?)"(([\t ]*\r\n)?[\t ]+)?
(normalized)    =~  [-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+"

domain          =   dot-atom / domain-literal
                =~  (([\t ]*\r\n)?[\t ]+)?[-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?\[((([\t ]*\r\n)?[\t ]+)?[!-Z^-~])*(([\t ]*\r\n)?[\t ]+)?](([\t ]*\r\n)?[\t ]+)?
(normalized)    =~  [-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|\[[\t -Z^-~]*]

addr-spec       =   local-part "@" domain
                =~  ((([\t ]*\r\n)?[\t ]+)?[-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?"(((([\t ]*\r\n)?[\t ]+)?([]!#-[^-~]|(\\[\t -~])))+(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?)"(([\t ]*\r\n)?[\t ]+)?)@((([\t ]*\r\n)?[\t ]+)?[-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*(([\t ]*\r\n)?[\t ]+)?|(([\t ]*\r\n)?[\t ]+)?\[((([\t ]*\r\n)?[\t ]+)?[!-Z^-~])*(([\t ]*\r\n)?[\t ]+)?](([\t ]*\r\n)?[\t ]+)?)
(normalized)    =~  ([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|\[[\t -Z^-~]*])

Note that some sources (notably w3c) claim that RFC 5322 is too strict on the local part (i.e. the part before the @-sign). This is because "..", "a..b" and "a." are not valid dot-atoms, while they may be used as mailbox names. The RFC, however, does allow for local parts like these, except that they have to be quoted. So instead of a..b@example.net you should write "a..b"@example.net, which is semantically equivalent.

Further restrictions

SMTP (as defined in RFC 5321) further restricts the set of valid email addresses (or actually: mailbox names). It seems reasonable to impose this stricter grammar, so that the matched email address can actually be used to send an email.

RFC 5321 basically leaves alone the "local" part (i.e. the part before the @-sign), but is stricter on the domain part (i.e. the part after the @-sign). It allows only host names in place of dot-atoms and address literals in place of domain literals.

The grammar presented in RFC 5321 is too lenient when it comes to both host names and IP addresses. I took the liberty of "correcting" the rules in question, using this draft and RFC 1034 as guidelines. Here's the resulting regex.

([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@([0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?(\.[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?)*|\[((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|IPv6:((((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){6}|::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){5}|[0-9A-Fa-f]{0,4}::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){4}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):)?(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){3}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,2}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){2}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,3}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,4}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,5}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,6}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)|(?!IPv6:)[0-9A-Za-z-]*[0-9A-Za-z]:[!-Z^-~]+)])

Note that depending on the use case you may not want to allow for a "General-address-literal" in your regex. Also note that I used a negative lookahead (?!IPv6:) in the final regex to prevent the "General-address-literal" part to match malformed IPv6 addresses. Some regex processors don't support negative lookahead. Remove the substring |(?!IPv6:)[0-9A-Za-z-]*[0-9A-Za-z]:[!-Z^-~]+ from the regex if you want to take the whole "General-address-literal" part out.

Here's the derivation:

Let-dig         =   ALPHA / DIGIT
                =~  [0-9A-Za-z]

Ldh-str         =   *( ALPHA / DIGIT / "-" ) Let-dig
                =~  [0-9A-Za-z-]*[0-9A-Za-z]

(regex is updated to make sure sub-domains are max. 63 charactes long - RFC 1034 section 3.5)
sub-domain      =   Let-dig [Ldh-str]
                =~  [0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?

Domain          =   sub-domain *("." sub-domain)
                =~  [0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?(\.[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?)*

Snum            =   1*3DIGIT
                =~  [0-9]{1,3}

(suggested replacement for "Snum")
ip4-octet       =   DIGIT / %x31-39 DIGIT / "1" 2DIGIT / "2" %x30-34 DIGIT / "25" %x30-35
                =~  25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]

IPv4-address-literal    =   Snum 3("."  Snum)
                        =~  [0-9]{1,3}(\.[0-9]{1,3}){3}

(suggested replacement for "IPv4-address-literal")
ip4-address     =   ip4-octet 3("." ip4-octet)
                =~  (25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}

(suggested replacement for "IPv6-hex")
ip6-h16         =   "0" / ( (%x49-57 / %x65-70 /%x97-102) 0*3(%x48-57 / %x65-70 /%x97-102) )
                =~  0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}

(not from RFC)
ls32            =   ip6-h16 ":" ip6-h16 / ip4-address
                =~  (0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}

(suggested replacement of "IPv6-addr")
ip6-address     =                                      6(ip6-h16 ":") ls32
                    /                             "::" 5(ip6-h16 ":") ls32
                    / [                 ip6-h16 ] "::" 4(ip6-h16 ":") ls32
                    / [ *1(ip6-h16 ":") ip6-h16 ] "::" 3(ip6-h16 ":") ls32
                    / [ *2(ip6-h16 ":") ip6-h16 ] "::" 2(ip6-h16 ":") ls32
                    / [ *3(ip6-h16 ":") ip6-h16 ] "::"   ip6-h16 ":"  ls32
                    / [ *4(ip6-h16 ":") ip6-h16 ] "::"                ls32
                    / [ *5(ip6-h16 ":") ip6-h16 ] "::"   ip6-h16
                    / [ *6(ip6-h16 ":") ip6-h16 ] "::"
                =~  (((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){6}|::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){5}|[0-9A-Fa-f]{0,4}::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){4}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):)?(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){3}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,2}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){2}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,3}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,4}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,5}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,6}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::

IPv6-address-literal    =   "IPv6:" ip6-address
                        =~  IPv6:((((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){6}|::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){5}|[0-9A-Fa-f]{0,4}::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){4}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):)?(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){3}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,2}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){2}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,3}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,4}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,5}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,6}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)

Standardized-tag        =   Ldh-str
                        =~  [0-9A-Za-z-]*[0-9A-Za-z]

dcontent        =   %d33-90 / %d94-126
                =~  [!-Z^-~]

General-address-literal =   Standardized-tag ":" 1*dcontent
                        =~  [0-9A-Za-z-]*[0-9A-Za-z]:[!-Z^-~]+

address-literal =   "[" ( IPv4-address-literal / IPv6-address-literal / General-address-literal ) "]"
                =~  \[((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|IPv6:((((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){6}|::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){5}|[0-9A-Fa-f]{0,4}::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){4}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):)?(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){3}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,2}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){2}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,3}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,4}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,5}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,6}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)|(?!IPv6:)[0-9A-Za-z-]*[0-9A-Za-z]:[!-Z^-~]+)]

Mailbox         =   Local-part "@" ( Domain / address-literal )
                =~  ([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@([0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?(\.[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?)*|\[((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|IPv6:((((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){6}|::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){5}|[0-9A-Fa-f]{0,4}::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){4}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):)?(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){3}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,2}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){2}|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,3}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,4}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,5}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3})|(((0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}):){0,6}(0|[1-9A-Fa-f][0-9A-Fa-f]{0,3}))?::)|(?!IPv6:)[0-9A-Za-z-]*[0-9A-Za-z]:[!-Z^-~]+)])

User input validation

A common use case is user input validation, for example on an html form. In that case it's usually reasonable to preclude address-literals and to require at least two labels in the hostname. Taking the improved RFC 5321 regex from the previous section as a basis, the resulting expression would be:

([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?(\.[0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?)+

I do not recommend restricting the local part further, e.g. by precluding quoted strings, since we don't know what kind of mailbox names some hosts allow (like "a..b"@example.net or even "a b"@example.net).

I also do not recommend explicitly validating against a list of literal top-level domains or even imposing length-constraints (remember how ".museum" invalidated [a-z]{2,4}), but if you must:

([-!#-'*+/-9=?A-Z^-~]+(\.[-!#-'*+/-9=?A-Z^-~]+)*|"([]!#-[^-~ \t]|(\\[\t -~]))+")@([0-9A-Za-z]([0-9A-Za-z-]{0,61}[0-9A-Za-z])?\.)*(net|org|com|info|etc...)

Make sure to keep your regex up-to-date if you decide to go down the path of explicit top-level domain validation.

Further considerations

When only accepting host names in the domain part (after the @-sign), the regexes above accept only labels with at most 63 characters, as they should. However, they don't enforce the fact that the entire host name must be at most 253 characters long (including the dots). Although this constraint is strictly speaking still regular, it's not feasible to make a regex that incorporates this rule.

Another consideration, especially when using the regexes for input validation, is feedback to the user. If a user enters an incorrect address, it would be nice to give a little more feedback than a simple "syntactically incorrect address". With "vanilla" regexes this is not possible.

These two considerations could be addressed by parsing the address. The extra length constraint on host names could in some cases also be addressed by using an extra regex that checks it, and matching the address against both expressions.

None of the regexes in this answer are optimized for performance. If performance is an issue, you should see if (and how) the regex of your choice can be optimized.

  • RFC [6532](https://tools.ietf.org/html/rfc6532)更新[5322](https://tools.ietf.org/html/rfc5322)以允许并包含完整,干净的UTF-8.其他详细信息[此处](http://stackoverflow.com/a/31066998/2350426). (3认同)

Draemon.. 73

在网上有很多这样的例子(我认为即使是完全验证RFC的一个例子 - 但如果内存服务的话,它会长达数十/数百行).人们往往会因为验证这种事而被带走.为什么不检查它有@和至少一个.并满足一些简单的最小长度.无论如何,输入一个假的电子邮件并仍然匹配任何有效的正则表达式是微不足道的.我猜想误报比误报更好.

  • 一个 .不需要.TLD可以有电子邮件地址,也可以有IPv6地址 (14认同)

DOK.. 63

在决定允许哪些字符时,请记住您的撇号和带连字符的朋友.我无法控制我的公司使用人力资源系统中的姓名生成我的电子邮件地址.这包括我姓氏的撇号.由于我的电子邮件地址"无效",我无法告诉您有多少次我被禁止与网站互动.

  • 这是程序中一个超常见的问题,它对人名中的内容和不允许的内容做出了无根据的假设.一个人不应该做出这样的假设,只要接受相关RFC所说的必须的任何字符. (4认同)
  • 是.我特别激怒了程序员拒绝电子邮件地址中的大写字母!愚蠢和/或懒惰. (4认同)

Evan Carroll.. 63

这个正则表达式来自Perl的Email :: Valid库.我相信它是最准确的,它匹配所有822.而且,它基于O'Reilly书中的正则表达式:

使用Jeffrey Friedl在Mastering Regular Expressions(http://www.ora.com/catalog/regexp/)中的示例构建的 正则表达式.

$RFC822PAT = <<'EOF';
[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\
xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xf
f\n\015()]*)*\)[\040\t]*)*(?:(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\x
ff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\015
"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\
xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80
-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*
)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\
\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\
x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x8
0-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n
\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x
80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^
\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040
\t]*)*)*@[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([
^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\
\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\
x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-
\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()
]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\
x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\04
0\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\
n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\
015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?!
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\
]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\
x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\01
5()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*|(?:[^(\040)<>@,;:".
\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]
)|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[^
()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]*(?:(?:\([^\\\x80-\xff\n\0
15()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][
^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)|"[^\\\x80-\xff\
n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[^()<>@,;:".\\\[\]\
x80-\xff\000-\010\012-\037]*)*<[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?
:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-
\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:@[\040\t]*
(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015
()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()
]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\0
40)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\
[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\
xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*
)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80
-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x
80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t
]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\
\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])
*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x
80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80
-\xff\n\015()]*)*\)[\040\t]*)*)*(?:,[\040\t]*(?:\([^\\\x80-\xff\n\015(
)]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\
\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*@[\040\t
]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\0
15()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015
()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(
\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|
\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80
-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()
]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x
80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^
\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040
\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".
\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff
])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\
\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x
80-\xff\n\015()]*)*\)[\040\t]*)*)*)*:[\040\t]*(?:\([^\\\x80-\xff\n\015
()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\
\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)?(?:[^
(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-
\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\
n\015"]*)*")[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|
\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))
[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff
\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\x
ff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(
?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\
000-\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\
xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\x
ff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)
*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*@[\040\t]*(?:\([^\\\x80-\x
ff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-
\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)
*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\
]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\]
)[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-
\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\x
ff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(
?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80
-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<
>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x8
0-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:
\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]
*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)
*\)[\040\t]*)*)*>)
EOF

  • O_O你还需要成为一名正则表达式大师才能理解它在做什么 (14认同)

SimonSimCity.. 45

当您使用PHP编写时,我建议您使用PHP内置验证来处理电子邮件.

filter_var($value, FILTER_VALIDATE_EMAIL)

如果你运行的是低于5.3.6的php版本,请注意这个问题:https://bugs.php.net/bug.php?id = 53091

如果您想了解有关此buid-in验证如何工作的更多信息,请参阅此处:PHP的filter_var FILTER_VALIDATE_EMAIL是否真的有效?


小智.. 43

Cal Henderson(Flickr)写了一篇名为Parsing Email Adresses in PHP的文章,展示了如何进行适当的RFC(2)822兼容的电子邮件地址解析.您还可以获得获得cc许可的php,python和ruby 的源代码.


Kon.. 42

我从不打扰使用我自己的正则表达式创建,因为很可能其他人已经提出了更好的版本.我总是使用regexlib找到一个我喜欢的.


PhiLho.. 37

没有一个真正可用的.
我在回答中讨论了一些问题是否有用于电子邮件地址验证的php库?,在Regexp中对电子邮件地址的识别很难讨论吗?

简而言之,不要指望单个可用的正则表达式做正确的工作.最好的正则表达式将验证语法,而不是电子邮件的有效性(jhohn@example.com是正确的,但它可能会反弹......).


spig.. 35

一个简单的正则表达式,至少不会拒绝任何有效的电子邮件地址,将检查一些东西,然后是@符号,然后是一个句点,后跟一个句点和至少2个东西.它不会拒绝任何内容,但在查看规范后,我找不到任何有效和拒绝的电子邮件.

电子邮件=〜 /.+@[^@]+\.[^@]{2,}$/

  • 这就是我想要的.不是很严格,但确保只有1 @(因为我们正在解析列表并希望确保没有丢失的逗号).仅供参考,你可以在左边有一个@,如果它在引号中:[Valid_email_addresses](http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses),但它很漂亮. (3认同)
  • 使用它之后,意识到它没有完全正常工作.`/ ^ [^ @] + @ [^ @] + \.[^ @] {2} [^ @]*$ /`实际检查1 @符号.由于最后的.*,你的正则表达式会让多个通过. (2认同)

chaos.. 29

您可以使用jQuery Validation插件使用的插件:

/^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?$/i


Eric Schoono.. 25

有关验证电子邮件地址的最佳正则表达式的最全面评估,请参阅此链接; " 比较电子邮件地址验证正则表达式 "

以下是当前的顶级表达式,供参考:

/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i


BalusC.. 23

更不用说在不久的将来允许非拉丁语(中文,阿拉伯语,希腊语,希伯来语,西里尔语等)域名.每个人都必须更改所使用的电子邮件正则表达式,因为这些字符肯定不会被[a-z]/i也包括在内\w.他们都会失败.

毕竟,验证电子邮件地址的最佳方法仍然是实际相关地址发送电子邮件以验证地址.如果电子邮件地址是用户身份验证(注册/登录/等)的一部分,那么您可以将其与用户激活系统完美地结合在一起.即发送一封电子邮件,其中包含指向电子邮件地址的唯一激活密钥的链接,并且仅当用户使用电子邮件中的链接激活新创建的帐户时才允许登录.

如果正则表达式的目的只是在UI中快速通知用户指定的电子邮件地址看起来不是正确的格式,那么最好仍然检查它是否基本匹配以下正则表达式:

^([^.@]+)(\.[^.@]+)*@([^.@]+\.)+([^.@]+)$

就那么简单.为什么你会关心名称和域中使用的字符?客户有责任输入有效的电子邮件地址,而不是服务器的电子邮件地址.即使客户端输入语法上有效的电子邮件地址aa@bb.cc,也不能保证它是合法的电子邮件地址.没有一个正则表达式可以涵盖这一点.

  • 我同意发送身份验证消息通常是这类内容的最佳方式,语法正确且有效并不相同.当我为"确认"两次输入我的电子邮件地址时,我感到很沮丧,好像我无法查看我输入的内容.我只将第一个复制到第二个,它似乎越来越多地被使用. (4认同)

Luna.. 18

HTML5规范提出用于验证电子邮件地址的简单的regex:

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

这故意不符合RFC 5322.

注:此要求是故意违反RFC 5322,它定义了电子邮件地址的语法,同时过于严格(在之前@的字符),太模糊了(后@字符),以及过于宽松(允许注释,空白字符,并且以大多数用户不熟悉的方式引用字符串)在这里具有实际用途.

根据RFC 3696勘误表1690,总长度也可以限制为254个字符.

  • 这不是最好的答案!这种模式匹配这个完全无效的地址:`invalid @ emailaddress`.在使用它之前,我会敦促谨慎和大量测试! (3认同)

Greg Bacon.. 15

为了进行生动的演示,以下怪物相当不错,但仍无法正确识别所有语法上有效的电子邮件地址:它可以识别最多四层深度的嵌套注释.

这是解析器的工作,但即使地址在语法上有效,它仍然可能无法交付.有时候你不得不求助于"嘿,你们所有人,看着我们!"的乡巴佬方法.

// derivative of work with the following copyright and license:
// Copyright (c) 2004 Casey West.  All rights reserved.
// This module is free software; you can redistribute it and/or
// modify it under the same terms as Perl itself.

// see http://search.cpan.org/~cwest/Email-Address-1.80/

private static string gibberish = @"
(?-xism:(?:(?-xism:(?-xism:(?-xism:(?-xism:(?-xism:(?-xism:\
s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^
\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))
|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+
|\s+)*[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]+(?-xism:(?-xism:\
s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^
\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))
|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+
|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(
?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?
:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x
0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*<DQ>(?-xism:(?-xism:[
^\\<DQ>])|(?-xism:\\(?-xism:[^\x0A\x0D])))+<DQ>(?-xism:(?-xi
sm:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xis
m:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\
]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\
s*)+|\s+)*))+)?(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?
-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:
\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[
^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*<(?-xism:(?-xi
sm:(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^(
)\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(
?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))
|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*(?-xism:[^\x00-\x1F\x7F()<
>\[\]:;@\,.<DQ>\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]
+)*)(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))
|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:
(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s
*\)\s*))+)*\s*\)\s*)+|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?
:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x
0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xi
sm:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*
<DQ>(?-xism:(?-xism:[^\\<DQ>])|(?-xism:\\(?-xism:[^\x0A\x0D]
)))+<DQ>(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\
]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-x
ism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+
)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*))\@(?-xism:(?-xism:(?-xism:(
?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?
-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^
()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s
*\)\s*)+|\s+)*(?-xism:[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]+(
?:\.[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]+)*)(?-xism:(?-xism:
\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[
^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+)
)|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)
+|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:
(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((
?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\
x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*\[(?:\s*(?-xism:(?-x
ism:[^\[\]\\])|(?-xism:\\(?-xism:[^\x0A\x0D])))+)*\s*\](?-xi
sm:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:
\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(
?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+
)*\s*\)\s*)+|\s+)*)))>(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-
xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\
s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^
\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*))|(?-xism:(?-x
ism:(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^
()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*
(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D])
)|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*(?-xism:[^\x00-\x1F\x7F()
<>\[\]:;@\,.<DQ>\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s
]+)*)(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+)
)|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism
:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\
s*\)\s*))+)*\s*\)\s*)+|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((
?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\
x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-x
ism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)
*<DQ>(?-xism:(?-xism:[^\\<DQ>])|(?-xism:\\(?-xism:[^\x0A\x0D
])))+<DQ>(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\
\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-
xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)
+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*))\@(?-xism:(?-xism:(?-xism:
(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(
?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[
^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\
s*\)\s*)+|\s+)*(?-xism:[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]+
(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\,.<DQ>\s]+)*)(?-xism:(?-xism
:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:
[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+
))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*
)+|\s+)*)|(?-xism:(?-xism:(?-xism:\s*\((?:\s*(?-xism:(?-xism
:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\(
(?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A
\x0D]))|)+)*\s*\)\s*))+)*\s*\)\s*)+|\s+)*\[(?:\s*(?-xism:(?-
xism:[^\[\]\\])|(?-xism:\\(?-xism:[^\x0A\x0D])))+)*\s*\](?-x
ism:(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism
:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:\s*(?-xism:(?-xism:
(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|)+)*\s*\)\s*))
+)*\s*\)\s*)+|\s+)*))))(?-xism:\s*\((?:\s*(?-xism:(?-xism:(?
>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0D]))|(?-xism:\s*\((?:
\s*(?-xism:(?-xism:(?>[^()\\]+))|(?-xism:\\(?-xism:[^\x0A\x0
D]))|)+)*\s*\)\s*))+)*\s*\)\s*)*)"
  .Replace("<DQ>", "\"")
  .Replace("\t", "")
  .Replace(" ", "")
  .Replace("\r", "")
  .Replace("\n", "");

private static Regex mailbox =
  new Regex(gibberish, RegexOptions.ExplicitCapture); 


AZ_.. 12

根据官方标准RFC 2822,有效的电子邮件正则表达式是

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

如果你想在Java中使用它真的很容易

import java.util.regex.*;

class regexSample 
{
   public static void main(String args[]) 
   {
      //Input the string for validation
      String email = "xyz@hotmail.com";

      //Set the email pattern string
      Pattern p = Pattern.compile(" (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"
              +"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")"
                     + "@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\]");

      //Match the given string with the pattern
      Matcher m = p.matcher(email);

      //check whether match is found 
      boolean matchFound = m.matches();

      if (matchFound)
        System.out.println("Valid Email Id.");
      else
        System.out.println("Invalid Email Id.");
   }
}


MichaelRusht.. 11

RFC 5322标准:

允许点原子本地部分,引用字符串本地部分,过时(混合点原子和引用字符串)本地部分,域名域,(IPv4,IPv6和IPv4映射IPv6地址)域文字域,和(嵌套)CFWS.

'/^(?!(?>(?1)"?(?>\\\[ -~]|[^"])"?(?1)){255,})(?!(?>(?1)"?(?>\\\[ -~]|[^"])"?(?1)){65,}@)((?>(?>(?>((?>(?>(?>\x0D\x0A)?[\t ])+|(?>[\t ]*\x0D\x0A)?[\t ]+)?)(\((?>(?2)(?>[\x01-\x08\x0B\x0C\x0E-\'*-\[\]-\x7F]|\\\[\x00-\x7F]|(?3)))*(?2)\)))+(?2))|(?2))?)([!#-\'*+\/-9=?^-~-]+|"(?>(?2)(?>[\x01-\x08\x0B\x0C\x0E-!#-\[\]-\x7F]|\\\[\x00-\x7F]))*(?2)")(?>(?1)\.(?1)(?4))*(?1)@(?!(?1)[a-z0-9-]{64,})(?1)(?>([a-z0-9](?>[a-z0-9-]*[a-z0-9])?)(?>(?1)\.(?!(?1)[a-z0-9-]{64,})(?1)(?5)){0,126}|\[(?:(?>IPv6:(?>([a-f0-9]{1,4})(?>:(?6)){7}|(?!(?:.*[a-f0-9][:\]]){8,})((?6)(?>:(?6)){0,6})?::(?7)?))|(?>(?>IPv6:(?>(?6)(?>:(?6)){5}:|(?!(?:.*[a-f0-9]:){6,})(?8)?::(?>((?6)(?>:(?6)){0,4}):)?))?(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(?>\.(?9)){3}))\])(?1)$/isD'

RFC 5321标准:

允许点原子本地部分,带引号的字符串本地部分,域名域和(IPv4,IPv6和IPv4映射的IPv6地址)域文字域.

'/^(?!(?>"?(?>\\\[ -~]|[^"])"?){255,})(?!"?(?>\\\[ -~]|[^"]){65,}"?@)(?>([!#-\'*+\/-9=?^-~-]+)(?>\.(?1))*|"(?>[ !#-\[\]-~]|\\\[ -~])*")@(?!.*[^.]{64,})(?>([a-z0-9](?>[a-z0-9-]*[a-z0-9])?)(?>\.(?2)){0,126}|\[(?:(?>IPv6:(?>([a-f0-9]{1,4})(?>:(?3)){7}|(?!(?:.*[a-f0-9][:\]]){8,})((?3)(?>:(?3)){0,6})?::(?4)?))|(?>(?>IPv6:(?>(?3)(?>:(?3)){5}:|(?!(?:.*[a-f0-9]:){6,})(?5)?::(?>((?3)(?>:(?3)){0,4}):)?))?(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(?>\.(?6)){3}))\])$/iD'

基础:

允许点原子本地部分和域名域(要求至少两个域名标签,其TLD限制为2-6个字母字符).

"/^(?!.{255,})(?!.{65,}@)([!#-'*+\/-9=?^-~-]+)(?>\.(?1))*@(?!.*[^.]{64,})(?>[a-z0-9](?>[a-z0-9-]*[a-z0-9])?\.){1,126}[a-z]{2,6}$/iD"


Mac.. 11

这是我使用的PHP.我选择这个解决方案的精神是"假阳性比假阴性更好",这是另一位评论者在这里宣布的,关于保持你的响应时间和服务器负载下降...真的没有必要浪费服务器资源一个正则表达式,这将清除最简单的用户错误.如果需要,您可以随时发送测试电子邮件.

function validateEmail($email) {
  return (bool) stripos($email,'@');
}


小智.. 9

奇怪的是你"不能"允许4个字符的TLD.您禁止使用.info.name,以及长度限制停止.travel.museum,但是,它们不常见于2个字符的TLD和3个字符的TLD.

你也应该允许使用大写字母.电子邮件系统将规范化本地部分和域部分.

对于域名部分的正则表达式,域名不能以" - "开头,也不能以" - "结尾.短跑只能停留在两者之间.

如果您使用了PEAR库,请查看他们的邮件功能(忘记确切的名称/库).您可以通过调用一个函数来验证电子邮件地址,并根据RFC822中的定义验证电子邮件地址.

  • @Joseph Yee:RFC 822有点过时吗? (2认同)

小智.. 8

public bool ValidateEmail(string sEmail)
{
    if (sEmail == null)
    {
        return false;
    }

    int nFirstAT = sEmail.IndexOf('@');
    int nLastAT = sEmail.LastIndexOf('@');

    if ((nFirstAT > 0) && (nLastAT == nFirstAT) && (nFirstAT < (sEmail.Length - 1)))
    {
        return (Regex.IsMatch(sEmail, @"^[a-z|0-9|A-Z]*([_][a-z|0-9|A-Z]+)*([.][a-z|0-9|A-Z]+)*([.][a-z|0-9|A-Z]+)*(([_][a-z|0-9|A-Z]+)*)?@[a-z][a-z|0-9|A-Z]*\.([a-z][a-z|0-9|A-Z]*(\.[a-z][a-z|0-9|A-Z]*)?)$"));
    }
    else
    {
        return false;
    }
}


Prasad.. 8

如果您接受空值(这不是无效的电子邮件)并运行PHP 5.2+,我会建议:

static public function checkEmail($email, $ignore_empty = false) {
        if($ignore_empty && (is_null($email) || $email == ''))
                return true;
        return filter_var($email, FILTER_VALIDATE_EMAIL);
    }


TombMedia.. 7

我一直在使用你的正则表达式的这个修饰版本,它并没有给我留下太多的惊喜.我从未在电子邮件中遇到过撇号,因此它不会对此进行验证.它确实对这些非字母数字字符进行了验证Jean+François@anydomain.museum,?@??.??.????.???????但并非奇怪滥用.+@you.com.

(?!^[.+&'_-]*@.*$)(^[_\w\d+&'-]+(\.[_\w\d+&'-]*)*@[\w\d-]+(\.[\w\d-]+)*\.(([\d]{1,3})|([\w]{2,}))$)

它确实支持IP地址,you@192.168.1.1但我还没有完善它以处理伪造的IP范围,例如999.999.999.1.

它还支持超过3个字符的所有顶级域名asdf@asdf.asdf,我认为原始版本会停止. 我被打败了,现在有太多的tlds超过3个字符.

我知道acrosman已经放弃了他的正则表达,但这种味道依然存在.


Dimitris And.. 6

我不相信上面的bortzmeyer声称"语法(在RFC 5322中指定)太复杂了"(由正则表达式处理).

这是语法:(来自http://tools.ietf.org/html/rfc5322#section-3.4.1)

addr-spec       =   local-part "@" domain
local-part      =   dot-atom / quoted-string / obs-local-part
domain          =   dot-atom / domain-literal / obs-domain
domain-literal  =   [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS]
dtext           =   %d33-90 /          ; Printable US-ASCII
                    %d94-126 /         ;  characters not including
                    obs-dtext          ;  "[", "]", or "\"

假设dot-atom,quoted-string,obs-local-part,obs-domain本身就是常规语言,这是一个非常简单的语法.只需将addr-spec生产中的本地部分和域替换为各自的作品,您就可以使用常规语言直接翻译成正则表达式.

  • 在开始做出假设之前,你应该调查CFWS.这是一场噩梦. (4认同)

auco.. 6

我知道这个问题是关于RegEx的,但是猜测所有阅读这些解决方案的开发人员中有90%都试图在浏览器中显示的HTML表单中验证电子邮件地址.

如果是这种情况,我建议您查看新的HTML5 <input type="email">表单元素:

HTML5:

 <input type="email" required />

CSS3:

 input:required {
      background-color: rgba(255,0,0,0.2);
 }

 input:focus:invalid { 
     box-shadow: 0 0 1em red;
     border-color: red;
 }

 input:focus:valid { 
     box-shadow: 0 0 1em green;
     border-color: green;
 }

http://jsfiddle.net/mYRe7/1

这有几个好处:

  1. 自动验证,无需自定义解决方案:简单易用
  2. 如果JS被禁用,没有JavaScript,没有问题
  3. 没有服务器必须为此计算任何东西
  4. 用户可以立即获得反馈
  5. 旧浏览器应自动回退到输入类型"文本"
  6. 移动浏览器可以显示专用键盘(@ -Keyboard)
  7. 使用CSS3,表单验证反馈非常简单

显而易见的缺点可能是缺少旧浏览器的验证,但这会随着时间的推移而改变.我更喜欢这个疯狂的RegEx杰作.

另见:


Coder12345.. 6

我使用多步验证.由于没有完美的方法来验证电子邮件地址,完美的无法制作,但至少你可以通知用户他/她做错了什么 - 这是我的方法

1)我首先使用非常基本的正则表达式进行验证,该正则表达式仅检查电子邮件是否包含一个@符号,并且在该符号之前或之后不是空白.例如/^[^@\s]+@[^@\s]+$/

2a)如果第一个验证器没有通过(并且对于大多数地址它应该虽然不完美),然后警告用户电子邮件无效并且不允许他/她继续输入

2b)如果通过,则验证更严格的正则表达式 - 这可能会禁止有效的电子邮件.如果未通过,则会向用户发出可能的错误警告,但允许继续.与步骤(1)不同,不允许用户继续使用,因为这是一个明显的错误.

换句话说,第一个自由验证只是为了消除明显的错误,它被视为"错误".人们键入一个空白地址,没有@符号的地址,依此类推.这应该被视为错误.第二个更严格但被视为"警告",并且允许用户继续输入,但警告至少检查他/她是否输入了有效条目.这里的关键是错误/警告方法 - 错误是99.99%的情况下不能有效的电子邮件.

当然,你可以调整是什么让第一个正则表达式更自由,第二个更严格.

根据您的需要,上述方法可能适合您.


cbp.. 5

几年来,我们已经成功地使用了http://www.aspnetmx.com/。您可以选择要验证的级别(例如,语法检查,检查域,MX记录或实际电子邮件)。

对于前端表单,我们通常会验证域是否存在并且语法正确,然后在执行批量邮寄之前进行更严格的验证以清理数据库。


Nazmul Hasan.. 5

这是电子邮件的正则表达之一

^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?$


Suhaib Janju.. 5

我总是使用下面的正则表达式来验证电子邮件地址.这是我见过的验证电子邮件地址的最佳正则表达式.

"\A(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)\Z";

我总是在我的Asp.NET代码中使用这个正则表达式,我对它非常满意.

使用此程序集引用

using System.Text.RegularExpressions;

并尝试以下代码,因为它很简单,并为您完成工作.

private bool IsValidEmail(string email) {
    bool isValid = false;
    const string pattern = @"\A(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)\Z";

    isValid = email != "" && Regex.IsMatch(email, pattern);

    // an alternative of the above line is also given and commented
    //
    //if (email == "") {
    //    isValid = false;
    //} else {
    //    // address provided so use the IsMatch Method
    //    // of the Regular Expression object
    //    isValid = Regex.IsMatch(email, pattern);
    //}
    return isValid;
}

此函数验证电子邮件字符串.如果电子邮件字符串为null,则返回false,如果电子邮件字符串格式不正确,则返回false.如果电子邮件的格式有效,则仅返回true.

  • 它适用于标准字符的标准电子邮件服务器.如果是非英语语言,则必须制作自己的定制ReGex. (3认同)

FlameStorm.. 5

对我来说,检查电子邮件的正确方法是:

  1. 检查符号@是否存在,并且在其前后有一些非@符号: /^[^@]+@[^@]+$/
  2. 尝试通过一些“激活码”向该地址发送电子邮件。
  3. 当用户“激活”他的电子邮件地址时,我们将看到一切正确。

当然,当用户键入“奇怪的”电子邮件时,您可以在前端显示一些警告或工具提示,以帮助他避免常见的错误,例如域部分中没有点或名称中没有引号的空格等。但是,如果用户确实需要,您必须接受地址“ hello @ world”。

另外,您还必须记住,电子邮件地址标准曾经而且可能会不断发展,因此您不能一劳永逸地键入某些“标准有效”的正则表达式。您必须记住,某些具体的Internet服务器可能会使通用标准的某些细节失效,并且实际上可以使用自己的“修改后的标准”。

因此,只需选中@,在前端提示用户并在给定地址上发送验证电子邮件。


Simon_Weaver.. 5

几乎所有我见过的RegEx - 包括微软使用的一些RegEx都不允许以下有效的电子邮件通过:simon-@hotmail.com

只是有一个真正的客户,这个格式的电子邮件地址无法下订单.

这就是我所确定的:

  • 一个没有假阴性的最小正则表达式.或者使用MailAddress构造函数和一些额外的检查(见下文):
  • 检查常见错别字.cmo.gmial.com要求确认Are you sure this is your correct email address. It looks like there may be a mistake.允许用户接受他们确定的输入内容.
  • 在实际发送电子邮件时处理退回并手动验证它们以检查明显的错误.

        try
        {
            var email = new MailAddress(str);

            if (email.Host.EndsWith(".cmo"))
            {
                return EmailValidation.PossibleTypo;
            }

            if (!email.Host.EndsWith(".") && email.Host.Contains("."))
            {
                return EmailValidation.OK;
            }
        }
        catch
        {
            return EmailValidation.Invalid;
        }


归档时间:

查看次数:

1164395 次

最近记录:

9 月 前