Capturing emails with regex in Python

Question

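I'm going to collect scattered emails from a larger CSV file, and I'm just now learning regex. I'm trying to extract the emails from this example sentence. However, emails ends up containing only the @ symbol and the single character immediately before it. Can you help me see what is going wrong?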

import re

String = "'Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"

emails = re.findall(r'.[@]', String)
names = re.findall(r'[A-Z][a-z]*',String)

print(emails)
print(names)

Answer 1

Jea*_*bre 5

Your email regex cannot work as written: emails = re.findall(r'.[@]', String) matches any single character followed by @, and nothing more.
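To make the failure concrete, here is a minimal sketch that runs the original pattern next to a fuller one. The variable name text and the pattern \w+@\w+\.\w+ are illustrative choices rather than part of the question, and the pattern assumes addresses made only of word characters with a single dot in the domain.

```python
import re

# Sample sentence from the question (variable renamed to `text` for this sketch).
text = ("'Jessica's email is jessica@gmail.com, and Daniel's email is "
        "daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his "
        "grandfather, Oscar's, is odawg@gmail.com.'")

# Original pattern: exactly one arbitrary character, then a literal '@'.
broken = re.findall(r'.[@]', text)

# Illustrative alternative: word chars, '@', word chars, one dot, word chars.
fixed = re.findall(r'\w+@\w+\.\w+', text)

print(broken)
print(fixed)
```

The first list holds only two-character fragments such as 'a@', which is the symptom described in the question; the second holds the four full addresses.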

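I'd try a different approach: match the sentence and extract each name together with its email, under the following empirical assumptions (if your text changes too much, the logic will break):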

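every name is followed by 's and, somewhere after that, by is (a non-greedy .*? matches whatever lies in between)
\w matches any alphanumeric character (or underscore), and the domain is matched with exactly one dot (otherwise it would also match the final dot of the sentence)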
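These two assumptions can be mapped onto a pattern piece by piece. The following is only a sketch: the name PAIR_PATTERN, the shortened sample text, and the use of re.VERBOSE are choices made here for readability, not something the answer prescribes.

```python
import re

# Hypothetical name; re.VERBOSE lets the pattern carry inline comments.
PAIR_PATTERN = re.compile(r"""
    (\w+)'s          # group 1: the name, immediately followed by 's
    .*?              # non-greedy: as few characters as possible in between
    [ ]is[ ]         # the word is with a space on each side (spaces must be explicit here)
    (\w+@\w+\.\w+)   # group 2: word chars, @, word chars, exactly one dot, word chars
""", re.VERBOSE)

# Shortened sample, taken from the second half of the question's sentence.
text = "Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com."

pairs = PAIR_PATTERN.findall(text)
print(pairs)
```

Because the pattern has two capture groups, findall returns a list of (name, email) tuples instead of whole matches.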

Code:

import re

String = "'Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"

# Raw string so that \w and \. reach the regex engine without invalid-escape warnings
print(re.findall(r"(\w+)'s.*? is (\w+@\w+\.\w+)", String))

Result:

[('Jessica', 'jessica@gmail.com'), ('Daniel', 'daniel123@gmail.com'), ('Edward', 'edwardfountain@gmail.com'), ('Oscar', 'odawg@gmail.com')]

Converting the result to a dict would even give you a dictionary mapping name => address:

{'Oscar': 'odawg@gmail.com', 'Jessica': 'jessica@gmail.com', 'Daniel': 'daniel123@gmail.com', 'Edward': 'edwardfountain@gmail.com'}
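The conversion itself is not shown above; a minimal sketch, assuming the list of (name, email) tuples comes straight from re.findall, could look like this. The names text, pairs, and address_by_name are illustrative.

```python
import re

text = ("'Jessica's email is jessica@gmail.com, and Daniel's email is "
        "daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his "
        "grandfather, Oscar's, is odawg@gmail.com.'")

# findall with two groups yields (name, email) tuples.
pairs = re.findall(r"(\w+)'s.*? is (\w+@\w+\.\w+)", text)

# dict() accepts an iterable of (key, value) pairs.
# If a name occurred twice, only its last address would be kept.
address_by_name = dict(pairs)

print(address_by_name)
```

If the same name can appear with several addresses in your data, a plain dict loses all but the last one, so keeping the list of tuples may be the safer choice.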

The general case needs more characters in the pattern (I'm not sure this is exhaustive):

String = "'Jessica's email is jessica_123@gmail.com, and Daniel's email is daniel-123@gmail.com. Edward's is edward.fountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"

# Character classes now also allow '-' and '.', so dotted and hyphenated addresses match
print(re.findall(r"(\w+)'s.*? is ([\w\-.]+@[\w\-.]+\.[\w\-]+)", String))

Result:

[('Jessica', 'jessica_123@gmail.com'), ('Daniel', 'daniel-123@gmail.com'), ('Edward', 'edward.fountain@gmail.com'), ('Oscar', 'odawg@gmail.com')]
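As a hedged illustration of why this pattern may still not be exhaustive, the sketch below feeds the email part of it an address containing +, which is legal in the local part but not in the character class. The sample text and the extended pattern are assumptions made for this example only.

```python
import re

# Email part of the broader pattern, without the name capture.
pattern = r"[\w\-.]+@[\w\-.]+\.[\w\-]+"

# Made-up sample: one address with '+', one with a subdomain and a trailing sentence dot.
text = "Write to jess+news@gmail.com or o.dawg@mail.example.org."

found = re.findall(pattern, text)
print(found)

# Hypothetical extension: also allow '+' in the local part.
extended = r"[\w\-.+]+@[\w\-.]+\.[\w\-]+"
found_extended = re.findall(extended, text)
print(found_extended)
```

The first call truncates the plus address to news@gmail.com, while the extended class keeps it whole. In both cases the subdomain is captured and the sentence-final dot is left out, because the pattern requires at least one character after the last dot it consumes.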
