Ewo*_*ugz 0 python regex string
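I'm going to collect e-mail addresses scattered through a larger CSV file, and I'm just now learning regular expressions. I'm trying to extract the e-mails from this example sentence, but the emails list ends up containing only the @ symbol and the single character right before it. Can you help me see what is going wrong?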
import re
String = "'Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"
emails = re.findall(r'.[@]', String)
names = re.findall(r'[A-Z][a-z]*',String)
print(emails)
print(names)
Your regex for e-mails cannot work at all: emails = re.findall(r'.[@]', String) matches any single character followed by @, and nothing more.
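To make the problem concrete, here is a minimal sketch that runs the original pattern next to a version with quantifiers. The shortened `text` variable and the `\w+@\w+\.\w+` pattern are my own simplifications for illustration; the pattern only handles plain addresses like the ones in the question.

```python
import re

# Shortened sample sentence (an assumption, trimmed from the question's text).
text = "Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com."

# Original pattern: exactly one arbitrary character, then "@".
one_char = re.findall(r'.[@]', text)
print(one_char)  # ['a@', '3@']

# With "+" quantifiers the match can extend on both sides of "@".
emails = re.findall(r'\w+@\w+\.\w+', text)
print(emails)  # ['jessica@gmail.com', 'daniel123@gmail.com']
```

The character class around @ is not needed either: a bare @ has no special meaning in a regex.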
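I'd try a different approach: match the sentences and extract (name, e-mail) pairs, relying on the following empirical assumptions (if your text changes too much, the logic breaks):
- every name is followed by 's, with is appearing somewhere after it (a non-greedy .*? matches whatever lies in between)
- \w matches any alphanumeric character (or underscore), and the domain is allowed exactly one dot (otherwise it would also match the final dot of the sentence)
Code: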
import re
String = "'Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"
print(re.findall(r"(\w+)'s.*? is (\w+@\w+\.\w+)", String))
Result:
[('Jessica', 'jessica@gmail.com'), ('Daniel', 'daniel123@gmail.com'), ('Edward', 'edwardfountain@gmail.com'), ('Oscar', 'odawg@gmail.com')]
Passing that list to dict would even give you a name => address dictionary:
{'Oscar': 'odawg@gmail.com', 'Jessica': 'jessica@gmail.com', 'Daniel': 'daniel123@gmail.com', 'Edward': 'edwardfountain@gmail.com'}
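As a sketch of that conversion: dict accepts the list of 2-tuples returned by re.findall directly. The variable names `pairs` and `name_to_email` are mine, chosen for illustration.

```python
import re

String = "'Jessica's email is jessica@gmail.com, and Daniel's email is daniel123@gmail.com. Edward's is edwardfountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"

# findall returns one (name, email) tuple per match because the pattern has two groups.
pairs = re.findall(r"(\w+)'s.*? is (\w+@\w+\.\w+)", String)

# dict() treats each tuple as a key/value pair.
name_to_email = dict(pairs)
print(name_to_email["Oscar"])  # odawg@gmail.com
```

Note that if the same name appears twice, the later address silently replaces the earlier one.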
The general case needs more characters (I'm not sure this is exhaustive):
String = "'Jessica's email is jessica_123@gmail.com, and Daniel's email is daniel-123@gmail.com. Edward's is edward.fountain@gmail.com, and his grandfather, Oscar's, is odawg@gmail.com.'"
print(re.findall(r"(\w+)'s.*? is ([\w\-.]+@[\w\-.]+\.[\w\-]+)", String))
Result:
[('Jessica', 'jessica_123@gmail.com'), ('Daniel', 'daniel-123@gmail.com'), ('Edward', 'edward.fountain@gmail.com'), ('Oscar', 'odawg@gmail.com')]
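Since the real goal is to pull scattered addresses out of a CSV file, here is a rough sketch of how the e-mail part of the pattern could be applied cell by cell. The CSV content, the column names and the `EMAIL_RE` name are all made up for illustration; with a real file you would pass an opened file object to csv.reader instead of the in-memory buffer.

```python
import csv
import io
import re

# E-mail part of the broader pattern above, without the name capture.
EMAIL_RE = re.compile(r"[\w\-.]+@[\w\-.]+\.[\w\-]+")

# Hypothetical CSV content standing in for the real file.
sample_csv = io.StringIO(
    "id,note\n"
    "1,Contact jessica_123@gmail.com for details\n"
    "2,No address here\n"
    '3,"daniel-123@gmail.com, edward.fountain@gmail.com"\n'
)

emails = []
for row in csv.reader(sample_csv):
    for cell in row:
        # findall returns every address in the cell, or an empty list.
        emails.extend(EMAIL_RE.findall(cell))

print(emails)
# ['jessica_123@gmail.com', 'daniel-123@gmail.com', 'edward.fountain@gmail.com']
```

This pattern is still only an approximation of what a valid address looks like, so expect to adjust it to your data.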