Using Python's regex .match() method to get the strings before and after an underscore

Question


Eri*_*and 4 python regex string iterator

I have the following code:

import re

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]

for table in tablesInDataset:
    tableregex = re.compile(r"\d{8}")
    tablespec = re.match(tableregex, table)

    everythingbeforedigits = tablespec.group(0)
    digits = tablespec.group(1)


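My regex should match a string only if it contains 8 digits after an underscore. Once it matches, I want to use .match() together with the .group() method to get two groups. The first group should contain everything before the digits, and the second group should contain the 8 digits.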

What is the correct regex to get the result I'm looking for using .match() and .group()?

Answer 1

wim*_*wim 5

Use capturing groups:

>>> import re
>>> s = 'henry_jones_12345678'
>>> pat = re.compile(r'(?P<name>.*)_(?P<number>\d{8})')
>>> pat.findall(s)
[('henry_jones', '12345678')]


If you want, you also get the benefit of named groups:

>>> match = pat.match(s)
>>> match.groupdict()
{'name': 'henry_jones', 'number': '12345678'}

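To tie this back to the loop in the question, here is a minimal sketch that applies the same pattern with .match() and .group() to each table name. The helper name split_table_name and the trailing $ anchor are assumptions that are not part of the original answer. The anchor is added because .match() only anchors at the start of the string, so without it a name ending in nine or more digits would still match on its first eight.

```python
import re

# Pattern from the answer, plus a trailing "$" (an assumption, not in the
# original answer) so that names with extra trailing characters are rejected.
pat = re.compile(r"(?P<name>.*)_(?P<number>\d{8})$")

tables_in_dataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]


def split_table_name(table):
    """Return (name, digits) if table ends in _<8 digits>, else None.

    Hypothetical helper, introduced only for illustration.
    """
    m = pat.match(table)
    if m is None:
        # .match() returns None when the pattern does not match, so calling
        # .group() directly (as the question's loop does) would raise.
        return None
    # group(0) is the whole match; the capture groups start at 1.
    return m.group(1), m.group(2)


results = {table: split_table_name(table) for table in tables_in_dataset}
for table, parts in results.items():
    print(table, "->", parts)
```

Note that .group(0) is always the full match, so the two pieces the question asks for are .group(1) and .group(2), or equivalently .group('name') and .group('number').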
