Regex group capture

Question


I have a standard email from which I want to extract certain details.

The email includes the following line:

<strong>Name:</strong> John Smith


So, to simulate this, I use the following JavaScript:

<strong>Name:</strong> John Smith


This produces only one result, namely:

<strong>Name:</strong> John Smith


I want to get the capture group ([^\<]*), which in this example is John Smith.

What am I missing here?

Answer 1

T.J*_*der 5

Capture groups are provided in the match array starting at index 1:

var str = "<br><strong>Name:</strong> John Smith<br>";
var re = /\<strong>Name\s*:\<\/strong>\s*([^\<]*)/g;
var match = re.exec(str); // declared with var so it is not an implicit global
while (match != null) {
    console.log(match[1]); // <====
    match = re.exec(str);
}


Index 0 contains the whole match.
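
To make the layout of the match array concrete, here is a minimal sketch using the same sample string. It drops the g flag because only one match is needed, and it omits the backslashes before <, which are optional in this pattern. The variable names are chosen for illustration only.

```javascript
// Same sample string as above; no g flag, so exec() always starts at index 0.
const sample = "<br><strong>Name:</strong> John Smith<br>";
const pattern = /<strong>Name\s*:<\/strong>\s*([^<]*)/;

const result = pattern.exec(sample);
if (result !== null) {
    console.log(result[0]);    // whole match: "<strong>Name:</strong> John Smith"
    console.log(result[1]);    // first capture group: "John Smith"
    console.log(result.index); // position of the match in the input: 4
}
```

If the pattern had a second pair of parentheses, its text would appear at result[2], and so on.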

On modern JavaScript engines, you can also use named capture groups ((?<theName>...)), which you access via match.groups.theName:

var str = "<br><strong>Name:</strong> John Smith<br>";
var re = /\<strong>Name\s*:\<\/strong>\s*(?<name>[^\<]*)/g;
// ---------------------------------------^^^^^^^
var match = re.exec(str); // declared with var so it is not an implicit global
while (match != null) {
    console.log(match.groups.name); // <====
    match = re.exec(str);
}

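
As an alternative to the exec() loop, engines that support String.prototype.matchAll can iterate over all matches directly. The sketch below is not part of the original answer: the two-line input and the second name (Jane Doe) are made up to show several matches, and the variable names are illustrative.

```javascript
// Hypothetical input containing two "Name" lines.
const emailBody =
    "<strong>Name:</strong> John Smith<br>" +
    "<strong>Name:</strong> Jane Doe<br>";

// matchAll() requires the g flag and yields one match array per occurrence.
const nameRe = /<strong>Name\s*:<\/strong>\s*(?<name>[^<]*)/g;

const names = [];
for (const m of emailBody.matchAll(nameRe)) {
    // m.groups.name holds the text captured by (?<name>...).
    names.push(m.groups.name.trim());
}
console.log(names); // [ 'John Smith', 'Jane Doe' ]
```

Because matchAll() works on a copy of the regular expression, it does not change nameRe.lastIndex, which avoids the manual re.exec() bookkeeping.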

Answer 2

npi*_*nti 5

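In a regular expression match, the first element is always the entire matched string. Capture groups start at index 1 and count upward from there, so to fix the problem, just replace match[0] with match[1].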

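That said, since you are using JavaScript, it would be better to work with the DOM itself and extract the text from it than to process HTML with regular expressions.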
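
One way the DOM approach could look is sketched below. It assumes the value is the text node that immediately follows the <strong> label, as in the sample line. The helper name textAfterLabel is hypothetical, and the usage part relies on DOMParser, which exists in browsers but not in plain Node.js.

```javascript
// Hypothetical helper: returns the trimmed text that follows the first
// <strong> element whose text equals `label`, or null if none is found.
function textAfterLabel(root, label) {
    for (const el of root.querySelectorAll("strong")) {
        if (el.textContent.trim() === label && el.nextSibling) {
            return el.nextSibling.textContent.trim();
        }
    }
    return null;
}

// Browser-only usage: parse the HTML fragment, then query the document.
if (typeof DOMParser !== "undefined") {
    const doc = new DOMParser().parseFromString(
        "<br><strong>Name:</strong> John Smith<br>", "text/html");
    console.log(textAfterLabel(doc, "Name:"));
}
```

This leaves tag parsing to the browser, so it keeps working if attributes or extra whitespace are added around the <strong> element.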
